Skip to main content

How Database Index works 'n' improves performance

Today we will understand indexes which we have been reading and listening for a long time. Whenever we query something, and it runs slow we hear many steps of performance tuning. And “Use Proper Index” will always make the list. 

 

What is an Index? 

Google says . 

Lets start our analysis!! 

Alphabetical it says. What is alphabetical? Sorted? Yes. Index is some sort of sorting. :) 

“Reference to pages on which they are mentioned”. But we are not dealing with books here. We are dealing with data. lets say our data is stored as pages on computer. (Don’t want to assume? Want to see how data is stored in memory? Check this link). Now index is some sorted object which holds reference to the data stored either in-memory or disk. 

What benefit I get on creating on extra object which consumes my computer resources?? 

Well... lets go into our classical dictionary example. 

1st question.. Do I need indexes always?


I gave you a page with words and meanings. A nano size dictionary. Asked you to FIND the meaning of one word. What you will do?? Will just read through all the words in the page as they are very few and find the word and give me the meaning. To put in some terms, you “SCAN” your page and find out the word or we can say you will “SEARCH” the data.  Will you go and create and new index, saying these words are on 1st half of page and words from 2nd half of page? No. We don’t need that extra object/index for our small data (can we say small table? or a small data object if we are dealing with programing here).


Now my words increased, now I have 100 pages of words to “SEARCH”.  Now we have two things to consider. Let say I gave you word “INDEX” to search. You started your SCAN from the 1st word and found the word “INDEX” in the 1st page itself. Lucky us. Best case scenario. But what if there is another meaning for index and it was written at the end of the dictionary? Even if we find the “INDEX” word, we have to SCAN through entire dictionary and see if there is another “INDEX”.  Hardwork. But we did it. Again I gave you another word “SEARCH” to “SEARCH” in your dictionary. Now what? REPEAT”.


Wait! Wait! I cant do it again and again, “THIS DATA IS BEING LOOKED FREQUENTLY. Let's be smart and analyze the COST and OPTIMIZE my working process. What makes my life simple here? Come on ninjas think. Yeah!! I will work hard once, make this data “SORTED”. Atleast I don’t need to “SCAN” the entire data always. All the occurrences will be found along the first occurrence. Job done. 

Now I have 1K pages? 1M pages? Superficial for a dictionary. But let's say!! Even if the data is sorted, I can't start from beginning everytime and SCAN till the word. Time to brainstorm again. Ok, as my words/data is sorted, I will make a note of starting letter and ending letter of each page, and I can just skim through the “NOTES” I made and directly land on the page where my starting letter of word is present?  or I will just call it a “SEEK”. 

Let's take a look back. Simply speaking, indexing is the mechanism where we “SORT” our data and maintain address of the data (where it is stored) to make our queries run faster. 

So, what kind of indexes are out there? Let's explore indexes in SQL Server in our next meet!! 

Please comment your feedback and queries! 

 


Comments

Popular posts from this blog