Tuesday, August 27, 2013

Understanding Index INCLUDED COLUMNS - SS SLUG Aug 2013 – Demo III

Indexes are added mostly based on WHERE conditions used with queries execute frequently. Once the index is added, it speeds up the search on conditions added but does not improve the speed of loading values related to non-indexed columns. Adding non-indexed columns to the index and making all queries as part of the index is considered as Covering Index Pattern. However this post is not about Covering Index Pattern but something similar to it, called as Included Columns Pattern.

This was discussed in August user group meeting. This code is related to one of demos.

For testing purposes, I have used one of Microsoft sample databases which has millions of records. It is called ContosoRetailDW and it can be downloaded at: http://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=18279

Let’s create a new table using one of existing tables and add a clustered index on it.

USE ContosoRetailDW
GO
 
SELECT *
INTO dbo.FactSalesNew
FROM  dbo.FactSales
 
CREATE UNIQUE CLUSTERED INDEX IX_FactSalesNew ON dbo.FactSalesNew (SalesKey)
GO

Enable “Actual Execution Plan” and run the query below.

SET STATISTICS IO ON
 
SELECT SalesKey, SalesAmount, ReturnAmount, DiscountAmount
FROM dbo.FactSalesNew f
    INNER JOIN dbo.DimDate d
        ON d.Datekey = f.DateKey
            AND CalendarYear = 2007
WHERE ReturnAmount BETWEEN 1000 AND 1500
 
SET STATISTICS IO OFF

Have a look on pages read for loading the resultset into memory and have a look on execution plan.

image

image

This table has 3 million records and they are held using 52491 pages. As you see with the first result, SQL Server reads all pages for searching records (Clustered Index Scan) and returns only 2061 records. Let’s try to decrease the number of pages to be read by adding an index on ReturnAmount column.

CREATE INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew (ReturnAmount)

Once added, run the same query and see the result.

image

image

SQL Server uses the index added and number of pages read is 19930 now. It has improved the overall operation. However, can we improve little bit more? Yes, it is possible with included columns. You notice with the plan that even though there is an index, SQL Server still reads the clustered index because all other columns except RetuenAmount are not exist with the index. What if we include other columns into same index? See the below query.

DROP INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew 
GO
 
CREATE INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew (ReturnAmount)
    INCLUDE (DateKey, SalesAmount, DiscountAmount)

As you see, other columns used in the query are added to index now using INCLUDE clause. Remember, they have not been added as key columns, hence they cannot be used for searching. If a query is executed with “WHERE SalesAmount BETWEEN 1000 AND 1500”, SQL Server cannot use the index for searching. This is the key difference between Covering Index Pattern and Included Columns Patter. See the result of the SELECT query now.

image

image

Less number of pages to be read and no need to go through the clustered index. In this case, improvement is very high, however everything has pros and cons. Since additional columns are maintained in the index, remember that cost is involved with this when updating records.

No comments: