Indexes are added mostly based on WHERE conditions used with queries execute frequently. Once the index is added, it speeds up the search on conditions added but does not improve the speed of loading values related to non-indexed columns. Adding non-indexed columns to the index and making all queries as part of the index is considered as Covering Index Pattern. However this post is not about Covering Index Pattern but something similar to it, called as Included Columns Pattern.
This was discussed in August user group meeting. This code is related to one of demos.
For testing purposes, I have used one of Microsoft sample databases which has millions of records. It is called ContosoRetailDW and it can be downloaded at: http://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=18279
Let’s create a new table using one of existing tables and add a clustered index on it.
USE ContosoRetailDW
GO
SELECT *
INTO dbo.FactSalesNew
FROM dbo.FactSales
CREATE UNIQUE CLUSTERED INDEX IX_FactSalesNew ON dbo.FactSalesNew (SalesKey)
GO
Enable “Actual Execution Plan” and run the query below.
SET STATISTICS IO ON
SELECT SalesKey, SalesAmount, ReturnAmount, DiscountAmount
FROM dbo.FactSalesNew f
INNER JOIN dbo.DimDate d
ON d.Datekey = f.DateKey
AND CalendarYear = 2007
WHERE ReturnAmount BETWEEN 1000 AND 1500
SET STATISTICS IO OFF
Have a look on pages read for loading the resultset into memory and have a look on execution plan.
This table has 3 million records and they are held using 52491 pages. As you see with the first result, SQL Server reads all pages for searching records (Clustered Index Scan) and returns only 2061 records. Let’s try to decrease the number of pages to be read by adding an index on ReturnAmount column.
CREATE INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew (ReturnAmount)
Once added, run the same query and see the result.
SQL Server uses the index added and number of pages read is 19930 now. It has improved the overall operation. However, can we improve little bit more? Yes, it is possible with included columns. You notice with the plan that even though there is an index, SQL Server still reads the clustered index because all other columns except RetuenAmount are not exist with the index. What if we include other columns into same index? See the below query.
DROP INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew
GO
CREATE INDEX IX_FactSalesNew_ReturnAmount ON dbo.FactSalesNew (ReturnAmount)
INCLUDE (DateKey, SalesAmount, DiscountAmount)
As you see, other columns used in the query are added to index now using INCLUDE clause. Remember, they have not been added as key columns, hence they cannot be used for searching. If a query is executed with “WHERE SalesAmount BETWEEN 1000 AND 1500”, SQL Server cannot use the index for searching. This is the key difference between Covering Index Pattern and Included Columns Patter. See the result of the SELECT query now.
Less number of pages to be read and no need to go through the clustered index. In this case, improvement is very high, however everything has pros and cons. Since additional columns are maintained in the index, remember that cost is involved with this when updating records.
No comments:
Post a Comment