Saturday, August 24, 2013

Understanding Index Fragmentation – SS SLUG Aug 2013 – Demo I

Index fragmentation is something we need to monitor frequently specifically on databases that are heavily indexed. If indexes added are fragmented, you will not get the expected performance from indexes. This is one of the areas discussed with last user group meeting, here is the sample codes that shows how indexes are getting fragmented.

Run below code to create a sample table with values.

USE tempdb
GO
 
-- creating test table
IF OBJECT_ID(N'dbo.TestTable', 'U') IS NOT NULL
BEGIN
DROP TABLE dbo.TestTable
END
GO
CREATE TABLE TestTable (Id int IDENTITY(1,1) PRIMARY KEY, [Value] varchar(900))
GO
 
-- Inserting sample values
DECLARE @a int, @b int
SET @a = 65
WHILE (@a < 91)
BEGIN
 
    SET @b = 0
    WHILE (@b < 20)
    BEGIN
 
        INSERT INTO TestTable ([Value])
        SELECT REPLICATE(CHAR(@a), 445) + CONVERT(VARCHAR(10), @b)
        SET @b = @b + 1
    END
    SET @a = @a + 2
END
GO
 
-- See the values inserted, [Value] column contains
-- values like AAAA..., CCCC..., EEEE..., etc
SELECT * FROM dbo.TestTable ORDER BY Id
 
-- making an index on [Value] column
CREATE INDEX IX_TestTable ON dbo.TestTable([Value])
GO
 
-- checking for internal fragmentation
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'dbo.TestTable', N'U'), 2, NULL, 'DETAILED')

The last statement which is DMF used to check the fragementation of indexes. Two columns used for checking the fragmentation, avg_fragmentation_in_percent for External fragmentation and avg_page_space_used_in_percent for Internal fragmentation.

Here are few points discussed regarding fragmentation;

  • Internal fragmentation
    Inefficient use of pages within an index because the amount of data stored within each page is less than the data page can contain.
    • <= 30% reorganize index
    • > 30% rebuild index
  • External fragmentation (Logical and Extent)
    Inefficient use of pages within an index because the logical order of the page is wrong.
    • <= 75% and >= 60% reorganize index
    • < 60% rebuild index

image

As per result, we do not need to worry much on fragmentation. SQL Server uses 16 pages for holding 260 records (averagely 16 records per page). For testing purposes, let’s fragment the index . Here is the code;

-- reducing the zize of the index key for some records
UPDATE dbo.TestTable
SET [Value] = LEFT([Value], 1)
WHERE Id % 2 = 0
 
-- checking for internal fragmentation
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'dbo.TestTable', N'U'), 2, NULL, 'DETAILED')

And the result is;

image

As you see, since we removed some characters from Value column, pages have space now, only 47% of space has been used. You really do not need 16 pages for holding 260 records now but SQL Server still holds records in 260 pages which is not efficient, making more IO and using more memory for your queries, decreasing the performance of them. Rebuilding the index sorts this out;

-- removing the fragmentation
ALTER INDEX IX_TestTable ON dbo.TestTable REBUILD

What you have to remember is, update like above makes your indexes fragmented, hence try to avoid such updates, if performed, make sure you check for fragmentation and defragment if required.

Let’s start from the beginning for understanding External fragmentation. It is all about breaking the order of the records. Run the below code for understanding the current order. Make sure you run the first code segment above (table creation and inserting)again  before running this.

-- checking for external fragmentation
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'dbo.TestTable', N'U'),2, NULL, 'DETAILED')
 
-- run this undocumented command to see the NextPageID 
DBCC IND (tempdb, 'TestTable', 2)

Here are the results;

image

image

Look at the first result, it says that External fragmentation of the clustered index is 18.5% which is okay. Next result is from DBCC IND command (For more info on DBCC IND, refer: http://dinesql.blogspot.com/2013/08/dbcc-ind.html) shows how pages are linked. Focus on PagePID and NextPagePID columns, and PageType column. Page type 2 represents index pages. Index level 0 represents leaf pages. This says that next page of 312is 314 and next page of 314 is 316, which is a perfect order.

Let’s make the table externally fragmented. Run the code below;

-- inserting new values such as
-- Bs, Ds, Fs, etc....
-- Making page-splits
DECLARE @a INT
SET @a = 66
WHILE (@a < 86)
BEGIN
 
    INSERT INTO TestTable ([Value])
    SELECT REPLICATE(CHAR(@a), 445) + CONVERT(VARCHAR(10), @a)
    SET @a = @a + 2
END
GO

When records are inserted in between existing values (B has to be inserted between A and C), SQL Server needs to check and see the space availability of the correct page. If no space available, page-split occurs and 50% of the records are moved to a new page. This breaks the order of pages. Run the sys.dm_db_index_physical_stats and DBCC IND again and see;

image

image

We can clearly see what has happened to the order of pages, it is broken, it is fragmented. When you query data from this table (for example: data from pages 312, 341, 50699, 316), it has to read forward (which is fine) and read back (which is very costly) making more IOs and slowing down the generation of the resultset. If you rebuild the index, this can be fixed, however what you have to remember is, not to make your indexes externally fragmented from your actions, making query performance very poor. Here are some points you need to consider when dealing with index keys. These reduce external fragmentation.

  • Select a non-volatile columns for the keys.
  • Make sure column values are increasing values.

Presentation related to this demo is available at: SQL Server Universe.

2 comments:

jamiet said...

Really good explanation. Thanks Dinesh.

jamiet said...

Really good explanation, thanks Dinesh.