Thursday, April 13, 2017

SQL Server Splits the page when I change an Integer Column Value

I have written few posts on Index Fragmentation and how they can be checked. External Fragmentation occurs as a result of Page-Split and it happens when there is no space in the page to accommodate the change (new record or an update). When no space, it moves approximately 50% of records to a new page, accommodating the change to relevant page.

Read Understanding Index Fragmentation – SS SLUG Aug 2013 – Demo I post for understanding fragmentation.

Read SQL Server - Checking Index Fragmentation - Understanding LIMITED, SAMPLED and DETAILED for understanding various modes that can be used for checking fragmentation.

For understanding page-split, lets take an example. Have a look on below table.

-- creating test table
CREATE TABLE dbo.TestTable
(
    Id int PRIMARY KEY
    , Value char(4000) NOT NULL
)
GO
 
-- inserting 2 records
INSERT INTO dbo.TestTable
    VALUES 
    (2, replicate('a', 4000)), (5, replicate('b', 4000));

It has two records. Since the approximate size of a record is 4004 bytes (4 for int column and 4000 for char column), SQL Server can use one page for holding both records. The size of a data page is 8KB. The below query proves that both records are in the same page.

SELECT Id, Value, ph.*
FROM dbo.TestTable
CROSS APPLY fn_PhysLocCracker(%%PHYSLOC%%) ph;


If I make a change to Value column, SQL Server does not need to split the page as the current value uses 4000 bytes. Example, if I change the first record value as REPLICATE('x', 4000), it can simply replace the old value, hence no page split. See below code and the result.

UPDATE dbo.TestTable
    SET Value = replicate('x', 4000)
WHERE id = 2
 
SELECT Id, Value, ph.*
FROM dbo.TestTable
CROSS APPLY fn_PhysLocCracker(%%PHYSLOC%%) ph;



Just like that, if we make a change to the integer value, we should not see any change on pages used. Let's make a small change and see.

UPDATE dbo.TestTable
    SET Id = 3
WHERE id = 2;
 
SELECT Id, Value, ph.*
FROM dbo.TestTable
CROSS APPLY fn_PhysLocCracker(%%PHYSLOC%%) ph;



Did you notice the change? Why the second record has been moved to a new page? Mean Page-Split has been occurred with this changed?

Is it because of the data type? It is not. Note that this is not just an integer column. It is the PRIMARY KEY; The clustered key. What happened was, SQL Server updated the Value column as an in-place update. It does not require additional space. However SQL Server does not perform in-place update for key columns. It performs delete-plus-insert operation for key column updates. It requires space for the INSERT. That is the reason for the page split. Understand that this behavior is not only for integer, it is for all key columns.

This is why we discourage you to select a column as the key column if it can be changed by a business operation. It is always recommended to use a non-volatile column as the key column that do not expect changes on values.

No comments: