Monday, March 15, 2010

Does your DELETE operation deallocate empty data pages?

When we execute DELETE operations against tables, SQL Server is supposed to delete records and deallocate emptied, associated data pages. Deallocation is done, in order to make those emptied data pages available for other objects. If the pages are not released, the spaced used by the deleted records will be unusable. Because of that we need to make sure that pages are deallocated.

Why bother? SQL Server deallocates empty pages at the deletion!!! Yes, but there is an exception. It might not deallocate empty pages if the table is on the Heap. This is because SQL Server places locks on rows and pages when deleting on the heap, resulting empty pages remained allocated. Let me show you with an example:

Let’s create a table and insert some records. Note that no clustered index is defined, hence table is created on the heap.

   1: CREATE TABLE TestTable
   2: (
   3:     Id int identity(1,1)
   4:     ,Description varchar(4000)
   5: )
   6: GO
   7: INSERT INTO TestTable
   8:     (Description)
   9: VALUES
  10:     (REPLICATE('A', 4000))
  11:     ,(REPLICATE('B', 4000))
  12:     ,(REPLICATE('C', 4000))
  13:     ,(REPLICATE('D', 4000))
  14:     ,(REPLICATE('E', 4000))
  15:     ,(REPLICATE('F', 4000))
  16:     ,(REPLICATE('G', 4000))
  17:     ,(REPLICATE('H', 4000))
  18:     ,(REPLICATE('I', 4000))

When the table is queried, while IO statistics are on, you can see that 5 pages have been read for retrieving data.

It can be further checked by using DBCC IND too. When execute, it lists out 6 rows; 5 rows representing data pages (PageType = 1) and 1 row representing the IAM page (PageType = 10).

   1: SET STATISTICS IO ON
   2: SELECT * FROM TestTable
   3: SET STATISTICS IO OFF
   4:  
   5: DBCC IND('tempdb', TestTable, -1)

Now delete all records and run above codes again. You will still see that it reads 5 pages for data retrieval. If you run the DBCC IND again, you will see that same pages numbers are still allocated to this table. If you need to check and see whether the pages contain data, try with DBCC PAGE.

   1: -- Delete all records
   2: DELETE TestTable
   3:  
   4: -- This shows that it still
   5: -- reads 5 pages
   6: SET STATISTICS IO ON
   7: SELECT * FROM TestTable
   8: SET STATISTICS IO OFF
   9:  
  10: -- This shows that still the 
  11: -- pages are not deallocated
  12: DBCC IND('tempdb', TestTable, -1)
  13:  
  14: -- Check one page and see whether
  15: -- it contains data.
  16: -- Note that you have turn 3604 flag
  17: -- on to run DBCC PAGE
  18: -- Parameters are: Database, FileNumber,
  19: -- PageNumber, DisplayOption
  20: DBCC TRACEON(3604)
  21: DBCC PAGE ('tempdb', 1, 167, 3) WITH TABLERESULTS

See, pages are not released. There are few ways to make sure that data pages are deallocated at deletion of this scenario. First way is, using TABLOCK hint with DELETE statement.

   1: DELETE TestTable WITH (TABLOCK)
   2:  

This changes the way of placing locks, SQL Server places shared locks on the table instead of rows or pages, hence pages are deallocated.

Second way is, use TRUNCATE instead of DELETE. This deallocates pages too.

Another way is, adding a clustered index on the table and doing the deletion, which might time consuming.

Anyway, most of the tables we design have clustered indexes, if you have tables without clustered indexes, make sure you use one of above methods when deleting.

No comments: