Showing posts with label GUID. Show all posts
Showing posts with label GUID. Show all posts

Monday, December 28, 2015

GUID or Uniqueidentifier as the key - II - Issues with it

My first post related to GUID discussed the main issue we face when using GUID as the key, specifically as the Clustered Key. Here is another concern we should consider when maintaining a table with GUID as the key. This issue does need to be considered as a critical issue if the table contains low number of records. But if it contains millions of records, this is something we need to consider for sure.

Let's create two tables just as the first post. The below code creates two tables named dbo.Sales_WithGUID and dbo.Sales_WithoutGUID and loads same data set for both tables.

USE tempdb;
GO

IF OBJECT_ID('dbo.Sales_WithGUID') IS NOT NULL
 DROP TABLE dbo.Sales_WithGUID;
GO

-- Creating the table
CREATE TABLE dbo.Sales_WithGUID
(
 SalesKey uniqueidentifier PRIMARY KEY DEFAULT(NewID()),
 ProductKey int NOT NULL,
 OrderDateKey int NOT NULL,
 DueDateKey int NOT NULL,
 ShipDateKey int NOT NULL,
 CustomerKey int NOT NULL,
 PromotionKey int NOT NULL,
 CurrencyKey int NOT NULL,
 SalesTerritoryKey int NOT NULL,
 SalesOrderNumber nvarchar(20) NOT NULL,
 SalesOrderLineNumber tinyint NOT NULL,
 RevisionNumber tinyint NOT NULL,
 OrderQuantity smallint NOT NULL,
 UnitPrice money NOT NULL,
 ExtendedAmount money NOT NULL,
 UnitPriceDiscountPct float NOT NULL,
 DiscountAmount float NOT NULL,
 ProductStandardCost money NOT NULL,
 TotalProductCost money NOT NULL,
 SalesAmount money NOT NULL,
 TaxAmt money NOT NULL,
 Freight money NOT NULL,
 CarrierTrackingNumber nvarchar(25) NULL,
 CustomerPONumber nvarchar(25) NULL,
 OrderDate datetime NULL,
 DueDate datetime NULL,
 ShipDate datetime NULL
);
GO

INSERT INTO dbo.Sales_WithGUID
  (
   ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
    ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
    ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
    ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
    ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  )
SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
      ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
      ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
      ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
      ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
FROM AdventureWorksDW2014.dbo.FactInternetSales;


IF OBJECT_ID('dbo.Sales_WithoutGUID') IS NOT NULL
 DROP TABLE dbo.Sales_WithoutGUID;
GO

-- Create the second table
CREATE TABLE dbo.Sales_WithoutGUID
(
 SalesKey int identity(1,1),
 CompanyCode char(2) Default ('SL'),
 constraint pk_Sales_WithoutGUID primary key (SalesKey, CompanyCode),
 ProductKey int NOT NULL,
 OrderDateKey int NOT NULL,
 DueDateKey int NOT NULL,
 ShipDateKey int NOT NULL,
 CustomerKey int NOT NULL,
 PromotionKey int NOT NULL,
 CurrencyKey int NOT NULL,
 SalesTerritoryKey int NOT NULL,
 SalesOrderNumber nvarchar(20) NOT NULL,
 SalesOrderLineNumber tinyint NOT NULL,
 RevisionNumber tinyint NOT NULL,
 OrderQuantity smallint NOT NULL,
 UnitPrice money NOT NULL,
 ExtendedAmount money NOT NULL,
 UnitPriceDiscountPct float NOT NULL,
 DiscountAmount float NOT NULL,
 ProductStandardCost money NOT NULL,
 TotalProductCost money NOT NULL,
 SalesAmount money NOT NULL,
 TaxAmt money NOT NULL,
 Freight money NOT NULL,
 CarrierTrackingNumber nvarchar(25) NULL,
 CustomerPONumber nvarchar(25) NULL,
 OrderDate datetime NULL,
 DueDate datetime NULL,
 ShipDate datetime NULL
);
GO

INSERT INTO dbo.Sales_WithoutGUID
  (
   ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
    ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
    ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
    ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
    ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  )
SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
      ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
      ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
      ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
      ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
FROM AdventureWorksDW2014.dbo.FactInternetSales;

Let's check the space used by both tables.


As you see, first table which is based on GUID consumes 10.7MB for data and 80KB for indexes. Second table which is based on int+char(2) as the key consumes 10.0MB for data and 56KB for indexes. Now, let's additional non-clustered indexes for both tables.

CREATE INDEX IX_Sales_WithGUID_ShipDateKey ON dbo.Sales_WithGUID (ShipDateKey);
CREATE INDEX IX_Sales_WithGUID_CurrencyKey ON dbo.Sales_WithGUID (CurrencyKey);
CREATE INDEX IX_Sales_WithGUID_CustomerKey ON dbo.Sales_WithGUID (CustomerKey);
CREATE INDEX IX_Sales_WithGUID_DueDateKey ON dbo.Sales_WithGUID (DueDateKey);
CREATE INDEX IX_Sales_WithGUID_OrderDateKey ON dbo.Sales_WithGUID (OrderDateKey);
CREATE INDEX IX_Sales_WithGUID_ProductKey ON dbo.Sales_WithGUID (ProductKey);
CREATE INDEX IX_Sales_WithGUID_PromotionKey ON dbo.Sales_WithGUID (PromotionKey);

CREATE INDEX IX_Sales_WithoutGUID_ShipDateKey ON dbo.Sales_WithoutGUID (ShipDateKey);
CREATE INDEX IX_Sales_WithoutGUID_CurrencyKey ON dbo.Sales_WithoutGUID (CurrencyKey);
CREATE INDEX IX_Sales_WithoutGUID_CustomerKey ON dbo.Sales_WithoutGUID (CustomerKey);
CREATE INDEX IX_Sales_WithoutGUID_DueDateKey ON dbo.Sales_WithoutGUID (DueDateKey);
CREATE INDEX IX_Sales_WithoutGUID_OrderDateKey ON dbo.Sales_WithoutGUID (OrderDateKey);
CREATE INDEX IX_Sales_WithoutGUID_ProductKey ON dbo.Sales_WithoutGUID (ProductKey);
CREATE INDEX IX_Sales_WithoutGUID_PromotionKey ON dbo.Sales_WithoutGUID (PromotionKey);

Here is the result now.


Focus on the space used for indexes. First table uses 11MB while second table uses only 7MB. Note that this is only for 60,000 records but if the table has more records, gap will be significantly huge.

How can it be happened when we have same number of records and same types of indexes. This is because of every non-clustered key value is maintained with the clustered key. When it comes to the first table, since it is 16 bytes, the size of the non-clustered key value will be 16+4. But with the second table, it is 6+4 bytes. This is the reason for high usage of the spave for non-clustered indexes.

Here is the code that shows the number of data pages used by each index. Large number of pages increases IO operations and usage of resources.

SELECT OBJECT_NAME (i.object_id) AS TableName 
 , i.name AS IndexName
 , ps.index_id AS IndexID 
 , ps.alloc_unit_type_desc AS DataStructure
 , ps.page_count AS Pages
 , ps.record_count AS Rows
FROM sys.indexes AS i
 CROSS APPLY sys.dm_db_index_physical_stats 
  (DB_ID ()
  , i.object_id
  , NULL
  , NULL
  , N'DETAILED') AS ps
WHERE i.object_id = ps.object_id
  AND i.index_id = ps.index_id
  AND i.object_id 
   IN (OBJECT_ID (N'Sales_WithGUID')
    , OBJECT_ID (N'Sales_WithoutGUID'))
  AND ps.index_level = 0 -- only lef level
ORDER BY TableName, IndexID;
GO


Now you see issue we may face when using GUIDs, hence use it knowing these issues and manage it properly if it is really required.

Saturday, December 26, 2015

GUID or Uniqueidentifier as the key - I - Issues with it

While analyzing a database related to one of the companies, I noticed that the key of most tables is based on GUID (uniqueidentifier) that drastically hinders the performance unless it is managed properly. Unfortunately, as per my experience in consultancy, 90% of databases that use GUID as the key for tables are not properly managed and many are unaware of the damages it does though they experience poor performance on data manipulation and retrieval.

Most reason out that they need GUID as the key because of a need of an uniqueness among records specifically when records are joined from multiple companies or regions. But with many cases, it can be handled without using Unqiueidentifier but using a composite key that is formed using an integer value and a code that represents the company or region. Understanding the cost of it and understanding how it slows down your queries will surely guide you to take the right decision, hence here is a post on it.

Let's try to understand the most common issue related to GUID as the key. Following code creates a table called Sales_WithGUID and loads data from AdventureWorks. The key of the table, which is SalesKey is a GUID. Note that I do not load records as a batch, but load one by one simulating the real operation.

USE tempdb;
GO

IF OBJECT_ID('dbo.Sales_WithGUID') IS NOT NULL
 DROP TABLE dbo.Sales_WithGUID;
GO

-- Creating the table
CREATE TABLE dbo.Sales_WithGUID
(
 SalesKey uniqueidentifier PRIMARY KEY DEFAULT(NewID()),
 ProductKey int NOT NULL,
 OrderDateKey int NOT NULL,
 DueDateKey int NOT NULL,
 ShipDateKey int NOT NULL,
 CustomerKey int NOT NULL,
 PromotionKey int NOT NULL,
 CurrencyKey int NOT NULL,
 SalesTerritoryKey int NOT NULL,
 SalesOrderNumber nvarchar(20) NOT NULL,
 SalesOrderLineNumber tinyint NOT NULL,
 RevisionNumber tinyint NOT NULL,
 OrderQuantity smallint NOT NULL,
 UnitPrice money NOT NULL,
 ExtendedAmount money NOT NULL,
 UnitPriceDiscountPct float NOT NULL,
 DiscountAmount float NOT NULL,
 ProductStandardCost money NOT NULL,
 TotalProductCost money NOT NULL,
 SalesAmount money NOT NULL,
 TaxAmt money NOT NULL,
 Freight money NOT NULL,
 CarrierTrackingNumber nvarchar(25) NULL,
 CustomerPONumber nvarchar(25) NULL,
 OrderDate datetime NULL,
 DueDate datetime NULL,
 ShipDate datetime NULL
);
GO

-- Loading records one by one
DECLARE cur CURSOR 
FOR
SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
      ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
      ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
      ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
      ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  FROM AdventureWorksDW2014.dbo.FactInternetSales;

DECLARE @ProductKey int 
  , @OrderDateKey int
  , @DueDateKey int
  , @ShipDateKey int 
  , @CustomerKey int 
  , @PromotionKey int 
  , @CurrencyKey int 
  , @SalesTerritoryKey int 
  , @SalesOrderNumber nvarchar(20) 
  , @SalesOrderLineNumber tinyint 
  , @RevisionNumber tinyint 
  , @OrderQuantity smallint 
  , @UnitPrice money 
  , @ExtendedAmount money 
  , @UnitPriceDiscountPct float 
  , @DiscountAmount float 
  , @ProductStandardCost money 
  , @TotalProductCost money 
  , @SalesAmount money 
  , @TaxAmt money 
  , @Freight money 
  , @CarrierTrackingNumber nvarchar(25) 
  , @CustomerPONumber nvarchar(25)  
  , @OrderDate datetime  
  , @DueDate datetime  
  , @ShipDate datetime

OPEN cur

FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
      , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
      , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
      , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
      , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate

WHILE (@@FETCH_STATUS = 0)
BEGIN

 INSERT INTO dbo.Sales_WithGUID
  (
   ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
    ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
    ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
    ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
    ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  )
 VALUES
  (@ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
    , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
    , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
    , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
    , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate)

 FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
      , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
      , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
      , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
      , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate
END

-- Cleaning
CLOSE cur
DEALLOCATE cur

Let's create another table called Sales_WithoutGUID and loads the same. Note that its primary key is a combination of int and char column. This is what I explained above and suggested to the company: using two columns as a composite primary key rather than using uniqueidentifier. Using this composite key, company can be identified and records can be uniquely identified.

USE tempdb;
GO

IF OBJECT_ID('dbo.Sales_WithoutGUID') IS NOT NULL
 DROP TABLE dbo.Sales_WithoutGUID;
GO

-- Create the second table
CREATE TABLE dbo.Sales_WithoutGUID
(
 SalesKey int identity(1,1),
 CompanyCode char(2) Default ('SL'),
 constraint pk_Sales_WithoutGUID primary key (SalesKey, CompanyCode),
 ProductKey int NOT NULL,
 OrderDateKey int NOT NULL,
 DueDateKey int NOT NULL,
 ShipDateKey int NOT NULL,
 CustomerKey int NOT NULL,
 PromotionKey int NOT NULL,
 CurrencyKey int NOT NULL,
 SalesTerritoryKey int NOT NULL,
 SalesOrderNumber nvarchar(20) NOT NULL,
 SalesOrderLineNumber tinyint NOT NULL,
 RevisionNumber tinyint NOT NULL,
 OrderQuantity smallint NOT NULL,
 UnitPrice money NOT NULL,
 ExtendedAmount money NOT NULL,
 UnitPriceDiscountPct float NOT NULL,
 DiscountAmount float NOT NULL,
 ProductStandardCost money NOT NULL,
 TotalProductCost money NOT NULL,
 SalesAmount money NOT NULL,
 TaxAmt money NOT NULL,
 Freight money NOT NULL,
 CarrierTrackingNumber nvarchar(25) NULL,
 CustomerPONumber nvarchar(25) NULL,
 OrderDate datetime NULL,
 DueDate datetime NULL,
 ShipDate datetime NULL
);
GO

-- Loading
DECLARE cur CURSOR 
FOR
SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
      ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
      ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
      ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
      ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  FROM AdventureWorksDW2014.dbo.FactInternetSales;

DECLARE @ProductKey int 
  , @OrderDateKey int
  , @DueDateKey int
  , @ShipDateKey int 
  , @CustomerKey int 
  , @PromotionKey int 
  , @CurrencyKey int 
  , @SalesTerritoryKey int 
  , @SalesOrderNumber nvarchar(20) 
  , @SalesOrderLineNumber tinyint 
  , @RevisionNumber tinyint 
  , @OrderQuantity smallint 
  , @UnitPrice money 
  , @ExtendedAmount money 
  , @UnitPriceDiscountPct float 
  , @DiscountAmount float 
  , @ProductStandardCost money 
  , @TotalProductCost money 
  , @SalesAmount money 
  , @TaxAmt money 
  , @Freight money 
  , @CarrierTrackingNumber nvarchar(25) 
  , @CustomerPONumber nvarchar(25)  
  , @OrderDate datetime  
  , @DueDate datetime  
  , @ShipDate datetime

OPEN cur

FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
      , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
      , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
      , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
      , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate

WHILE (@@FETCH_STATUS = 0)
BEGIN

 INSERT INTO dbo.Sales_WithoutGUID
  (
   ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey
    ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber
    ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount
    ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight
    ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
  )
 VALUES
  (@ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
    , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
    , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
    , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
    , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate)

 FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey
      , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber
      , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount
      , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight
      , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate
END

-- Cleaning
CLOSE cur
DEALLOCATE cur

Now let's analyze and see. The below code has two SELECT statements that check the fragmentation of both tables. As you see, first table is Externally Fragmented and second table is not fragmented. This is the biggest and most common issue with GUIDs. Since there is no sequential values when generating, when inserting records, in order to maintain the clustered key order, SQL Server splits data pages for moving records to new page, causing External Fragmentation (Read more on fragmentation at: http://dinesql.blogspot.com/2013/08/understanding-index-fragmentation-ss.html). Because of this, reading causes accessing larger number of data pages, slowing down the query and consuming resource.

Other issue is the space required. As you see, second table requires only 10MB for maintaining records while first table requires 15MB.

There is another considerable impact on non-clustered indexes if you, will make another post on it.

As you see the disadvantages with GUID now, try to minimize the usage of it if possible. There is a workaround if the order of the key is only the issue. The function NEWSEQUENTIALID can be used instead of NEWID for generating GUIDs but still the second issue discussed is remained unsorted.

SELECT * FROM sys.dm_db_index_physical_stats 
 (DB_ID(), OBJECT_ID(N'dbo.Sales_WithGUID', N'U'), 1, NULL, 'DETAILED')
SELECT * FROM sys.dm_db_index_physical_stats 
 (DB_ID(), OBJECT_ID(N'dbo.Sales_WithoutGUID', N'U'), 1, NULL, 'DETAILED')

EXEC sp_spaceused N'Sales_WithGUID';
EXEC sp_spaceused N'Sales_WithoutGUID';


Sunday, August 25, 2013

Using GUID column as Clustered Index Key– SS SLUG Aug 2013 – Demo II

GUIDs are commonly used in distributed applications which require “uniqueness” across the entire world. Unfortunately I have seen the usage of GUIDs with clustered keys in non-distributed applications, where global uniqueness is not required. This was discussed in my presentation and showed how useful the GUIDs as clustered key as well as how it makes the index fragmented. This post is for the demo code related to the discussion.

Before going through the code, we must understand that GUIDs are not as bad as we think if it is managed well. You can make an uniqueidentifier column as PRIMARY KEY, as clustered key. Although it does not as efficient as int data type, it gives moderate efficiency. Let’s look at how this makes the clustered index fragmented and how it can be avoided.

USE tempdb
GO
 
-- create a table with uniqueidentifier
-- and make it as the clustered key
IF OBJECT_ID('dbo.GUID_Table') IS NOT NULL
    DROP TABLE dbo.GUID_Table
GO
CREATE TABLE dbo.GUID_Table 
(
    Id uniqueidentifier PRIMARY KEY
    , name char(2000)
)
GO
 
-- insert 100 records with default values
INSERT INTO dbo.GUID_Table
VALUES
    (NEWID(), 'a')
GO 100
 
 
SELECT * FROM sys.dm_db_index_physical_stats (
    DB_ID(), OBJECT_ID('dbo.GUID_Table'), NULL, NULL, 'DETAILED')

Once the last statement is run, you will see how fragmented your table is. For more info on fragmentation, please refer: http://dinesql.blogspot.com/2013/08/understanding-index-fragmentation-ss.html.

image

Both external fragmentation (97%) and internal fragmentation (54%) are very high. The reason is, page splits  and records movement during insertions. Since GUIDs are not sequentially generated, record placement in pages is always an issue for SQL Server. What happen is, when a GUID is to be inserted as the key (entire record in this case), it looks for the page which record needs to be placed, and if no space in the page, it splits the page, moving 50% of records in the page to a new page, breaking the order of the pages which are ordered based on keys. Run the code below for seeing the linkage between pages.

DBCC IND (tempdb, [GUID_Table], -1)

image

As you see, SQL Server has to do many “read-back” for reading data sequentially, making all queries slowing down. The only way to avoid this with GUIDs is, use NEWSEQUENTIALID instead of NEWID. It generates GUIDs that are sequential to the last generated GUID. If Insertion is made using NEWSEQUENTIALID, external fragmentation will be lesser because of its sequential order on generation. Re-create the table and run the INSERT statement as below;

-- create table again
DROP TABLE dbo.GUID_Table 
GO
CREATE TABLE dbo.GUID_Table 
(
    Id uniqueidentifier PRIMARY KEY DEFAULT (NEWSEQUENTIALID())
    , name char(2000)
)
 
INSERT INTO dbo.GUID_Table
VALUES
    (DEFAULT, 'a')
GO 100
 
SELECT * FROM sys.dm_db_index_physical_stats (
    DB_ID('tempdb'), OBJECT_ID('GUID_Table'), NULL, NULL, 'DETAILED')
 
DBCC IND (tempdb, [GUID_Table], -1)

If you analyze PagePID and NextPagePID in DBCC IND result-set now, you will see how pages are ordered and no “read-backward” is needed. And the SELECT statement proves that no fragmentation has happened too. This clearly shows that with NEWSEQUENTIALID, split is not required as the value generated is always greater than the value exist. There are two key things not remember on it;

  • NEWSEQUENTIALID always generates a higher value greater than the one generated before by same server.
  • The uniqueness is limited to the server used only. Duplication can happen if values are generated with two servers.