Saturday, December 26, 2015

GUID or Uniqueidentifier as the key - I - Issues with it

While analyzing a database related to one of the companies, I noticed that the key of most tables is based on GUID (uniqueidentifier) that drastically hinders the performance unless it is managed properly. Unfortunately, as per my experience in consultancy, 90% of databases that use GUID as the key for tables are not properly managed and many are unaware of the damages it does though they experience poor performance on data manipulation and retrieval.

Most reason out that they need GUID as the key because of a need of an uniqueness among records specifically when records are joined from multiple companies or regions. But with many cases, it can be handled without using Unqiueidentifier but using a composite key that is formed using an integer value and a code that represents the company or region. Understanding the cost of it and understanding how it slows down your queries will surely guide you to take the right decision, hence here is a post on it.

Let's try to understand the most common issue related to GUID as the key. Following code creates a table called Sales_WithGUID and loads data from AdventureWorks. The key of the table, which is SalesKey is a GUID. Note that I do not load records as a batch, but load one by one simulating the real operation.

  1. USE tempdb;  
  2. GO  
  3.   
  4. IF OBJECT_ID('dbo.Sales_WithGUID'IS NOT NULL  
  5.  DROP TABLE dbo.Sales_WithGUID;  
  6. GO  
  7.   
  8. -- Creating the table  
  9. CREATE TABLE dbo.Sales_WithGUID  
  10. (  
  11.  SalesKey uniqueidentifier PRIMARY KEY DEFAULT(NewID()),  
  12.  ProductKey int NOT NULL,  
  13.  OrderDateKey int NOT NULL,  
  14.  DueDateKey int NOT NULL,  
  15.  ShipDateKey int NOT NULL,  
  16.  CustomerKey int NOT NULL,  
  17.  PromotionKey int NOT NULL,  
  18.  CurrencyKey int NOT NULL,  
  19.  SalesTerritoryKey int NOT NULL,  
  20.  SalesOrderNumber nvarchar(20) NOT NULL,  
  21.  SalesOrderLineNumber tinyint NOT NULL,  
  22.  RevisionNumber tinyint NOT NULL,  
  23.  OrderQuantity smallint NOT NULL,  
  24.  UnitPrice money NOT NULL,  
  25.  ExtendedAmount money NOT NULL,  
  26.  UnitPriceDiscountPct float NOT NULL,  
  27.  DiscountAmount float NOT NULL,  
  28.  ProductStandardCost money NOT NULL,  
  29.  TotalProductCost money NOT NULL,  
  30.  SalesAmount money NOT NULL,  
  31.  TaxAmt money NOT NULL,  
  32.  Freight money NOT NULL,  
  33.  CarrierTrackingNumber nvarchar(25) NULL,  
  34.  CustomerPONumber nvarchar(25) NULL,  
  35.  OrderDate datetime NULL,  
  36.  DueDate datetime NULL,  
  37.  ShipDate datetime NULL  
  38. );  
  39. GO  
  40.   
  41. -- Loading records one by one  
  42. DECLARE cur CURSOR   
  43. FOR  
  44. SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  45.       ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  46.       ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  47.       ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  48.       ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  49.   FROM AdventureWorksDW2014.dbo.FactInternetSales;  
  50.   
  51. DECLARE @ProductKey int   
  52.   , @OrderDateKey int  
  53.   , @DueDateKey int  
  54.   , @ShipDateKey int   
  55.   , @CustomerKey int   
  56.   , @PromotionKey int   
  57.   , @CurrencyKey int   
  58.   , @SalesTerritoryKey int   
  59.   , @SalesOrderNumber nvarchar(20)   
  60.   , @SalesOrderLineNumber tinyint   
  61.   , @RevisionNumber tinyint   
  62.   , @OrderQuantity smallint   
  63.   , @UnitPrice money   
  64.   , @ExtendedAmount money   
  65.   , @UnitPriceDiscountPct float   
  66.   , @DiscountAmount float   
  67.   , @ProductStandardCost money   
  68.   , @TotalProductCost money   
  69.   , @SalesAmount money   
  70.   , @TaxAmt money   
  71.   , @Freight money   
  72.   , @CarrierTrackingNumber nvarchar(25)   
  73.   , @CustomerPONumber nvarchar(25)    
  74.   , @OrderDate datetime    
  75.   , @DueDate datetime    
  76.   , @ShipDate datetime  
  77.   
  78. OPEN cur  
  79.   
  80. FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  81.       , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  82.       , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  83.       , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  84.       , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate  
  85.   
  86. WHILE (@@FETCH_STATUS = 0)  
  87. BEGIN  
  88.   
  89.  INSERT INTO dbo.Sales_WithGUID  
  90.   (  
  91.    ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  92.     ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  93.     ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  94.     ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  95.     ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  96.   )  
  97.  VALUES  
  98.   (@ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  99.     , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  100.     , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  101.     , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  102.     , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate)  
  103.   
  104.  FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  105.       , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  106.       , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  107.       , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  108.       , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate  
  109. END  
  110.   
  111. -- Cleaning  
  112. CLOSE cur  
  113. DEALLOCATE cur  

Let's create another table called Sales_WithoutGUID and loads the same. Note that its primary key is a combination of int and char column. This is what I explained above and suggested to the company: using two columns as a composite primary key rather than using uniqueidentifier. Using this composite key, company can be identified and records can be uniquely identified.

  1. USE tempdb;  
  2. GO  
  3.   
  4. IF OBJECT_ID('dbo.Sales_WithoutGUID'IS NOT NULL  
  5.  DROP TABLE dbo.Sales_WithoutGUID;  
  6. GO  
  7.   
  8. -- Create the second table  
  9. CREATE TABLE dbo.Sales_WithoutGUID  
  10. (  
  11.  SalesKey int identity(1,1),  
  12.  CompanyCode char(2) Default ('SL'),  
  13.  constraint pk_Sales_WithoutGUID primary key (SalesKey, CompanyCode),  
  14.  ProductKey int NOT NULL,  
  15.  OrderDateKey int NOT NULL,  
  16.  DueDateKey int NOT NULL,  
  17.  ShipDateKey int NOT NULL,  
  18.  CustomerKey int NOT NULL,  
  19.  PromotionKey int NOT NULL,  
  20.  CurrencyKey int NOT NULL,  
  21.  SalesTerritoryKey int NOT NULL,  
  22.  SalesOrderNumber nvarchar(20) NOT NULL,  
  23.  SalesOrderLineNumber tinyint NOT NULL,  
  24.  RevisionNumber tinyint NOT NULL,  
  25.  OrderQuantity smallint NOT NULL,  
  26.  UnitPrice money NOT NULL,  
  27.  ExtendedAmount money NOT NULL,  
  28.  UnitPriceDiscountPct float NOT NULL,  
  29.  DiscountAmount float NOT NULL,  
  30.  ProductStandardCost money NOT NULL,  
  31.  TotalProductCost money NOT NULL,  
  32.  SalesAmount money NOT NULL,  
  33.  TaxAmt money NOT NULL,  
  34.  Freight money NOT NULL,  
  35.  CarrierTrackingNumber nvarchar(25) NULL,  
  36.  CustomerPONumber nvarchar(25) NULL,  
  37.  OrderDate datetime NULL,  
  38.  DueDate datetime NULL,  
  39.  ShipDate datetime NULL  
  40. );  
  41. GO  
  42.   
  43. -- Loading  
  44. DECLARE cur CURSOR   
  45. FOR  
  46. SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  47.       ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  48.       ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  49.       ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  50.       ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  51.   FROM AdventureWorksDW2014.dbo.FactInternetSales;  
  52.   
  53. DECLARE @ProductKey int   
  54.   , @OrderDateKey int  
  55.   , @DueDateKey int  
  56.   , @ShipDateKey int   
  57.   , @CustomerKey int   
  58.   , @PromotionKey int   
  59.   , @CurrencyKey int   
  60.   , @SalesTerritoryKey int   
  61.   , @SalesOrderNumber nvarchar(20)   
  62.   , @SalesOrderLineNumber tinyint   
  63.   , @RevisionNumber tinyint   
  64.   , @OrderQuantity smallint   
  65.   , @UnitPrice money   
  66.   , @ExtendedAmount money   
  67.   , @UnitPriceDiscountPct float   
  68.   , @DiscountAmount float   
  69.   , @ProductStandardCost money   
  70.   , @TotalProductCost money   
  71.   , @SalesAmount money   
  72.   , @TaxAmt money   
  73.   , @Freight money   
  74.   , @CarrierTrackingNumber nvarchar(25)   
  75.   , @CustomerPONumber nvarchar(25)    
  76.   , @OrderDate datetime    
  77.   , @DueDate datetime    
  78.   , @ShipDate datetime  
  79.   
  80. OPEN cur  
  81.   
  82. FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  83.       , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  84.       , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  85.       , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  86.       , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate  
  87.   
  88. WHILE (@@FETCH_STATUS = 0)  
  89. BEGIN  
  90.   
  91.  INSERT INTO dbo.Sales_WithoutGUID  
  92.   (  
  93.    ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  94.     ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  95.     ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  96.     ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  97.     ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  98.   )  
  99.  VALUES  
  100.   (@ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  101.     , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  102.     , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  103.     , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  104.     , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate)  
  105.   
  106.  FETCH NEXT FROM cur INTO @ProductKey, @OrderDateKey, @DueDateKey, @ShipDateKey, @CustomerKey, @PromotionKey  
  107.       , @CurrencyKey, @SalesTerritoryKey, @SalesOrderNumber, @SalesOrderLineNumber, @RevisionNumber  
  108.       , @OrderQuantity, @UnitPrice, @ExtendedAmount, @UnitPriceDiscountPct, @DiscountAmount  
  109.       , @ProductStandardCost, @TotalProductCost, @SalesAmount, @TaxAmt, @Freight  
  110.       , @CarrierTrackingNumber, @CustomerPONumber, @OrderDate, @DueDate, @ShipDate  
  111. END  
  112.   
  113. -- Cleaning  
  114. CLOSE cur  
  115. DEALLOCATE cur  

Now let's analyze and see. The below code has two SELECT statements that check the fragmentation of both tables. As you see, first table is Externally Fragmented and second table is not fragmented. This is the biggest and most common issue with GUIDs. Since there is no sequential values when generating, when inserting records, in order to maintain the clustered key order, SQL Server splits data pages for moving records to new page, causing External Fragmentation (Read more on fragmentation at: http://dinesql.blogspot.com/2013/08/understanding-index-fragmentation-ss.html). Because of this, reading causes accessing larger number of data pages, slowing down the query and consuming resource.

Other issue is the space required. As you see, second table requires only 10MB for maintaining records while first table requires 15MB.

There is another considerable impact on non-clustered indexes if you, will make another post on it.

As you see the disadvantages with GUID now, try to minimize the usage of it if possible. There is a workaround if the order of the key is only the issue. The function NEWSEQUENTIALID can be used instead of NEWID for generating GUIDs but still the second issue discussed is remained unsorted.

  1. SELECT * FROM sys.dm_db_index_physical_stats   
  2.  (DB_ID(), OBJECT_ID(N'dbo.Sales_WithGUID', N'U'), 1, NULL'DETAILED')  
  3. SELECT * FROM sys.dm_db_index_physical_stats   
  4.  (DB_ID(), OBJECT_ID(N'dbo.Sales_WithoutGUID', N'U'), 1, NULL'DETAILED')  
  5.   
  6. EXEC sp_spaceused N'Sales_WithGUID';  
  7. EXEC sp_spaceused N'Sales_WithoutGUID';  


No comments: