Monday, December 28, 2015

GUID or Uniqueidentifier as the key - II - Issues with it

My first post related to GUID discussed the main issue we face when using GUID as the key, specifically as the Clustered Key. Here is another concern we should consider when maintaining a table with GUID as the key. This issue does need to be considered as a critical issue if the table contains low number of records. But if it contains millions of records, this is something we need to consider for sure.

Let's create two tables just as the first post. The below code creates two tables named dbo.Sales_WithGUID and dbo.Sales_WithoutGUID and loads same data set for both tables.

  1. USE tempdb;  
  2. GO  
  3.   
  4. IF OBJECT_ID('dbo.Sales_WithGUID'IS NOT NULL  
  5.  DROP TABLE dbo.Sales_WithGUID;  
  6. GO  
  7.   
  8. -- Creating the table  
  9. CREATE TABLE dbo.Sales_WithGUID  
  10. (  
  11.  SalesKey uniqueidentifier PRIMARY KEY DEFAULT(NewID()),  
  12.  ProductKey int NOT NULL,  
  13.  OrderDateKey int NOT NULL,  
  14.  DueDateKey int NOT NULL,  
  15.  ShipDateKey int NOT NULL,  
  16.  CustomerKey int NOT NULL,  
  17.  PromotionKey int NOT NULL,  
  18.  CurrencyKey int NOT NULL,  
  19.  SalesTerritoryKey int NOT NULL,  
  20.  SalesOrderNumber nvarchar(20) NOT NULL,  
  21.  SalesOrderLineNumber tinyint NOT NULL,  
  22.  RevisionNumber tinyint NOT NULL,  
  23.  OrderQuantity smallint NOT NULL,  
  24.  UnitPrice money NOT NULL,  
  25.  ExtendedAmount money NOT NULL,  
  26.  UnitPriceDiscountPct float NOT NULL,  
  27.  DiscountAmount float NOT NULL,  
  28.  ProductStandardCost money NOT NULL,  
  29.  TotalProductCost money NOT NULL,  
  30.  SalesAmount money NOT NULL,  
  31.  TaxAmt money NOT NULL,  
  32.  Freight money NOT NULL,  
  33.  CarrierTrackingNumber nvarchar(25) NULL,  
  34.  CustomerPONumber nvarchar(25) NULL,  
  35.  OrderDate datetime NULL,  
  36.  DueDate datetime NULL,  
  37.  ShipDate datetime NULL  
  38. );  
  39. GO  
  40.   
  41. INSERT INTO dbo.Sales_WithGUID  
  42.   (  
  43.    ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  44.     ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  45.     ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  46.     ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  47.     ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  48.   )  
  49. SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  50.       ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  51.       ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  52.       ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  53.       ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  54. FROM AdventureWorksDW2014.dbo.FactInternetSales;  
  55.   
  56.   
  57. IF OBJECT_ID('dbo.Sales_WithoutGUID'IS NOT NULL  
  58.  DROP TABLE dbo.Sales_WithoutGUID;  
  59. GO  
  60.   
  61. -- Create the second table  
  62. CREATE TABLE dbo.Sales_WithoutGUID  
  63. (  
  64.  SalesKey int identity(1,1),  
  65.  CompanyCode char(2) Default ('SL'),  
  66.  constraint pk_Sales_WithoutGUID primary key (SalesKey, CompanyCode),  
  67.  ProductKey int NOT NULL,  
  68.  OrderDateKey int NOT NULL,  
  69.  DueDateKey int NOT NULL,  
  70.  ShipDateKey int NOT NULL,  
  71.  CustomerKey int NOT NULL,  
  72.  PromotionKey int NOT NULL,  
  73.  CurrencyKey int NOT NULL,  
  74.  SalesTerritoryKey int NOT NULL,  
  75.  SalesOrderNumber nvarchar(20) NOT NULL,  
  76.  SalesOrderLineNumber tinyint NOT NULL,  
  77.  RevisionNumber tinyint NOT NULL,  
  78.  OrderQuantity smallint NOT NULL,  
  79.  UnitPrice money NOT NULL,  
  80.  ExtendedAmount money NOT NULL,  
  81.  UnitPriceDiscountPct float NOT NULL,  
  82.  DiscountAmount float NOT NULL,  
  83.  ProductStandardCost money NOT NULL,  
  84.  TotalProductCost money NOT NULL,  
  85.  SalesAmount money NOT NULL,  
  86.  TaxAmt money NOT NULL,  
  87.  Freight money NOT NULL,  
  88.  CarrierTrackingNumber nvarchar(25) NULL,  
  89.  CustomerPONumber nvarchar(25) NULL,  
  90.  OrderDate datetime NULL,  
  91.  DueDate datetime NULL,  
  92.  ShipDate datetime NULL  
  93. );  
  94. GO  
  95.   
  96. INSERT INTO dbo.Sales_WithoutGUID  
  97.   (  
  98.    ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  99.     ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  100.     ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  101.     ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  102.     ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  103.   )  
  104. SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey  
  105.       ,CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber  
  106.       ,OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount  
  107.       ,ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight  
  108.       ,CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate  
  109. FROM AdventureWorksDW2014.dbo.FactInternetSales;  

Let's check the space used by both tables.


As you see, first table which is based on GUID consumes 10.7MB for data and 80KB for indexes. Second table which is based on int+char(2) as the key consumes 10.0MB for data and 56KB for indexes. Now, let's additional non-clustered indexes for both tables.

  1. CREATE INDEX IX_Sales_WithGUID_ShipDateKey ON dbo.Sales_WithGUID (ShipDateKey);  
  2. CREATE INDEX IX_Sales_WithGUID_CurrencyKey ON dbo.Sales_WithGUID (CurrencyKey);  
  3. CREATE INDEX IX_Sales_WithGUID_CustomerKey ON dbo.Sales_WithGUID (CustomerKey);  
  4. CREATE INDEX IX_Sales_WithGUID_DueDateKey ON dbo.Sales_WithGUID (DueDateKey);  
  5. CREATE INDEX IX_Sales_WithGUID_OrderDateKey ON dbo.Sales_WithGUID (OrderDateKey);  
  6. CREATE INDEX IX_Sales_WithGUID_ProductKey ON dbo.Sales_WithGUID (ProductKey);  
  7. CREATE INDEX IX_Sales_WithGUID_PromotionKey ON dbo.Sales_WithGUID (PromotionKey);  
  8.   
  9. CREATE INDEX IX_Sales_WithoutGUID_ShipDateKey ON dbo.Sales_WithoutGUID (ShipDateKey);  
  10. CREATE INDEX IX_Sales_WithoutGUID_CurrencyKey ON dbo.Sales_WithoutGUID (CurrencyKey);  
  11. CREATE INDEX IX_Sales_WithoutGUID_CustomerKey ON dbo.Sales_WithoutGUID (CustomerKey);  
  12. CREATE INDEX IX_Sales_WithoutGUID_DueDateKey ON dbo.Sales_WithoutGUID (DueDateKey);  
  13. CREATE INDEX IX_Sales_WithoutGUID_OrderDateKey ON dbo.Sales_WithoutGUID (OrderDateKey);  
  14. CREATE INDEX IX_Sales_WithoutGUID_ProductKey ON dbo.Sales_WithoutGUID (ProductKey);  
  15. CREATE INDEX IX_Sales_WithoutGUID_PromotionKey ON dbo.Sales_WithoutGUID (PromotionKey);  

Here is the result now.


Focus on the space used for indexes. First table uses 11MB while second table uses only 7MB. Note that this is only for 60,000 records but if the table has more records, gap will be significantly huge.

How can it be happened when we have same number of records and same types of indexes. This is because of every non-clustered key value is maintained with the clustered key. When it comes to the first table, since it is 16 bytes, the size of the non-clustered key value will be 16+4. But with the second table, it is 6+4 bytes. This is the reason for high usage of the spave for non-clustered indexes.

Here is the code that shows the number of data pages used by each index. Large number of pages increases IO operations and usage of resources.

  1. SELECT OBJECT_NAME (i.object_id) AS TableName   
  2.  , i.name AS IndexName  
  3.  , ps.index_id AS IndexID   
  4.  , ps.alloc_unit_type_desc AS DataStructure  
  5.  , ps.page_count AS Pages  
  6.  , ps.record_count AS Rows  
  7. FROM sys.indexes AS i  
  8.  CROSS APPLY sys.dm_db_index_physical_stats   
  9.   (DB_ID ()  
  10.   , i.object_id  
  11.   , NULL  
  12.   , NULL  
  13.   , N'DETAILED'AS ps  
  14. WHERE i.object_id = ps.object_id  
  15.   AND i.index_id = ps.index_id  
  16.   AND i.object_id   
  17.    IN (OBJECT_ID (N'Sales_WithGUID')  
  18.     , OBJECT_ID (N'Sales_WithoutGUID'))  
  19.   AND ps.index_level = 0 -- only lef level  
  20. ORDER BY TableName, IndexID;  
  21. GO  


Now you see issue we may face when using GUIDs, hence use it knowing these issues and manage it properly if it is really required.

No comments: