Here is another common question appeared in forums, how to change the data type of column, specifically a large table. If you search for it, the most common one is, handling this with a new column (which is my 3rd way in the code). However, situation to situation, things can be different, hence tested few ways of changing the type with a table that contains 7 millions records.
Here are 3 ways I tried with. The code uses 3 tables called dbo.Sales_1, dbo.Sales_2, and dbo.Sales_3. The structure and the number of records are same for all 3 tables. Here is the test I did.
-- Method 1
-- ========
-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_1'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)
-- Changing the column type of Product key from int to bigint
ALTER TABLE dbo.Sales_1
ALTER COLUMN ProductKey bigint not null;
-- Result
-- Time spent: 4:47
-- Checking fragmentation again
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_1'), null, null, 'detailed');
-- Result
-- External Fragmentation - 99% (Bad)
-- Internal Fragmentation - 51% (Bad)
-- Rebulding the index
ALTER INDEX PK__Sales_1 ON dbo.Sales_1 REBUILD;
-- Result
-- Time spent: 3:43
-- **
-- Total time spent: 8:30
-- **
-- Method 2
-- ========
-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_2'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)
-- Rebuilding the index with filfactor 80.
ALTER INDEX PK__Sales_2 ON dbo.Sales_2 REBUILD WITH (FILLFACTOR = 80);
GO
-- Result
-- Time spent: 1:38
-- Changing the column type of Product key from int to bigint
ALTER TABLE dbo.Sales_2
ALTER COLUMN ProductKey bigint not null
-- Result
-- Time spent: 1:13
-- Check for fragmentation
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_2'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.1% (Good)
-- Internal Fragmentation - 85% (Good)
-- **
-- Total time spent: 2:51
-- **
-- Method 3
-- ========
-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_3'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)
-- Add a new column with required data type: bigint
-- Setting it as a nullable column
ALTER TABLE dbo.Sales_3
ADD NewProductKey bigint null
-- Changing the recovery model
ALTER DATABASE TestDatabase SET RECOVERY Bulk_Logged;
-- Updating the new column with old column value
DECLARE @Count int = 1
WHILE (@Count != 0)
BEGIN
UPDATE TOP (500000) dbo.Sales_3 WITH (TABLOCK)
SET NewProductKey = ProductKey
WHERE NewProductKey IS NULL
SET @Count = @@ROWCOUNT
END
-- Result
-- Time spent: 1:23
-- Drop the old column
ALTER TABLE dbo.Sales_3
DROP COLUMN ProductKey
-- Rename the new column as old column
sp_rename 'dbo.Sales_3.NewProductKey', 'ProductKey', 'COLUMN';
-- Alter the current column as non-nullable
ALTER TABLE dbo.Sales_3
ALTER COLUMN ProductKey bigint NOT NULL
-- Result
-- Time spent: 0.03
-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_3'), null, null, 'detailed');
-- Result
-- External Fragmentation - 99% (Bad)
-- Internal Fragmentation - 51% (Bad)
ALTER INDEX PK__Sales_3 ON dbo.Sales_4 REBUILD;
-- Result
-- Time spent: 2:55
-- Change back the recovery model
ALTER DATABASE TestDatabase SET RECOVERY Full;
-- **
-- Total time spent: 4:21
-- **
Here is the summary of it;
| Method | Time Spent | Comments |
|---|---|---|
| 1 - Just alter the column | 8:30 | Takes long time |
| 2 - Alter the column with fill-factor | 2:51 | Fastest, Probably the best |
| 3 - With a new column | 4:21 | Good, but need more extra work |
My preference is for the highlighted one as it gives the best performance and I do not see any issue with it.
No comments:
Post a Comment