Monday, February 27, 2017

Do NULL values consume storage in SQL Server tables?

While we were discussing which data types to use and nullable columns, a question was raised: does a NULL consume storage the same way a regular value does? My answer was both yes and no, because it depends on the data type used.

NULL is not exactly a value; it indicates that the value is unknown, hence some bits/bytes are required to keep track of it. However, if I set one of my column values to NULL, can I assume that it will not use the space the data type is supposed to use? For example, if I have a column of type int, which uses 4 bytes per value, and I insert a record with NULL for that column, will SQL Server still use 4 bytes, or just a few bits for the NULL?

It is always better to write some code to test and come to a conclusion. Therefore, let's test this with four tables. The code below creates:
  1. Customer_Without_NullValues_FixedWidthType table
  2. Customer_With_NullValues_FixedWidthType table
  3. Customer_Without_NullValues_VaryWidthType table
  4. Customer_With_NullValues_VaryWidthType table
The last three columns of the first and second tables use the date data type and are all nullable. The last three columns of the third and fourth tables use the varchar(4000) data type and are all nullable.

USE tempdb;
GO

CREATE TABLE dbo.Customer_Without_NullValues_FixedWidthType
(
 Id int identity(1,1) PRIMARY KEY
 , Name varchar(100) NOT NULL
 , DateOfBirth date NULL
 , RegisteredDate date NULL
 , LastPurchasedDate date NULL
);
GO

CREATE TABLE dbo.Customer_With_NullValues_FixedWidthType
(
 Id int identity(1,1) PRIMARY KEY
 , Name varchar(100) NOT NULL
 , DateOfBirth date NULL
 , RegisteredDate date NULL
 , LastPurchasedDate date NULL
);
GO


CREATE TABLE dbo.Customer_Without_NullValues_VaryWidthType
(
 Id int identity(1,1) PRIMARY KEY
 , Name varchar(100) NOT NULL
 , Details1 varchar(4000) NULL
 , Details2 varchar(4000) NULL
 , Details3 varchar(4000)  NULL
);
GO

CREATE TABLE dbo.Customer_With_NullValues_VaryWidthType
(
 Id int identity(1,1) PRIMARY KEY
 , Name varchar(100) NOT NULL
 , Details1 varchar(4000) NULL
 , Details2 varchar(4000) NULL
 , Details3 varchar(4000)  NULL
);
GO

The next code inserts 100,000 records into each table. However, the last three columns of the second and fourth tables are filled with NULLs instead of actual values.

INSERT INTO dbo.Customer_Without_NullValues_FixedWidthType
 (Name, DateOfBirth, RegisteredDate, LastPurchasedDate)
 VALUES
 ('a', getdate(), getdate(), getdate());

INSERT INTO dbo.Customer_With_NullValues_FixedWidthType
 (Name, DateOfBirth, RegisteredDate, LastPurchasedDate)
 VALUES
 ('a', null, null, null);

INSERT INTO dbo.Customer_Without_NullValues_VaryWidthType
 (Name, Details1, Details2, Details3)
 VALUES
 ('a', REPLICATE('a', 4000), REPLICATE('a', 4000), REPLICATE('a', 4000));

INSERT INTO dbo.Customer_With_NullValues_VaryWidthType
 (Name, Details1, Details2, Details3)
 VALUES
 ('a', null, null, null);

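-- GO with a count repeats the preceding batch; this executes the four INSERTs above 100,000 times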
GO 100000

The easiest way to see the space usage is to check the number of pages read when the data is retrieved.
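One simple way to do that (a minimal sketch; any page-count method works) is to enable SET STATISTICS IO and scan each table, then compare the logical reads reported for each:

SET STATISTICS IO ON;

-- Compare the 'logical reads' figure reported for each table
SELECT COUNT(*) FROM dbo.Customer_Without_NullValues_FixedWidthType;
SELECT COUNT(*) FROM dbo.Customer_With_NullValues_FixedWidthType;
SELECT COUNT(*) FROM dbo.Customer_Without_NullValues_VaryWidthType;
SELECT COUNT(*) FROM dbo.Customer_With_NullValues_VaryWidthType;

SET STATISTICS IO OFF;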


As the page counts show, the space usage of the first and second tables is identical regardless of the values stored. In other words, fixed-width data types consume their full defined space whether the value is NULL or not. The third and fourth tables, however, clearly show that this is not the case for variable-width types: when the data type is variable in length, a NULL does not consume the space defined by the type.
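To double-check without reading the tables, sp_spaceused reports the data size per table; a quick sketch (not from the original test, but the same conclusion should appear):

-- Data size should match for the fixed-width pair, but differ for the variable-width pair
EXEC sp_spaceused 'dbo.Customer_Without_NullValues_FixedWidthType';
EXEC sp_spaceused 'dbo.Customer_With_NullValues_FixedWidthType';
EXEC sp_spaceused 'dbo.Customer_Without_NullValues_VaryWidthType';
EXEC sp_spaceused 'dbo.Customer_With_NullValues_VaryWidthType';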

Wednesday, February 8, 2017

How do I set the size of a SqlParameter of SqlCommand for varchar(max)?

If you need to execute a stored procedure that is implemented with a varchar(max) parameter from a .NET application, how do you set the size of the parameter?
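For context, the procedure in question might look something like this hypothetical one (dbo.SaveNote and its body are illustrative assumptions only; nvarchar(max) is used to match the SqlDbType.NVarChar parameter below):

CREATE PROCEDURE dbo.SaveNote
 @Note nvarchar(max)
AS
BEGIN
 SET NOCOUNT ON;
 -- Body omitted; returns the length just to show the parameter arrives intact
 SELECT LEN(@Note) AS NoteLength;
END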

Since the SqlParameter constructor does not accept a string value for the size argument, there is no way to pass max as the size:

// This will not compile
SqlParameter parameterNote = new SqlParameter("Note", SqlDbType.NVarChar, max);

But we can pass -1 as the size, which indicates max:

SqlParameter parameterNote = new SqlParameter("Note", SqlDbType.NVarChar, -1);


Wednesday, August 3, 2016

Arithmetic overflow error converting expression to data type int - it works for a larger number but not for a smaller one

One of my colleagues sent me a simple piece of code with two lines. The first line works without any issue, but the second throws an error. This is the code:

SELECT 9*10000000000000000
SELECT 9*1000000000

This is the output: the first statement returns 90,000,000,000,000,000, while the second fails with "Arithmetic overflow error converting expression to data type int".

What is the issue? The first calculation is much larger than the second, yet it works. The reason is how SQL Server evaluates the expression and decides the data types. Consider the second line: the largest value in the expression is 1,000,000,000, which is lower than the highest value int can hold, 2,147,483,647. Since both 9 and 1,000,000,000 fall into the int range, SQL Server sets the data type of the result to int as well. However, when 1,000,000,000 is multiplied by 9, the result, 9,000,000,000, exceeds that range. That is the reason for the error.

With the first line, 10,000,000,000,000,000 is too large for int, so it is treated as a bigint, and the result is typed as bigint as well. Therefore it works without any issue.
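You can verify the types SQL Server assigns to these literals with SQL_VARIANT_PROPERTY (a quick check, not part of the original code):

-- Returns 'int' for the first literal and 'bigint' for the second
SELECT SQL_VARIANT_PROPERTY(1000000000, 'BaseType');
SELECT SQL_VARIANT_PROPERTY(10000000000000000, 'BaseType');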

If we write the query as below, forcing the operand to bigint, it will work without any issue.

DECLARE @a bigint
SET @A = 10000000000000000
SELECT  9*@A

SET @A = 1000000000
SELECT  9*@A
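Alternatively, casting one operand inline achieves the same result without a variable:

-- The cast promotes the whole expression to bigint
SELECT 9 * CAST(1000000000 AS bigint);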

Saturday, March 7, 2015

How to change the data type of a column of a large table

Here is another question that commonly appears in forums: how to change the data type of a column, specifically in a large table. If you search for it, the most common answer is to handle it with a new column (which is my third method in the code below). However, things can differ from situation to situation, hence I tested a few ways of changing the type against a table that contains 7 million records.

Here are the three ways I tried. The code uses three tables, dbo.Sales_1, dbo.Sales_2, and dbo.Sales_3; the structure and the number of records are the same for all three. Here is the test I did.

-- Method 1
-- ========

-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_1'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)

-- Changing the column type of ProductKey from int to bigint
ALTER TABLE dbo.Sales_1
 ALTER COLUMN ProductKey bigint not null;
-- Result
-- Time spent: 4:47

-- Checking fragmentation again
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_1'), null, null, 'detailed');
-- Result
-- External Fragmentation - 99% (Bad)
-- Internal Fragmentation - 51% (Bad)

-- Rebuilding the index
ALTER INDEX PK__Sales_1 ON dbo.Sales_1 REBUILD;
-- Result
-- Time spent: 3:43

-- **
-- Total time spent: 8:30
-- **



-- Method 2
-- ========

-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_2'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)

-- Rebuilding the index with fill factor 80, leaving 20% free space on each page
-- so that widening ProductKey to bigint can happen in place without page splits
ALTER INDEX PK__Sales_2 ON dbo.Sales_2 REBUILD WITH (FILLFACTOR = 80);
GO
-- Result
-- Time spent: 1:38

-- Changing the column type of ProductKey from int to bigint
ALTER TABLE dbo.Sales_2
 ALTER COLUMN ProductKey bigint not null
-- Result
-- Time spent: 1:13

-- Check for fragmentation
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_2'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.1% (Good)
-- Internal Fragmentation - 85% (Good)

-- **
-- Total time spent: 2:51
-- **



-- Method 3
-- ========

-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_3'), null, null, 'detailed');
-- Result
-- External Fragmentation - 0.3% (Good)
-- Internal Fragmentation - 98% (Good)

-- Add a new column with required data type: bigint
-- Setting it as a nullable column
ALTER TABLE dbo.Sales_3
 ADD NewProductKey bigint null;

-- Changing the recovery model
ALTER DATABASE TestDatabase SET RECOVERY Bulk_Logged;

-- Updating the new column with old column value
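-- Copying in batches of 500,000 rows keeps each transaction (and the log) small;
-- the loop stops once @@ROWCOUNT hits 0, i.e. no NULL rows remain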
DECLARE @Count int = 1

WHILE (@Count != 0)
BEGIN

 UPDATE TOP (500000) dbo.Sales_3 WITH (TABLOCK)
  SET NewProductKey = ProductKey
 WHERE NewProductKey IS NULL

 SET @Count = @@ROWCOUNT
END
-- Result
-- Time spent: 1:23

-- Drop the old column
ALTER TABLE dbo.Sales_3
 DROP COLUMN ProductKey

-- Rename the new column as old column
EXEC sp_rename 'dbo.Sales_3.NewProductKey', 'ProductKey', 'COLUMN';

-- Alter the current column as non-nullable
ALTER TABLE dbo.Sales_3
 ALTER COLUMN ProductKey bigint NOT NULL
-- Result
-- Time spent: 0.03

-- Check the fragmentation of the table
SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('dbo.Sales_3'), null, null, 'detailed');
-- Result
-- External Fragmentation - 99% (Bad)
-- Internal Fragmentation - 51% (Bad)


-- Rebuilding the index
ALTER INDEX PK__Sales_3 ON dbo.Sales_3 REBUILD;
-- Result
-- Time spent: 2:55

-- Change back the recovery model
ALTER DATABASE TestDatabase SET RECOVERY Full;

-- **
-- Total time spent: 4:21
-- **

Here is the summary:

Method                                  Time Spent   Comments
1 - Just alter the column               8:30         Takes a long time
2 - Alter the column with fill factor   2:51         Fastest, probably the best
3 - With a new column                   4:21         Good, but needs extra work

My preference is for the second method (altering the column after rebuilding the index with a fill factor), as it gives the best performance and I do not see any issue with it.