Saturday, May 23, 2015

Should I consider Collation when creating a temporary table?

We always create temporary tables for manipulating data for temporary processes and reports. However, unawareness of collation related issues with tempdb might introduce unexpected result with processes and reports, should we really consider it?

What is collation? Collation refers to set of rules that describe how to store and compare data. These rules includes order of characters, case-sensitivity of them, accent-sensitivity of them, kana character types and character width. SQL Server supports on setting collation at instance level, database level or even at column level.

Now, why we need to consider tempdb collation? Have a look on following example. It creates a Database. Note that collation of it is SQL_Latin1_General_CP1_CS_AS (Case Sensitive). Then the table is created and two sample records are added. Assume that this is the business requirement and last SELECT should be based on case and it returns only one record.

USE master;
GO

-- creating database with case-sensitive collation
CREATE DATABASE TestDatabase100 COLLATE SQL_Latin1_General_CP1_CS_AS;
GO

USE TestDatabase100;
GO

-- create a table
CREATE TABLE dbo.TestTable
(
 Id int PRIMARY KEY
 , Name varchar(100)
);
GO

-- insert two records
INSERT INTO dbo.TestTable 
VALUES
(1, 'abc'), (2, 'ABC');

-- Looking only for the second record
-- and this will return ONLY second one
-- as database is case sensitive (collation
-- for the columns are derived from top level)
SELECT * FROM dbo.TestTable WHERE Name = 'ABC';

Assume that same logic has to be implemented with another code using a temporary table, as a part of newly created database.

USE TestDatabase100;
GO

-- create a temporary table
CREATE TABLE #TestTable
(
 Id int PRIMARY KEY
 , Name varchar(100)
);
GO

-- Load data from original table
INSERT INTO #TestTable 
SELECT Id, Name FROM TestTable;

-- Check for result
SELECT * FROM #TestTable WHERE Name = 'ABC';

This will be the result of this.



It returns both records instead of last records. The reason for this behavior is, tempdb collation. Since temporary table has been created inside the tempdb, collation of the tempdb has been used instead of collation of user-defined database. This is why we need to consider the collation of tempdb.

Below code shows how to check the collation of tempdb and how to create the temporary table properly.

-- Checking collation of tempdb
SELECT name, collation_name
FROM sys.databases
WHERE name = 'tempdb'
GO

-- create the temporary table properly
CREATE TABLE #TestTable
(
 Id int PRIMARY KEY
 , Name varchar(100) COLLATE SQL_Latin1_General_CP1_CS_AS
);
GO


No comments: