Dinesh's Blog :::: Being Compiled ::::: Execution Plans

Saturday, September 17, 2016

How SQL Server finds the Execution Plan in the Plan Cache Store

You know that SQL Server creates a plan for your query and keeps them in the cache for reusing when it is required. When you make a request again, how SQL Server knows that the plan is exist in the cache or not? What sort of techniques it uses for finding the plan? Let's try to understand it.

There are four main plan cache stores. These plan cache stores maintain different types of plans and plans are reused when the same request is made by users or plans are discarded when SQL Server needs space in the store (Let's discuss the techniques it uses for discarding with another post). These for stores are;

Object plans store for storing plans for procedures, functions and triggers.
SQL plans store for storing plans for ad-hoc queries.
Bound Trees plan store for storing structures created by Algebrizer for views, defaults and constraints.
Extended Stored Procedure plan store for storing plans for Extended Stored Procedures.

A store is a Hash Table and it contains series of Buckets. A Bucket contains zero or more plans and the Hash value is calculated;

Using databaseId and objectId for Object plan, Bound Tree plan and Extended Stored Procedure plan stores.
Using databaseId and the statement for SQL plan store.

The assignment of execution plan to store bucket is done using the Hash Value. The Plan Handle (or Cache Key) which is made up with various attributes, is used for referencing the plan in the cache.

When you make a request to SQL Server, it does following steps for find the plan in the cache;

Generate the Hash Value for the request for the appropriate store.
Search all execution plans in the bucket along with the cache key.
If found, it uses it, else create a new plan.

For understanding the process, let's check with a simple code;

-- Clear the cache
DBCC FREEPROCCACHE

-- Checking the number of buckets for each store
-- and how many have been used.
SELECT cc.name, buckets_count, buckets_in_use_count
FROM sys.dm_os_memory_cache_hash_tables AS ht 
 JOIN sys.dm_os_memory_cache_counters AS cc 
  ON ht.cache_address = cc.cache_address 
WHERE cc.name IN ('Object Plans','SQL Plans','Bound Trees','Extended Stored Procedures');

SELECT * FROM Sales.SalesOrderHeader
WHERE OrderDate = '01/06/2014';

SELECT * FROM Sales.SalesOrderHeader
WHERE OrderDate = '01/07/2014';

-- Check the cached plans
-- You will see two plans for above two statements
SELECT p.*, t.text
FROM sys.dm_exec_cached_plans p  
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t  
WHERE --p.usecounts > 0 AND  
 text LIKE '%SELECT * FROM Sales.SalesOrderHeader%'  
 AND text NOT LIKE '%SELECT p.*%'
ORDER BY p.usecounts DESC;

-- Check how the cache key is formed
-- Note the two bucket ids
-- And note that only object id is different
-- For SQLs, it is the internal hash of the batch text
-- not an id in sys.objects table
SELECT p.bucketid, t.text  , ph.*
FROM sys.dm_exec_cached_plans p  
 CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t  
 CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) AS ph
WHERE --p.usecounts > 0 AND  
 text LIKE '%SELECT * FROM Sales.SalesOrderHeader%'  
 AND text NOT LIKE '%SELECT p.*%'
 AND is_cache_key = 1
ORDER BY p.usecounts DESC;

Monday, May 11, 2015

I want an execution plan optimized for my value, not for the value passed for parameters with first execution.

Since I got few questions on the posts written on stored procedure in last two weeks, thought to make few more posts on the same based on questions asked. One question asked was Can we make sure that the stored procedure is optimized for one range of values, not for exceptions. If we need to get the stored procedure optimized for one set of values, then instruction has to be passed during the creation. We should only do this if we know that there are different best plans for different values.

I have written a post called How to determine whether RECOMPILE is required? If so, at the creation or execution? and it shows the way of seeing whether different best plans are exist for different values. Have a look on it if you want to know the way.

Let's see how we can optimize a procedure for our values, not for values passed with first execution. Below code creates a stored procedure, an index on OrderDate.

-- Clearing the cache  
DBCC FREEPROCCACHE  
GO  
  
USE AdventureWorks2014;  
GO  
  
-- Drop if it is exist  
IF OBJECT_ID('dbo.GetOrderDetails', 'P') IS NOT NULL  
 DROP PROC dbo.GetOrderDetails;  
GO  
  
-- Create the procedure  
CREATE PROC dbo.GetOrderDetails @FromDate datetime, @ToDate datetime  
AS  
    SELECT h.* 
 FROM Sales.SalesOrderHeader h
 WHERE h.OrderDate BETWEEN @FromDate AND @ToDate
 ORDER BY h.OrderDate
GO  

-- Create an index on OrderDate
CREATE INDEX IX_SalesOrderHeader_OrderDate ON Sales.SalesOrderHeader (OrderDate DESC)
GO

Nest code executes the stored procedure with three different values. As you see, WITH RECOMPILE has been added to see whether same plan or different plans are used for the execution.

EXEC dbo.GetOrderDetails '2014-06-29T00:00:00', '2014-06-30T00:00:00' WITH RECOMPILE;  
EXEC dbo.GetOrderDetails '2014-06-26T00:00:00', '2014-06-30T00:00:00' WITH RECOMPILE;  
EXEC dbo.GetOrderDetails '2014-06-01T00:00:00', '2014-06-30T00:00:00' WITH RECOMPILE;  
GO

Here are the plans generated.

As you see, plan is same for first two executions, and a different plan is for third execution. The reason for this is, first and second one try to get orders within few number of days and third one is for many days, hence newly added index is used for first and second one with seek operation and clustered index is used for third one with scanning.

Assume that most requests made on this procedure are based on small number of days and we need to optimize it for small number of days. If so, we need to alter the procedure instructing that create a plan optimized for a sample range that contains small number of days. If we do not do it and the first execution is based on large number of days, plan will be created for it and the consecutive executions will use the plan created even for small number of days which is not the optimized plan for them.

Okay, how can we instruct then? Here is the way.

-- Alter the procedure  
ALTER PROC dbo.GetOrderDetails @FromDate datetime, @ToDate datetime  
AS  
    SELECT h.* 
 FROM Sales.SalesOrderHeader h
 WHERE h.OrderDate BETWEEN @FromDate AND @ToDate
 ORDER BY h.OrderDate
 OPTION (OPTIMIZE FOR (@FromDate = '2014-06-29T00:00:00', @ToDate = '2014-06-30T00:00:00'))
GO 

-- Execute with a large range and see
EXEC dbo.GetOrderDetails '2014-01-01T00:00:00', '2014-06-30T00:00:00';  
GO

Here is the plan;

I have used OPTIMIZE FOR hint by passing two values for parameters. Note that values are set for a small range and this instructs engine to create the plan based on these values without considering the values submitted with executions. As you see, when the procedure is executed even with a wide range, still engine uses the plan optimized for small range.

Read more on this hint at: https://msdn.microsoft.com/en-us/library/ms181714.aspx.