Thursday, October 21, 2010

TechInsights – Malaysia – Session Ratings

I just received ratings for sessions I delivered in Malaysia, glad to see that the sessions have been useful and valuable for participants. Ratings received are as follows;

Session: Strengthening DW/BI system loading
Overall Rating: 7.0

Session: Simplifying Master Data Management with SQL Server 2008 R2
Overall Rating: 6.8

Thanks for Ervin and Lilian for inviting me for TechInsights, thanks for the support given by all event organizer, and thanks for all feedbacks from participants.

Wednesday, October 20, 2010

Understanding Ownership Chaining and Impersonation

Being a DBA or DBE, understanding how Ownership Chains work and how impersonate another principal for changing the execution context are really important. This topic was discussed while I was conducting the course 6232A on last Saturday, thought to blog on it as it would be useful to you all.

What is Ownership Chain?
When we perform an action with set of statements, objects used are accessed sequentially and this sequence is called as a Chain. Once the chain is established,  permissions on objects in the chain are evaluated differently, not the way it checks permissions when they are separately accessed. If the objects accessed in the chain are owned by a same owner, Ownership Chain is established. Within the ownership chain, SQL Server checks the permission for the first object and skips checking on others. This is the advantage of Ownership Chain because it offers slight performance benefit skipping permission checks on other objects. If SQL Server finds a broken ownership chain, it starts evaluating permission on them.

What is Impersonation?
SQL Server Impersonation means, allowing to change the execution context with another principal. Simply, perform some actions (execute set of queries) using someone else privileges. This can be done explicitly by using the stand-alone EXECUTE AS statement or implicitly by using the EXECUTE AS clause on modules.

Let’s first check with Ownership Chaining and then go for Impersonation. I did this in a TestDatabase, you can do the same. The first code snippet creates two logins and two users. Note that SalesPerson is added to db_owner role.

-- Create two logins
CREATE LOGIN SalesPerson WITH PASSWORD = '123'
, CHECK_POLICY = OFF
GO
CREATE LOGIN MarketingPerson WITH PASSWORD = '123'
, CHECK_POLICY = OFF
GO
 
-- Create users for logins.
USE TestDatabase
GO
CREATE USER SalesPerson FROM LOGIN SalesPerson
GO
sp_addrolemember 'db_owner', 'SalesPerson'
GO
 
CREATE USER MarketingPerson FROM LOGIN MarketingPerson
GO

With next code you can see that Impersonation has been done explicitly using EXECUTE AS. What I do here is, impersonate SalesPerson and then create a schema and a table. Note that SalesPerson becomes the owner of these objects.

EXECUTE AS USER = 'SalesPerson'
GO
CREATE SCHEMA Sales
GO
 
CREATE TABLE Sales.Sales
(
    Date date
    , Amount money
)
GO
 
REVERT
GO

Next code impersonates MarketingPerson and sees whether he can access the table Sales.Sales. Since he has no permission on this table, error is thrown for SELECT query.

The SELECT permission was denied on the object 'Sales', database 'TestDatabase', schema 'Sales'.

EXECUTE AS USER = 'MarketingPerson'
GO
SELECT * FROM Sales.Sales
 
REVERT
GO

Let’s check Ownership Chaining. With next code, SalesPerson creates a stored procedure that returns data from Sales.Sales, and he grants EXECUTE permission to MarketingPerson.

EXECUTE AS USER = 'SalesPerson'
GO
 
CREATE PROC Sales.GetSales
AS
    SELECT * FROM Sales.Sales
GO
 
GRANT EXECUTE ON Sales.GetSales TO MarketingPerson
 
REVERT
GO

We know that MarketingPerson has no read permission on Sales.Sales table but with next code, you can see that he can read the table through Sales.GetSales procedure.

EXECUTE AS USER = 'MarketingPerson'
GO
-- MarketingPerson can access the table
-- through the SP
EXEC Sales.GetSales
 
-- This throws the same error
SELECT * FROM Sales.Sales
 
REVERT
GO

The reason for this is, the Ownership Chain. As MarketingPerson has EXECUTE permission, he can execute the SP. Although he has no permission on Sales.Sales, SQL Server does not evaluate permission on it for MarketingPerson because it is owned by the same owner who owns Sales.GetSales. Note that we have not explicitly set owners for both Sales.GetSales and Sales.Sales, hence owners of them are considered as Sales schema’s owner; SalesPerson. Below code shows it;

SELECT USER_NAME( OBJECTPROPERTY(OBJECT_ID('Sales.Sales'), 'OwnerId'))

If the SP contains an object owned by someone else, it breaks Ownership Chain. Let’s test this too. Below code adds a new table, note that no impersonation is established, so, it is created by dbo. Since it is under Sales schema too, the owner of the schema (SalesPerson) becomes the owner of the new table.

CREATE TABLE Sales.SalesCommisions
(
    SaleNumber int
    , Amount money
)
GO

Next, the code of the SP is altered by SalesPerson. Now it contains both tables but still MarketingPerson can execute the SP without any issues even though he has no permission on new table.

EXECUTE AS USER = 'SalesPerson'
GO
 
ALTER PROC Sales.GetSales    
AS
BEGIN
    SELECT * FROM Sales.Sales
    SELECT * FROM Sales.SalesCommisions
END
GO
 
REVERT
GO
 
EXECUTE AS USER = 'MarketingPerson'
GO
-- Can see both tables
EXEC Sales.GetSales
 
-- Both statement throw errors
SELECT * FROM Sales.Sales
SELECT * FROM Sales.SalesCommisions
 
REVERT
GO

Still the Ownership Chain works. Let’s break it. Let’s change the owner of the new table Sales.SalesCommisions as dbo.

ALTER AUTHORIZATION ON OBJECT::Sales.SalesCommisions TO dbo

Now if MarketingPerson executes the SP, he gets an error for Sales.SalesCommisions table:
The SELECT permission was denied on the object 'SalesCommisions', database 'TestDatabase', schema 'Sales'.

The reason is, the second object, Sales.SaleCommisions is not owned by the same person who owns the SP, hence SQL Server checks and sees whether MarketingPerson has permission on Sales.SaleCommissions.

How can we let MarketingPerson to execute the SP without giving explicit permission on Sales.SaleCommissions? Impersonation is the answer. We can use EXECUTE AS explicitly with the stored procedure, instructing SQL Server to execute the stored procedure with someone else privileges without using CALLER’s privileges. See the code below;

EXECUTE AS USER = 'SalesPerson'
GO
 
ALTER PROC Sales.GetSales    
WITH EXECUTE AS 'SalesPerson'
AS
BEGIN
    SELECT * FROM Sales.Sales
    SELECT * FROM Sales.SalesCommisions
END
GO
 
REVERT
GO

Now if MarketingPerson executes the SP, he will see the result without seeing an error because the execution context for the SP’s execution is set with SalesPerson not with MarketingPerson.

Monday, October 4, 2010

HP Business Intelligence Sizer for Microsoft SQL Server

One of the challenges in Business Intelligence solutions is sizing the infrastructure. Many go with various calculations and assumptions and decide the infrastructure but later find that the capacity selected is not sufficient. One reason could be, most try to define the required hardware spec at the planning or proposal stage, without considering the workloads and processing them. If you plan for HP box for your BI solution, you can use the HP Business Intelligence Sizer for determining the spec. It interviews you for gathering factors related to your solution and provides you a recommendation;

HP

I feel that this is not for small type of BI solution and suitable for large-scale solutions but you can try and see.

Here is the link for downloading the tool: http://h71019.www7.hp.com/activeanswers/us/en/sizers/microsoft-sql-bi.html

Sunday, October 3, 2010

Looking for a BI Product suite? Have a look on “The BI Survey 9”

Business Application Research Center – BARC publishes various types of survey reports; has published “The BI Verdict”, the latest BI survey report that contains comprehensive collection of in-depth BI product evaluations and market analysis. Here is a quote from the site:

“The BI Survey 9 uncovers the truth about why real-world organizations select BI products, what they do with them, and why they eventually abandon them.  It also compares the leading products on the market across a number of key criteria such as performance, scalability and vendor support.”

You need to spend around USD 4995 for purchasing this survey. It will be worthwhile to spend for it if you are planning for buying a BI suite for your organization.

Here is the link: http://www.bi-survey.com/home.html

Thursday, September 30, 2010

PivotViewer Extension for SQL Server Reporting Services

You might have already seen Microsoft Silverlight PivotViewer for Excel (see my post on it). Just like that you are going to see PivotViewer for SSRS. Good news? Yes, but there is a bad news too, it is not yet available for downloading, it a kind of prototype, here how it goes;

pp

Anyway, have a look on this video, this shows it.

Wednesday, September 29, 2010

SQL Server 2008 Service Pack 2 is available for downloading

SQL Server 2008 SP 2 is available for downloading now. Enhancements includes;

  • 15K partitioning Improvement
  • Reporting Services in SharePoint Integrated Mode
  • SQL Server 2008 R2 Application and Multi-Server Management Compatibility with SQL Server 2008
  • Data-tier Application (DAC) Support
Read more info on this and download from here.

Table variable slower than Temporary table???

I used to use table variables when the resultset is smaller and want to hold it temporarily. The main reason for this selection is, data manipulation with table variables is faster than manipulation with temporary tables when the container contains a smaller amount of data. There were some instances, specially on JOINs, table variables show slowness, have been thinking to investigate but had no time. Thought start investigating on it again when I saw this post of Jack Li.

Simple Test
Let me start with an example, similar to an issue I had. The following code creates a table variable and loads sales data into it. Later, the table variable joins with a permanent table, to get a resultset. Note that there is no logic on the query, this is just for testing. Turn Include Actual Execution Plan on and run the query below.

DECLARE @TestTable TABLE 
(
    SalesOrderId int 
    , SalesOrderDetailID int
    , LineTotal money 
    , PRIMARY KEY (SalesOrderId, SalesOrderDetailID)
)
 
INSERT INTO @TestTable
SELECT SalesOrderID
    , SalesOrderDetailID
    , LineTotal 
FROM Sales.SalesOrderDetail
 
SELECT t.SalesOrderId, s.UnitPrice
FROM @TestTable t
INNER JOIN Sales.SalesOrderDetail s
    ON t.SalesOrderId = s.SalesOrderID
        AND t.SalesOrderDetailID = s.SalesOrderDetailID

plan
Now investigate the execution plan for SELECT query. See Estimated Number of Rows, it shows as 1 which is incorrect. This causes Optimizer to use Nested Loops instead of Merge Join and reduces the performance of data retrieval.

Is Merge Join better than Nested Loops?
According to the BOL, Query Optimizer determines the cost of query plan and selects based on two factors;

  • The total number of rows processed at each level of a query plan, referred to as the cardinality of the plan.
  • The cost model of the algorithm dictated by the operators used in the query.

The second factor depends on first one. Therefore having a correct cardinality will make sure that faster execution plan is generated. As the plan generated for our code does not provide a correct cardinality, the algorithm used for joining is not a optimized way to join two results.

Optimizer selects Nested Loop if one of tables contain smaller number of records, based on Estimated Number of Rows, that is the reason for selecting Nested Loop algorithm. But the reality is, number of records in variable table is not small (in this case).

Merge Join works differently. Usually Optimizer selects this algorithm when both resultsets are sorted, in our case, yes, it is. The matching operation for this join works differently and it is usually faster than Nested Loop. For more info, read this.

Using #Temporary Table
The below code does the same, only different is a temporary table has been used instead of table variable.

USE AdventureWorks2008
GO
 
CREATE TABLE #TestTable 
(
    SalesOrderId int 
    , SalesOrderDetailID int
    , LineTotal money
    ,CONSTRAINT pk_TestTable PRIMARY KEY
        (SalesOrderId, SalesOrderDetailID)
)
 
INSERT INTO #TestTable
SELECT SalesOrderID
    , SalesOrderDetailID
    , LineTotal 
FROM Sales.SalesOrderDetail
 
SELECT t.SalesOrderId, s.UnitPrice
FROM #TestTable t
INNER JOIN Sales.SalesOrderDetail s
    ON t.SalesOrderId = s.SalesOrderID
        AND t.SalesOrderDetailID = s.SalesOrderDetailID

plan2

Now you can see that cardinality of the plan is correct and Merge Join is used as the algorithm for joining two results.

A reason for wrong “Estimated Number of Rows” with table variable
As Jack Li explains, when the code is compiled as a batch, since the table variable does not contain data at the initial stage, optimizer makes an assumption, saying that table variable contains very low number of data, resulting a plan with Nested Loop. One other reason could be unavailability of statistics. SQL Server does not maintain statistics for table variables.

Solving the issue
Again, solution from Jack Li, all we have to do is, recompiling the SELECT before executing the SELECT code. At this time table variable contains data, hence Estimated Number of Rows is correctly determined. See the code below;

SELECT t.SalesOrderId, s.UnitPrice
FROM @TestTable t
INNER JOIN Sales.SalesOrderDetail s
    ON t.SalesOrderId = s.SalesOrderID
        AND t.SalesOrderDetailID = s.SalesOrderDetailID
OPTION (RECOMPILE)    

plan3

There will be a cost for recompiling, hence always compare the cost and the improvement, then select the best way.