Saturday, April 23, 2016

How SQL Server handles our requests for modifications

This is not something new with latest Microsoft SQL Server but this is still being discussed and it is unknown or unclear to many of database developers. While I was discussing transaction log of SQL Server database with few, as a part of it, how SQL Server accepts our requests and modifies records was discussed. Thought it is useful to everyone, hence making a post on it;

What really happens when we send a request to SQL Server? It can be an update or a delete. Request might be related to one record or many records. Have a look on below image;


This starts with the request. Either using an application or connecting directly to SQL Server using something like SSMS, we send the request. Once SQL Server received the request, it checks data pages related to the records. If data pages required are not in the memory (or buffer cache), it loads relevant data pages from the disk to memory. Then, remember, it modifies records in pages that are in the memory, not pages in the disk. That is what 1st and 2nd steps in the image explains.

Third step is, updating the transaction log in the disk. Once the page (or pages) in the memory are modified, they become dirty pages. Then SQL Server writes redo and undo information to the log file. During this update, pages related are locked until the transaction log is completely updated.

Once the log is updated, the acknowledgement is sent to the application. Note that, even though the data files are not updated, we receive a message saying records are successfully updated. But we do not want worry, even something happen after we received the message, SQL Server can recover committed records, making sure durability which is one of the properties of the transaction, is satisfied with SQL Server.

Later, after one ore more transactions, a process called Checkpoint writes all dirty pages back to the disk, updating data files. This is how SQL Server handles our update requests.

Thursday, April 21, 2016

SQL Server 2016 - Manipulating data in Azure Storage using PolyBase

When PolyBase is opened with SQL Server editions other than APS or PDW, I tested it with SQL Server 2016 CTP 2.1. It had some issues but was able to get it worked with some workaround. Thought to do the same with RC3 since there are many enhancements, and it works fine now. If you are interested in Polybase, have a look on the post I have written in June 2015; SQL Server 2016 features: Polybase: SQL Server connector to Hadoop.

What is PolyBase? It is a feature that is available with SQL Server 2016. It facilitates us to use TSQL against data stored in Hadoop or Azure Blob Storage, allowing us to combine structured data with semi-structured data. Data warehousing benefits a lot from this as it reduces the time spending on ETLing and supports real-time analysis. Other than that it can be used with OLTP databases and can be used for archiving data as well.

Currently this feature is available with SQL Server 2016 RC3, I did not come across Edition Comparison related to 2016, hence no idea which Edition will support with final release. Note that there are few prerequisites like .Net framework 4.5 or later, Oracle Java SE runtime. Make sure you have installed all required before installing SQL Server and enabling Polybase. You can read a post I wrote on installation SQL Server RC3: Installing SQL Server 2016 RC1 - Sharing issues.

Generally, we should install PolyBase feature as Standard-alone Polybase. If you want to make a collection of SQL Server instances as a PolyBase Scale-Out Group, then select PolyBase Scale-Out Group during the installation.

PolyBase support two Hadoop providers: Hortonwork’s Data Platform (HDP) and Cloudera’s CDH either on Linux or Windows. In addition to that it can connect with Azure Blob Storage too.

Let's start with a simple one. I have uploaded a file to my Azure Storage. This file holds small number of CDR records and let's see how we can read these records using TSQL.



In order to access this file, we need to make sure PolyBase is configured properly. Make sure two new services: SQL Server PolyBase Data Movement and SQL Server PolyBase Engine are running. In addition to that, make sure that it is enabled and connectivity is set as below.


For understanding the value that needs to be set with Hadoop Connectivity, read this.

Let's create a database and configure Data Source. Note that, we need to create a Database Credential first for using it with the Data Source.

CREATE DATABASE Telco;
GO

USE Telco;
GO

-- Create a master key to encrypt credential secret.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Pa$$w0rd';

-- Create a database scoped credential (Azure Storage).
-- Enter your azure storage primary key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential WITH 
        IDENTITY = 'AzureStorageUser'
  , Secret = 'primary key';

-- create the external data source
CREATE EXTERNAL DATA SOURCE DinesqlStorage
WITH
(
 TYPE = HADOOP,
 LOCATION = 'wasbs://cdrcontainer@dinesqlstorage.blob.core.windows.net/'
 , CREDENTIAL = AzureStorageCredential
);


Next step is creating a File Format for the file we gonna read. PolyBase supports Delimited Text, Hive RCFile, Hive ORC, and Parquet. Let's create the File Format for our file.

-- Create an external file format (delimited text file).
CREATE EXTERNAL FILE FORMAT CDRFileFormat WITH (
        FORMAT_TYPE = DELIMITEDTEXT, 
        FORMAT_OPTIONS (FIELD_TERMINATOR =';', 
                USE_TYPE_DEFAULT = False)
);


Last step is creating an External Table matching with the file uploaded combining the File Format. Read more on it with CREATE EXTERNAL TABLE. Here is the code I used.

-- create the table using file format created
-- and for the file uploaded
CREATE EXTERNAL TABLE CDRData
(
 MSIDN nvarchar(100)
 , IMSI nvarchar(100)
 , IMEI nvarchar(100)
 , [Plan] nvarchar(10)
 , CALL_TYPE nvarchar(20)
 , CORRESP_TYPE nvarchar(10)
 , CORRESP_ISDN nvarchar(100)
 , Duration int
 , [Time] nvarchar(20)
 , [Date] nvarchar(20)
 )
WITH
(
 LOCATION = '/cdrfiles'
 , DATA_SOURCE = DinesqlStorage
 , FILE_FORMAT = CDRFileFormat
);

Now I can simply query data in my Azure Storage using the table I created. Not only that I can join this dataset with my other tables and do more analysis. I will add more complex examples with next set of posts.


Wednesday, April 20, 2016

SQL Server 2016 - Connecting to Azure SQL Server using SSMS - RC3

Please note that there is an issue with connecting to Azure SQL Server using latest RC3 build. If you are experiencing the following error, note that it is an issue related your system.

An error occurred while creating a new firewall rule. (ConnectionDlg)...


There is no other options, if you need to connect with your Azure SQL Server, you need to open the portal and add your IP address using Firewall Settings.


Tuesday, April 19, 2016

SQL Server 2016 - System-Versioned Temporal Tables

Although the word "Versioning" was not used, we have been maintaining versions of records, or more precisely history of records using different techniques. SQL Server offers various ways of handling or maintaining history records, or changes that have been done to our records, using features like CDC, CT, Optimistic Isolation Levels. Most of these features do not support actual "versioning" of records but these features can be used for handling different scenarios.

SQL Server 2016 introduces a new feature called System-Versioned Temporal Tables that provides the entire history of our records related to the changes done. This feature records versions of the records based on  update and delete operations with the validity period of the version, allowing us to see not only the current record, state of the record during any given period, or allowing us to do point-in-time analysis. This feature is based on ANSI SQL 2011 standard but SQL Server current 2016 release does not support all the features that describes with it.

Let's write a code and see how it works. The following code creates a Database called Sales and a Table called Customer in Sales. Code creates Customer as a Temporal Table that requires few additional elements with the CREATE TABLE statement. It requires two datetime2 columns for maintaining the validity period. In addition to that, it needs SYSTEM_VERSIONING = ON and optionally a name for the history table. If the name is not mentioned, system will create one for us.

CREATE DATABASE Sales;
GO

USE Sales;
GO

CREATE TABLE dbo.Customer
(
 CustomerId int identity(1,1) Primary Key
 , FirstName varchar(100) null
 , LastName varchar(100) not null
 , CreditLimit money not null
 , ValidFrom datetime2(0) GENERATED ALWAYS AS ROW START -- datetime2(any precistion)
 , ValidTo datetime2(0) GENERATED ALWAYS AS ROW END -- datetime2(any precistion)
 , PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.CustomerHistory));

Let's insert three records and query both tables;

-- Inserting records
INSERT INTO dbo.Customer
 (FirstName, LastName, CreditLimit)
VALUES
 ('Dinesh', 'Priyankara', 10000)
 , ('Jack', 'Wilson', 15000)
 , ('John', 'Knight', 3500);

-- Checking records
SELECT * FROM dbo.Customer;
SELECT * FROM dbo.CustomerHistory;


As you see, history table does not show any records and last two datetime2 columns have been automatically filled. Now let's make some changes. Note that records have been inserted on 14/04/2016 and will be updating records on 16th, 17th and 19th.

-- Modifying a record - 2016-04-14 01:56:23
UPDATE dbo.Customer
 SET CreditLimit = 12000
WHERE CustomerId = 1;

-- Deleting a record - 2016-04-17 01:57:17
DELETE dbo.Customer
WHERE CustomerId = 3;

-- Modifying the same record - 2016-04-19 01:57:26
UPDATE dbo.Customer
 SET CreditLimit = 20000
WHERE CustomerId = 1;

-- Checking records
SELECT * FROM dbo.Customer;
SELECT * FROM dbo.CustomerHistory;


As you see, three records are in the history table; 2 records for the Customer Id 1 for two modifications made and 1 record for the Customer Id 2 for the deletion. This table exactly says how recorded are changed and when they have been changed, not only that it allows us to see the state of the record based on the validity period.

Let's see how we can retrieve records. There are multiple ways for querying records. We simply query the table without considering the history or we can go through the history using new clause given; FOR SYSTEM_TIME. Note the different between BETWEEN and FROM.

-- Retrieving the current record
SELECT *
FROM dbo.Customer
WHERE CustomerId = 1;

-- Retrieving for a date
-- Better include time too
SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME
 AS OF '2016-04-17 00:00:00'
WHERE CustomerId = 1;

SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME
 BETWEEN '2016-04-16 00:00:00' AND '2016-04-19 00:38:43'
WHERE CustomerId = 1
ORDER BY ValidFrom;

SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME
 FROM '2016-04-16 00:00:00' TO '2016-04-19 00:38:43'
WHERE CustomerId = 1
ORDER BY ValidFrom;


You can read more on this at msdn.

Here is the cleaning code if required;

ALTER TABLE dbo.Customer SET ( SYSTEM_VERSIONING = OFF );
DROP TABLE dbo.Customer;
DROP TABLE [dbo].[CustomerHistory];
GO
USE master;
GO
DROP DATABASE Sales;



Saturday, April 16, 2016

SQL Server 2016 New Three Releases: SQL Server RC 3, Management Studio, SQL Server Data Tools

Microsoft SQL Server team has announced three new releases related to SQL Server product suite;

  1. SQL Server Release Candidate 3
    This is the last Release Candidate and it is available for downloading now.
    Click here to download it and click here to see the release note.
  2. SQL Server Management Studio April Preview
    New Preview is available with bug fixes and some enhancements. Click here to read more and click here to download it.
  3. SQL Server Data Tools Preview Update
    An update has been released for SQL Server Data Tools specifically for SQL Server Release Candidate 3. Click here to download the Preview.

Friday, April 15, 2016

SQL Server 2016 - Parameter Sniffing can be controlled at database level

SQL Server uses Parameter Sniffing for creating the best plan for your query (or stored procedure) that sniffs values passed for parameters and generates the plan accordingly. In most of the cases, this is the best for most queries and stored procedures unless the values for parameters are vary with each execution. There was no easy way of enabling and disabling this with previous versions but trace flags 4136 allows you to disable Parameter Sniffing at server level. However the question is, should we disable it, or why should we disable it? 

Let's try to understand this before looking at the setting given with SQL Server 2016. Have a look on the following code;

USE AdventureWorks2016CTP3
GO
SELECT * FROM Sales.SalesOrderDetail 
    WHERE SalesOrderDetailID = 1
SELECT * FROM Sales.SalesOrderDetail 
    WHERE SalesOrderDetailID = 1000

SQL Server generates two different plans for these two queries because they are ad-hoc statements and two different values are passed to SalesOrderDetailID. The plans were generated based on the values passed, basically using Parameter Sniffing. That is why we see two different plans for the same query.


With stored procedures, this is bit different. The plan for the procedure is created with its initial execution, based on the values passed for parameters and it stores the plan in the cache. SQL Server does not sniff parameter values with subsequent executions for generating the plan again unless it cannot find the cached plan. This speeds up the execution because SQL Server does not need to spend time and use resources for generating the plan again and again. But, remember, it uses the same plan for all type of values passed, it may not be the optimal plan for all values.

CREATE PROC GetOrderDetails @Number int
AS
    SELECT * FROM Sales.SalesOrderDetail 
    WHERE SalesOrderDetailID <= @Number
GO

EXEC GetOrderDetails 1
EXEC GetOrderDetails 1000


Okay, now why we should disable this behavior? If we disable this, SQL Server uses average distribution statistics for generating the plan which is not best for almost all implementation. If you really need a plan that does not depend on values passed and you need sort of average type plan for all executions, then you will benefit disabling Parameter Sniffing.

Read more on this with my article: Do we need to recompile Stored Procedures when an index is added or parameter value is vary greatly?

As I mentioned above, the trace flag can be used for disabling but it applies to all databases. With SQL Server 2016, whether you need it or not, this setting is available at database level and can be enabled/disabled anytime.

This is how it display with database properties.


Let's disable this for AdventureWorks database and see how it works.


Let's clear the cache and execute the procedure again and see.

-- Clearing the cache
DBCC FREEPROCCACHE;

EXEC GetOrderDetails 1;
EXEC GetOrderDetails 1000;



As you see, a plan has been generated without considering values passed and it is being reused. If you think that your codes benefit with this, then this new option can be applied to your database, otherwise, keeping the default is the best.

Thursday, April 14, 2016

Installing SQL Server 2016 RC1 - Sharing issues

Just sharing my experiences with installing SQL Server 2016 RC1;

If you are planning to install SQL Server 2016 with Polybase support, make sure you have installed Oracle JRE 7 or highest, otherwise you will be facing this issue;


It can be downloaded at: http://www.oracle.com/technetwork/java/javase/downloads/index.html

In addition to that, you might get time-out  issues during installation, in my case, though it occurred several times, retry worked me and installation was successful.



Wednesday, April 13, 2016

I renamed my computer, Will my SQL Server instance work? Should I do something on it?

This is something we always worry, what will happen if the name of the host machine is renamed? After renaming, can I connect with my SQL Server instance using new name set with the host, or I have to re-install SQL Server, or I have to do some setting changes? Let me share what I just did.

My computer name was initially set as DESKTOP-QOKBL3L because I forgot to set the name properly when installing the OS. And not only that, I installed an instance of SQL Server 2012 as a named instance too. After realizing that name of the computer is not as I want, I just changed it as DINESH-HM-LAP01. Now can I connect with my SQL Server instance?

I just tried with the old name as I used to;


As you see, it did not allow me to connect. Then I tried with my new computer name;



And I was able to connect with my instance, yes without doing anything. But, does it mean that I do not want to anything?

Let's check the server name using TSQL;


As you see, not all metadata is updated, means it needs some updates. For updating metadata, we need to drop the server and add again. After that, a service restart is required, once done, if the above query is run again, both will show the new name instead of old name.

sp_dropserver 'DESKTOP-QOKBL3L\SQL2012';
GO
sp_addserver 'DINESH-HM-LAP01\SQL2012', local;
GO

For more info on this, read this article in MSDN: https://msdn.microsoft.com/en-us/library/ms143799.aspx

Wednesday, April 6, 2016

Point-In-Time restoring with Azure SQL Database

We know that Point-In-Time restoration is possible with SQL Server and it is one of useful ways of recovering data in a disaster-recovery situation. This requires backup taken that contains data that needs to be recovered. If no backup is available, we can still achieve this by taking a backup of the database, may be tail-log backup.

However, Azure SQL Database works differently. For Point-In-Time restoration, you do not need backups if the recovery related to certain time period. Azure SQL Database allows you to recover your database (or restore) to any restore point within 7 days if the tier is Basic, 14 days for Standard, 35 days for Premium. This is a very simple process and it restores the database with a different name like Database Name + TimeStamp.

Here are the steps for Point-In-Time restoration;

First login to your Azure Portal and get SQL Servers blade opened. Select the server and get the preferred database server blade opened.


The select the database from Databases section and get Properties of database opened. Click on Restore button for opening the blade for resorting.


You can see oldest restore point based on your tier. Select the Point-In-Time as you want, change other properties if need (example, different server) and click OK to get it restored.

Once restored, it can be accessed via Management Studio just like you access the other database. One thing you need to remember is, you cannot see both databases by connecting to the server, hence when connecting, make sure you mention the database name for connecting to newly created database.


Sunday, April 3, 2016

Troubleshooting agent jobs

Administrators always try to automate routine tasks, making sure that all required operations are done on time without failures. The main component used for automation is SQL Server Agent, which is used for creating jobs. If you have noticed that a scheduled job has not run or all jobs are not running, how do you troubleshoot? How do you start troubleshooting?



Here are some guidelines for troubleshooting based on a discussion I had;
  • First thing needs to be checked is whether SQL Server Agent service is running or not. You need to make sure that Startup type is set to automatic, this makes sure that service is started when the server is restarted. If, for some reasons, service is not running, and you cannot even manually start it, check following;
    • Check and see whether account assigned for the Agent is valid and no issues with the password. The account assigned may have been expired, changed or disabled. Check the system log for more details.
    • Check the msdb database. If it is corrupted or offline, Agent will not be started.
  • Check the job history. Check whether last run was successful. There can be some issues with business logic implemented.
  • Check whether the job is enabled. Someone might have disabled it.
  • Check whether the schedule set is either expired or disabled.
  • Check and see whether proxy accounts are working properly if you have used. Check credentials used for proxy accounts.
  • Check dependencies. There can be steps in the job that run without any issues but some. Check whether all required items, such as files, folders and all required permissions for accessing are available.

Saturday, April 2, 2016

SQL Server Developer Edition is free for us

Thought it was part of April fool's joke but it looks like it is not. The Developer Edition which is the exact copy of Enterprise Edition is freely available for developers, it is available for Visual Studio Dev Essentials members. Here is the link for that: https://www.visualstudio.com/products/visual-studio-dev-essentials-vs?wt.mc_id=WW_CE_BD_OO_SCL_TW_DESQLBenefitAnnouncement_SQL.


This is still for SQL Server 2014, once SQL Server 2016 is release, Microsoft will make 2016 available too.


Friday, April 1, 2016

SQL Server 2016 Release Candidate 2 available for downloading

Have not you tried this yet? Here are some reasons for downloading and trying this out;

  • In-memory performance and built-in operational analytics for faster transaction queries and deeper insights.
  • Protection and security on data at rest and in motion using Always Encrypted technology.
  • Rich-Enterprise ready high availability and disaster recovery solutions with AlwaysOn technology.
  • Advance analytics capabilities and rich visualization supportability on any device.
  • And much more....
Read more on this: