Friday, November 27, 2015

Microsoft SQL Server Code Names for versions and features

The Microsoft SQL Server team has used various code names for SQL Server versions and some of their features; here are some of them found on the web. I tried to match the best image I could find on the web for each, but I am not entirely sure every picture is correct, so please let me know if I have used a wrong one.


Version/Feature                           Code Name
SQL Server 6.0 Enterprise Manager         Starfighter
SQL Server 6.0 Agent                      Starfighter II
SQL Server 6.5
SQL Server 7.0                            Sphinx
SQL Server 7.0 OLAP Services              Plato
SQL Server 2000 (32-bit)                  Shiloh
SQL Server 2000 (64-bit)                  Liberty
SQL Server Reporting Services 2008        Rosetta
SQL Server 2005                           Yukon
SQL Server Analysis Services 2005         Picasso
Database Design and Query Tools           DaVinci
SQL Server 2005 Mobile Edition            Laguna
SQL Server 2008                           Katmai
SQL Report Designer 2.0                   Blue
SQL Server 2008 R2                        Kilimanjaro
SQL Server 2012                           Denali
SQL Server Developer (Data) Tools         Juneau
SQL Server Columnstore Index              Apollo
SQL Server 2014 (In-memory OLTP)          Hekaton

Wednesday, November 25, 2015

How can I see the pages used by my table?

It is not an everyday need, but there are instances when you need to see the data pages SQL Server uses for holding the data in your tables. There are a few ways of seeing them; the easiest is to use two undocumented functions called sys.fn_PhysLocFormatter and sys.fn_PhysLocCracker. These two were initially released with SQL Server 2008 and are still available. I am not sure why they are still marked as undocumented, but the information returned by them is really useful.

Here is some test code for it;

USE AdventureWorksDW2014;
GO

-- Checking with fn_PhysLocFormatter function.
-- This shows file id, page id and slot id together.
SELECT sys.fn_PhysLocFormatter (%%physloc%%) AS RID, * FROM dbo.FactInternetSales;


-- If you need file, page and slot separately, 
-- fn_PhysLocCracker function can be used.
SELECT sys.fn_PhysLocFormatter (%%physloc%%) AS RID, RID_c.file_id, RID_c.page_id, RID_c.slot_id, * 
FROM dbo.FactInternetSales
 CROSS APPLY sys.fn_PhysLocCracker (%%physloc%%) AS RID_c
ORDER BY RID_c.file_id, RID_c.page_id, RID_c.slot_id;
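
Once the file id and page id of an interesting row are known, the page itself can be examined with DBCC PAGE, which is also undocumented. Here is a minimal sketch; file id 1 and page id 20392 are only placeholders, replace them with values returned by the queries above.

-- Send DBCC PAGE output to the client instead of the error log.
DBCC TRACEON (3604);
-- Dump the page: database, file id, page id, print option 3 (rows with details).
DBCC PAGE ('AdventureWorksDW2014', 1, 20392, 3);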


Tuesday, November 24, 2015

Changing from FULL to SIMPLE recovery model reduces the log file size?

Concerns about a large log (ldf) file still appear in many forums; it is very common to see questions like "my database log file has grown to several GBs, why, and how can I reduce it? Can I reduce it just by changing the recovery model from FULL to SIMPLE?"

The first thing you need to understand is how log truncation works with the FULL and SIMPLE recovery models. The FULL recovery model does not truncate the log until a transaction log backup is performed, while the SIMPLE recovery model automatically truncates the log, making the space available for subsequent transactions rather than expanding the file. There are a few exceptions with FULL; read my article Log is getting automatically truncated with FULL Recovery Model for more info.

Now, if your database is in the FULL recovery model and you see a large log file, can you reduce it just by changing the recovery model? The recommendation is not to change it, as you will lose the ability to take log backups (and with it, point-in-time recovery); but if you do not need that, you can change it. However, changing the model does NOT reduce the size of the log file. Once changed, the log will be truncated, but the free space will not be released back to the OS unless you perform a SHRINKFILE operation on the log file. So the conclusion is: changing the recovery model from FULL to SIMPLE truncates the log but does not reduce its size; shrinking the file is required for reducing the file size.

Here is a small test for verification.

This code creates a database and a table; the default recovery model for the database is FULL. It then takes a full database backup once, making sure the backup chain is established.

CREATE DATABASE TestDatabase;
GO

USE TestDatabase;
GO

CREATE TABLE dbo.TestTable
(
 Id int identity(1,1) primary key
 , Value1 char(4000) not null
 , Value2 char(4000) not null
);

-- take a backup
BACKUP DATABASE [TestDatabase] TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\Backup\TestDatabase.bak' 
WITH NOFORMAT, NOINIT,  NAME = N'TestDatabase-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10
GO

Here is the current file size.
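
If you want to check the file sizes with code rather than the Files page, a simple query over sys.database_files should work (size is stored in 8KB pages, hence the calculation).

USE TestDatabase;
GO
-- Current size of the data and log files in MB.
SELECT name, type_desc, size * 8 / 1024 AS SizeMB
FROM sys.database_files;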


This code adds records, growing both the data and log files.

INSERT INTO dbo.TestTable
 (Value1, Value2) VALUES (REPLICATE('a', 4000), REPLICATE('a', 4000));
GO 1000000

As you can see, the log file has grown and is not getting truncated or shrunk automatically.
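
If you want to see how much of the log is actually in use at this point, DBCC SQLPERF can be used; this is just an optional check.

-- Shows the log size and the percentage of log space used for all databases.
DBCC SQLPERF (LOGSPACE);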


Let's change the recovery model and see.

USE master;
GO
ALTER DATABASE TestDatabase SET RECOVERY SIMPLE WITH NO_WAIT;
GO

If you check the size again, you will see the same size, as changing the model does not release the free space back to the OS. Let's shrink the file and see.

USE TestDatabase;
GO
DBCC SHRINKFILE (TestDatabase_log, TRUNCATEONLY);

All as we wanted.


Monday, November 23, 2015

Can I use Standard Edition of SQL Server for implementing my Business Intelligence solution?

I always get this question from my clients, companies I consult for, and people who work on BI implementations. Most companies initially purchase the Standard Edition of SQL Server because of the cost and gradually upgrade to Enterprise. Can we really implement a BI solution with Standard Edition?


Microsoft offers multiple editions of SQL Server: mainly two premium editions and two core editions. Microsoft Analytics Platform System (Parallel Data Warehouse + HDInsight) and Enterprise are the premium editions, and Business Intelligence and Standard are the core editions. If you have purchased Standard Edition, you can still implement a BI solution, but with many limitations.

Generally, Standard Edition is not recommended for large BI implementations. For managing large volumes, you need Enterprise features like partitioning, data compression and columnstore indexes, which are not available with Standard Edition. Therefore, if the implementation is either small or mid-scale, Standard Edition can be used.

There are fewer features in Standard Edition Integration Services too. Some of the main missing features are persistent lookups, the data mining query transformation, and fuzzy lookup and fuzzy grouping.

It is very important to consider Analysis Services features. As you know, Analysis Services supports two types of models: Multidimensional and Tabular. Standard Edition does not support the Tabular model. This forces you to go with the Multidimensional model, though it is a bit more complex and time-consuming than Tabular. In addition, scalable shared databases, synchronization, perspectives, proactive caching and write-back are not supported with Standard Edition.

Now you know whether you can implement your BI solution with Standard Edition or not. You may consider Business Intelligence Edition if you need more features, specifically on the Analysis Services end.

Sunday, November 22, 2015

SQL Server Brain Basher of the Week #035 - Database size

Microsoft SQL Server supports creating small databases as well as very large databases. This week's question is based on database sizes; here it is:

What is the minimum relational database size and maximum relational database size that can be set when creating a database?

When you create a database, the system takes a copy of the model database as the template for it. The default size of the data file set for the model database is 5MB. This stops you from reducing the size of the data file below that when creating your database. If you try to reduce it to 1MB, this is what you get:


You can reduce the model database data file to a minimum of 3MB. Once that is set, you can create your database with a data file sized at 3MB.
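
Here is a rough sketch of how this can be done with code; the logical file name modeldev assumes a default installation, and the file path is only a placeholder.

-- Reducing the model database data file to 3MB.
USE model;
GO
DBCC SHRINKFILE (modeldev, 3); -- target size in MB
GO

-- Now a database can be created with a 3MB data file.
CREATE DATABASE SmallDatabase
ON PRIMARY (NAME = SmallDatabase_data, FILENAME = 'C:\Temp\SmallDatabase.mdf', SIZE = 3MB);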


Based on the above tests, we know that the smallest size that can be set for our data files is 3MB (there may be some other way to reduce this further, but there is rarely a need for a database that small unless there is a specific and special requirement).

The largest database size supported by SQL Server is 524 PB. Almost all editions support creating databases up to this size; the only exception is Express Edition, which supports databases only up to 10GB.


Saturday, November 21, 2015

SQL Server Brain Basher of the Week #034 - Analysis Services Security

Security is one of the most important areas that needs to be addressed in any solution, and the same goes for Business Intelligence solutions implemented with Microsoft Analysis Services. Typically, database management systems provide multiple server roles for managing permissions and allow us to extend them by creating custom server roles or database roles. This week's question is based on Analysis Services server roles:

How many server roles exist in Analysis Services, and what are they?

You do not need to think about this much: Analysis Services offers only one server-level role, called Administrators. It cannot be modified and we cannot create any additional server roles, but we can add members to it, giving admin rights to anyone who requires them. However, the system allows us to create additional database-level roles, which are limited to specific databases.

You can access the Administrators role by opening the properties of the instance and selecting the Security page.


Database roles can be created by accessing the Roles node under the database.



Friday, November 20, 2015

Storing datetime with time zone offset and retrieve with required time zone offset

Nothing new, but posting this as it is important when working with multiple time zone offsets. If you have to store requests coming from all around the world with their local time, the datetimeoffset type has to be used for the column, and when converting to a different time zone offset, the SWITCHOFFSET function has to be used. Here is an example:

USE tempdb;
GO

-- creating table for holding date and time with time zone offset
CREATE TABLE dbo.TestTable
(
 Id int primary key,
 DateEntered datetimeoffset(0) not null index ix_TestTable clustered
);

-- inserting values
INSERT INTO dbo.TestTable
VALUES
(1, '2015-11-17T02:10:33.430+10:30') -- receiving from Sydney
, (2, '2015-11-17T05:30:21.110+05:30') -- receiving from Sri Lanka
, (3, '2015-11-17T14:24:22.660+00:00') -- receiving from USA

-- Retrieving all dates for Sri Lanka time zone offset
SELECT Id, CONVERT(datetime, SWITCHOFFSET(DateEntered, '+05:30')) DateEntered
FROM dbo.TestTable;
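
If rows are entered by the server itself rather than received from clients, SYSDATETIMEOFFSET() can be used for capturing the current date and time together with the server's time zone offset; here is a small sketch using the same table.

-- Inserting the current server date and time with its offset.
INSERT INTO dbo.TestTable (Id, DateEntered)
VALUES (4, SYSDATETIMEOFFSET());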

Thursday, November 19, 2015

SQL Server Management Studio - October 2015 Release

Have you tried out the latest SQL Server Management Studio? It is available as the October 2015 release. It can be used with SQL Server 2016 through SQL Server 2005 and supports side-by-side installation.

You can download it from here.

Here are some of the features/functionalities added with this release (from MSDN):
  • New modernized lightweight web installer
  • New Always Encrypted column encryption wizard 
  • New Column Master Key (CMK) rotation dialog
  • New stretch database monitor 

Wednesday, November 18, 2015

Power BI: Showing YTD values

Not all time intelligence functions are working yet, but some of them can be used with DAX. If you have a requirement to show YTD, QTD and MTD values and they are not available in your multidimensional or tabular model, DAX can be used to create them, just as we do in PowerPivot. However, if you try to use functions like PREVIOUSYEAR or PREVIOUSDAY, it might not work (it did not work for me), as not all time intelligence functionality has been added to Power BI yet.

Here is the way to add a YTD measure to a model created in Power BI from data extracted from SQL Server. Once the data is loaded, go to the Data tab (green box) and click the New Measure button (red box) to create a new measure. Then use the following expression for calculating YTD.

SalesAmountYTD = CALCULATE(SUM(FactInternetSales[SalesAmount]), DATESYTD(DimDate[FullDateAlternateKey]), ALL(DimDate))


Once added, it can be used with the desired graphical element;



Tuesday, November 17, 2015

Can we use user-defined hierarchies in the model with Power BI?

Power BI is becoming richer and richer in functionality and will surely be the best among BI client tools. Since it is still being completed, some important functionality is missing, and sometimes that makes it uncomfortable to use. One of the key objects we build in models is the hierarchy, which is very useful when navigating through a set of levels defined using dimension attributes. This navigation is called drilling down. The most common and famous example is a hierarchy on the date dimension that starts from Year, then Semester, Quarter, Month and Date. Power BI does not recognize these user-defined hierarchies defined in models, hence drilling down was initially impossible. I believe this is being addressed and it will be available in future releases; until then, a workaround is given below.

If you have installed the AdventureWorks Tabular Model, it has some hierarchies defined. The following shows a hierarchy defined on the Date dimension.



If you connect to this source using a live connection;



And expand the Date dimension, you will not see the hierarchy.



However, the latest version of Power BI allows you to configure a hierarchy within the chart. All you have to do is drag and drop attributes into the Axis box, forming a hierarchy. The following shows how it looks when added.




Now drilling down is possible with the chart. It is not enabled by default; the icon at the top-right has to be used to enable it. Once enabled, you can click on a bar, for example the bar that represents 2013, and go into the semesters related to it. For drilling up, the icon at the top-left has to be used.

This works fine without any issues, and it is the way to implement drilling down until we get a version that supports user-defined hierarchies in models.




Monday, November 16, 2015

Understanding Reporting and Analysis in Business Intelligence solutions

Companies implement business intelligence solutions for tracking and improving business performance through reporting and analysis, which is the ultimate goal of a business intelligence solution. Some solutions focus completely on analysis, some focus only on reporting, but most focus on both, making sure the delivered solution addresses requirements coming from all levels of the company. What is the difference between Reporting and Analysis in BI?

Reporting


This is the main communication channel used for getting information out of a BI solution. Formal or operational reports are not recommended through a BI solution, but production reports, interactive reports (to some extent) and dashboard reports are common. Most reports show summarized values and are printable. It is not uncommon to see tabular reports, but reports with graphical elements are the most common. When planning, the audience needs to be understood, along with a good knowledge of how reports need to be viewed and delivered. There are mainly two types of reports that can be delivered through a BI solution;
  • IT-provided reports
    Reports that are developed and managed by developers are called IT-provided reports. These reports are developed based on requirements given by business users. Business users have no facility to modify the reports, but the content can be changed using filters or interactive features such as expanding/collapsing enabled in the report.
  • Self-service reports
    Self-service reports are authored by business users without support from IT. Special tools that are user-friendly for business users have to be provided, empowering them. The tools should support authoring simple reports such as production reports as well as complex reports such as dashboards. Though IT has no involvement in the authoring, the repository that holds the required data expressed in business terms has to be supplied by IT (for example, a data warehouse or models), and sometimes IT manages the developed reports too.
Analysis


Analysis is the detailed examination or interpretation of data delivered by the BI solution. Advanced business users explore data in the model and perform specific examinations to gain insight or find hidden patterns; for example, these business users will use tools like Excel, Power BI or Tableau for creating their own models or doing their own experiments. General users do the same, not with specialized tools but with reports and dashboards. Generally, the following analytical requirements are addressed by BI solutions;
  • Interactive analysis
    This type of analysis requires key BI functionality such as slicing and dicing for analyzing data by various segments or dimensions. The results of these analyses can be published as reports or used for improving business processes such as promotions.
  • Dashboard and scorecards
    A combination of various KPIs and summarized information is shown with graphical widgets in dashboards and scorecards, allowing users to analyze data with functionality like drilling down and drilling through.
  • Data mining (machine learning)
    This allows users to run various experiments on data using machine learning algorithms to understand trends and patterns related to business processes and use them for improving the business.


Sunday, November 15, 2015

Trying to connect on-prem Analysis Services using Power BI Analysis Services Connector?

Although Power BI Pro allows us to connect to on-prem Analysis Services Tabular instances for consuming live data in dashboards and reports, there are limitations. If you try to configure the Power BI Analysis Services Connector against an on-prem Tabular instance using an account such as an onmicrosoft.com account without an Active Directory domain, or using an Active Directory account that is not synced with Azure Active Directory, you will not be able to complete the configuration. Here is what I get when trying to configure the Power BI Analysis Services Connector with the Tabular instance installed on my standalone laptop.


All Prerequisites are listed at https://powerbi.microsoft.com/en-us/documentation/powerbi-analysis-services-connector/.

Here are important things to be remembered;
  • Only SQL Server Analysis Services Tabular models are supported for Power BI.
  • The connector has to continue running and the computer where it is installed has to remain on.
  • The Power BI Analysis Services Connector and the Data Management Gateway cannot co-exist on the same computer.
  • The Analysis Services server must be domain joined.
  • If you use a .onmicrosoft.com email address, you'll need to sync your Active Directory to Azure Active Directory using Azure Active Directory Sync (DirSync).

Wednesday, October 28, 2015

Loading data from Microsoft SQL Server to DocumentDB in Azure

In my previous post (How to create a DocumentDB in Azure), I mentioned that there are multiple ways of loading documents into DocumentDB in Azure. This is the continuation of it; let's see how we can load data from SQL Server into DocumentDB using the Data Migration Tool. The Data Migration Tool is a free, open-source tool that can be used for loading data into DocumentDB from many different sources. It converts the loaded data into hierarchical JSON documents and supports loading data from Microsoft SQL Server, MongoDB, JSON, CSV, Azure Table Storage and many more.

The Data Migration Tool can be downloaded from the Microsoft Download Center. However, there is an easier way: a link to the page is available in the DocumentDB account blade;


Go to the page and download the compressed file. Once the files are extracted, you will see two executable files: dt.exe and dtui.exe.


Click dtui.exe to open the tool. As usual, a welcome screen appears, followed by the screen for selecting the source. Since we are trying to load data from SQL Server, select SQL under Import from. You need to manually add the connection string (you can get it generated by creating a UDL file); make sure it is added without the provider.

Let's try to load data from the AdventureWorksDW2014 database. This is the query for loading customers with their locations.

USE AdventureWorksDW2014;
GO

SELECT
 c.CustomerKey AS [key]
 , c.FirstName + ' ' + c.LastName AS [Name]
 , c.AddressLine1 AS [Location.Street]
 , g.City AS [Location.City]
 , g.StateProvinceName AS [Location.Province]

FROM dbo.DimCustomer c
 INNER JOIN dbo.DimGeography g
  ON c.GeographyKey = g.GeographyKey;


Examine the query. Note how the aliases have been set, specifically for the columns related to the address. If you need a hierarchy inside the document, this is the way of instructing the tool to create it. As per this query, a document will contain properties like key, Name and Location, and Location will have three more properties: Street, City and Province.
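
For example, with the aliases above, a single customer row would end up as a document shaped roughly like this (the values are illustrative only; DocumentDB also adds its own system properties such as id):

{
  "key": 11000,
  "Name": "Jon Yang",
  "Location": {
    "Street": "3761 N. 14th St",
    "City": "Rockhampton",
    "Province": "Queensland"
  }
}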

Once the query is in place, enter a dot (".") as the Nesting Separator to instruct the tool that the dot is used for separating nested properties.


Export to
The next screen is for setting the target. There are multiple options for Export to; since this loads only 18,484 documents, we can use DocumentDB - Bulk import rather than selecting other items such as DocumentDB - Sequential record import or JSON file. Bulk import creates a stored procedure in the collection and uses it for loading documents. It is recommended to use the second option, DocumentDB - Sequential record import, for a large set of documents because it is faster than bulk load. The third option, JSON file, does not load documents into DocumentDB but creates JSON files.

Connection string
Setting the connection string is easy. All we have to do is copy the primary connection string from Keys in the DocumentDB account blade and add the database name to it. Note that the database name has to be added manually.

Collection
We can load documents into an existing collection or create a new one. If the collection named on the screen does not exist, the tool will create one for us with the selected pricing tier.

Id field
If you leave the Id field blank, the tool will create one for us; otherwise we can instruct it to use one of our columns. As you can see, I instruct it to use the key column as the Id of the documents.

Indexing policy
We can add a policy that describes how we search documents and what type of index we prefer. We do not need to write it by hand; we can select the required policy by right-clicking in the box to get the menu and choosing one. There are three options;
  1. Default - Efficient indexing only for numeric values, supporting equality and range checks and sorting; string searches cause a scan, and sorting on strings is not supported.
  2. Hash - Supports only scans for both numeric and string values, with no support for sorting.
  3. Range - Efficient indexing for both numeric and string values, including sorting, but may use extra space and resources.





Now it is ready for loading; go through the summary and complete it.



Once the documents are loaded, we can check in the collection whether they were loaded as instructed.


Tuesday, October 27, 2015

How to create a DocumentDB in Azure

DocumentDB, Microsoft's implementation of a NoSQL database, is a cloud-based, fully managed, massively scalable service for managing semi-structured data in Azure. Just like other NoSQL products, it supports schema-free JSON documents, and unlike many other NoSQL platforms, it automatically indexes all properties in a document as soon as it is added. Properties in added documents are easily queryable using SQL (not exactly the SQL of relational databases), and any property in the hierarchy is searchable.

This post walks through creating a DocumentDB in Azure with a simple set of steps. If you are not familiar with NoSQL, read this post: What is NoSQL Document Database? What is Azure DocumentDB?

Let's create a DocumentDB. Log in to the new Azure portal (https://portal.azure.com/) using your Azure account. Then browse for DocumentDB Accounts and click on it to open the blade related to DocumentDB creation.


Once it is opened, the first thing you need to do is create an account. Click Add and enter a globally unique name for the DocumentDB account.


The DocumentDB account is created and is available as <account name>.documents.azure.com. It still does not have a database (see image); to create a database under the new account, click the Add Database button and give a name for your database. As you can see, I have named it db01.


Once the database is created, you should see it in the Databases grid in the DocumentDB account blade. Every component in DocumentDB has a Resource ID, and every DocumentDB database needs at least one collection.

What is a Collection?
A collection is 10GB of storage that can be created inside the database. It is not only a unit of storage; it also determines how much you pay Microsoft. DocumentDB pricing is based on collections, and the price changes based on the pricing tier assigned to the collection. Billing is hourly, and scaling up and down is done by adding or removing collections.

You can open the database blade by clicking the name of the database you created and add a collection by clicking the Add Collection button. The Add Collection blade allows you to enter the name of the collection, select the pricing tier, and select the indexing policy.


What is Request Units per Second?
As you can see, a collection can be assigned one pricing tier out of three: S1, S2 or S3. The storage offered with each tier is the same, but the RU allocation differs. RU, Request Units per second, simply means how many requests can be performed against the collection per second. They can be read requests or write requests, but the count is not exactly the number of requests made; it is the throughput of the request, considering the usage of resources such as CPU, memory and IO. As I have read, a read request for a document smaller than 1KB generally requires one RU, but a write request for a document of the same size requires more than one RU, as writing costs more than reading. A complex request that needs to update documents with many indexes may need more RUs. How do you determine the required RUs? You can do so by looking at the dashboards given in the blade, which show all requests made and the RUs used for them.

Here is an image that explains the model:


The image was taken from DocumentDB hierarchical resource model and concepts; read it for more info.

You can start with whichever tier you want for the collection. Once the tier is selected, the indexing policy has to be selected; there are two policies, Default and Range. Default is good for numeric properties, and Range is best for both string and numeric properties, but its storage cost is higher. For this, let's create a collection with S1 and the Range policy.

Now we have a database with a collection. All collections appear in the database blade with their Resource IDs and can be opened by clicking on them. There are many ways of connecting to the database, adding documents and querying documents;

  1. Using the portal itself - manually add documents one by one or upload up to 100 documents at a time.
  2. Using the DocumentDB Data Migration Tool.
  3. Programmatically, using the REST API (difficult and complex) or an SDK (such as the .NET SDK; flexible and can do almost everything).
  4. Using stored procedures and triggers added to the collection.
Let's use the portal interface for adding and querying documents for testing. Click the Create Document button in the collection blade (in my case, the collection is Customer). This opens another blade that can be used for writing a document. As you can see, it adds the id property automatically, which is required. If you do not add an id to your document, one will be created for it. Here is my document;


Once it is added, it can be viewed using the Document Explorer in the database blade as below.
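
Querying can be tested from the portal as well, using the Query Explorer in the same blade. A simple DocumentDB SQL query looks roughly like the following; Name and Location.City are hypothetical properties here, so adjust them to whatever your documents actually contain.

SELECT c.Name, c.Location.City
FROM c
WHERE c.Location.City = "Melbourne"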


Let's see how this can be done in other ways in the next post related to DocumentDB.