USE AdventureWorks2016CTP3; GO SET ROWCOUNT 100; SELECT TOP (10) * FROM Sales.SalesOrderHeader;
USE AdventureWorks2016CTP3; GO SET ROWCOUNT 10; SELECT TOP (100) * FROM Sales.SalesOrderHeader;
USE AdventureWorks2016CTP3; GO SET ROWCOUNT 100; SELECT TOP (10) * FROM Sales.SalesOrderHeader;
USE AdventureWorks2016CTP3; GO SET ROWCOUNT 10; SELECT TOP (100) * FROM Sales.SalesOrderHeader;
USE AdventureWorks2014; GO SELECT TOP (5) Name FROM Production.Product;
SELECT {} ON 0 , {TopCount([Product].[Product Categories].[Product].Members, 3) } ON 1 FROM [Adventure Works];
SELECT {} ON 0 , {Head([Product].[Product Categories].[Product].Members, 3) } ON 1 FROM [Adventure Works];
Think about a scenario where you have to show your top 10 customers based on their purchases. If the record set is as below;
and if you show first 10 records using TOP 10, have you shown all best customers? See the 11th record, it is same as 10th order, hence he should be added to top-10 list. In this case, top-10 list becomes top-11 list. But we cannot change the query as TOP 11 because it is not always guaranteed that 11th record is same as 10th record. In that case, what would be the way?
The solution is, TOP 10 WITH TIES. It adds additional rows to the resultset if any tied rows for 10th are there. See the code below;
Note that WITH TIES can be only used with SELECT statement and ORDER BY must be used too.
Result set can be filtered in many ways. Most common ways is, using the 3rd primary property of the SQL SELECT statement which is the WHERE condition which filters specific set of rows from the result set.
This post is not about WHERE condition, it is all about TOP and TABLESAMPLE.
The TOP operator gives you top n number of records as you need. The “top n” can be changed by changing the order of the result set by using ORDER BY clause.
The TABLESAMPLE operator allows you to randomly pick data from the table. You can instruct to SQL Server to return specific number of records or percent of rows.
Where we can use this? I think that this is commonly used for getting the average from a large result set. If you have a table with millions of records and average has to be calculated on one of the columns, it is worthwhile to use TABLESAMPLE (10 PERCENT) rather than using all records.
1: -- use all rows in the table for calculating
2: SELECT AVG(Freight)
3: FROM Sales.SalesOrderHeader
4: -- use approximately 10% of rows
5: SELECT AVG(Freight)
6: FROM Sales.SalesOrderHeader
7: TABLESAMPLE (10 PERCENT)
REPEATABLE OPTION
If you execute the above two codes again and again, the average returns from first statement is always same but second statement. This is because of the way it picks records. If you need the same average for all the execution of second statement, REPEATABLE option should be used. The REPEATABLE option has to be used with a repeat_seed, and as long as the repeat_seed is same and no records have been change, same average is returned.
1: SELECT AVG(Freight)
2: FROM Sales.SalesOrderHeader
3: TABLESAMPLE (10 PERCENT) REPEATABLE (1)
What is SYSTEM option?
The SYSTEM option is optional, but it is used by default though you do not use it in your query.
1: SELECT AVG(Freight)
2: FROM Sales.SalesOrderHeader
3: TABLESAMPLE SYSTEM (10 PERCENT)
The SYSTEM option returns approximate percentage of rows and generates a random value for each data page. SQL Server decides which data pages to be included for the sample based on the random value generated and the percentage specified in the query. If a page is decided to to be included, all rows in the page is included for sampling, else the page is excluded. If you use set the TABLESAMPLE with number of rows, instead of a percentage, number of rows will be converted to a percent and process same way.