Almost all engineers have used the AVG aggregate function, but have you ever checked whether the result is correct? This was discussed while I was conducting my new SQL class, here is an example for explaining it, where you might make a mistake;
1: CREATE TABLE dbo.EmployeeCommision
2: (
3: EmployeeId int PRIMARY KEY,
4: CommisionGiven money NULL
5: );
6:
7: -- Inserting five employees' commison records
8: -- In real implementation, two tables
9: -- will maintain for holding employees records
10: -- and commison records
11: INSERT INTO dbo.EmployeeCommison
12: (EmployeeId, CommisionGiven)
13: VALUES
14: (1, 50), (2, 50), (3, NULL), (4, 50), (5,50)
15:
16: -- Calculating the average commision given
17: -- to an employee, You may use LEFT OUTER JOIN
18: -- if two tables are maintained
19: SELECT AVG(CommisonGiven)
20: FROM dbo.EmployeeCommision
21: -- Result of above is 50, which is wrong in
22: -- this case to correct the issue, use
23: -- ISNULL function with AVG function
24:
25: SELECT AVG(ISNULL(CommisonGiven, 0))
26: FROM dbo.EmployeeCommision
27: -- Now the result is 40, which is correct.
The reason for this is because, all aggregate functions ignore NULL values. Only exception is COUNT(*) that counts NULL values.
When you execute a query like this, you will see a warning message:
ReplyDeleteWarning: Null value is eliminated by an aggregate or other SET operation.
This message could be removed by executing SET ANSI_WARNINGS OFF statement.
By Default, if you have a null value it is eliminated from aggregate functions except COUNT(*)
The reason for this lies with the interpretation of NULL. It could be Not applicable. For example, if you have spouse's age in a column and want to find the average, you should eliminate those who don't have one.
Of course, NULL is misused these days. That's one of the reason MS asks people to minimize the use of NULL.
Thanks for the comment Preethi, for some reason, I have missed the explanation.... updated the post.
ReplyDelete