Thursday, March 22, 2007

Answer lies beneath the way of Query Processing

The Question I asked in my previous post is why we should not use column alias used in SELECT list in Group BY Clause and we used that in ORDER By Clause.

The answer lies in the way how a SELECT Statement is processed.

SQL is differ from other programming language in an aspect how it is processing the Code. Normally the a code in a programming language is processed top down method, meaning, first written statement will be processed first. But SQL wont.

A 'SELECT' Statement is processed by SQL Server as follows.

FROM Clause is processed first.
On condition of Join Clause is Processed Second.
Join condition is Processed third.
Where is Processed fourth.
Group By is Processed fifth.
With {Cube} | {Rollup} is processed sixth.
Having is Processed seventh.
SELECT is processed eighth.
Distinct is processed 9th
ORDER By is processed 10th.
TOP Clause is processed 11th.

Suppose if you omitted any optional clauses that steps will be omitted in the processing.

Now look at the query.

Group By is Processed before the Select so we can not use the alias created in the SELECT in Group By.

And Order By is Processed after SELECT so we can use that in ORDER BY Clause.

For further reading about how queries are processed read Chapter 1 and 2 of the book.

Inside Microsoft® SQL Server™ 2005 T-SQL Querying
By Itzik Ben-Gan - (Solid Quality Learning), Lubor Kollar, Dejan Sarka

Thursday, March 15, 2007

A Little Question...!

In my last post, I have explained how CTE can be used in SQL Server 2005. Just take one query used in that post leads to the another topic.

Select EmployeeId,Year(OrderDate) AS OrderYear,
Count(*) as [Sales]
From dbo.Orders
Where EmployeeId=1
Group by EmployeeId,Year(OrderDate)
Order By OrderYear


In the above query the
column Year(OrderDate) is having a alias OrderYear. But this alias is not used in the Group By Clause why? Again the alias is used in Order By Clause. Why don't we use the column alias in the select list can not be used in Group By as in Order By?

Post your comments...!

Saturday, March 3, 2007

The Virtue of the Virtual Views


The following SQL Puzzle is found in the book APRESS: Advanced Transact –SQL for SQL Server 2000 in page no 71 of Chapter 2.

You are provided with these tables: Orders, OrderDetails, and OrderPayments. Each order in the Orders table can have one or more order parts in the OrderDetails table and zero or more payments in the OrderPayments table. Your task is to write a query that produces order information along with the sum of its order parts and order payments.

The Orders table will look like this:

Ordered custid odate

1 1001 2001−01−18 00:00:00.000

2 1002 2001−02−12 00:00:00.000

The OrderDetails table will look like this:

orderid partno qty

1 101 5

1 102 10

2 101 8

2 102 2

The OrderPayments table will look like this:

orderid paymentno value

1 1 75

1 2 75

2 1 50

2 2 50

The expected output is shown in the following table:

orderid custid odate sum_of_qty sum_of_value

1 1001 2001−01−18 00:00:00.000 15 150

2 1002 2001−02−12 00:00:00.000 10 100

How can you solve this?

To solve this problem you need to write a query with two derived table in SQL Server 2000.

Derived tables are Virtual Tables in SQL Server that are similar to views but differs from views by not storing the definition of it in the Metadata. The Derived Table is a result set of a Query that is used in the FROM Clause of an outer Query and it will be no longer available after the execution outer query. Derived Tables can be nested but may not be co-related.

Let’s go to the solution.

Query listing:1

Select OQ.OrderId,OQ.CustId,OQ.ODate , QtySum,Paysum

From (

Select Orders.OrderId, Orders.CustId,Orders.Odate,

Sum(OD.QTY) as QtySum

From Orders Join OrderDetails OD

On Orders.OrderID = OD.OrderId

Group By Orders.OrderId,Orders.CustId, Orders.Odate

) OQ

Join

( Select O1.OrderID, Sum(op.[value]) as PaySum

from Orders O1

Join OrderPayments OP

on O1.OrderId = Op.Orderid

Group By O1.OrderId ) as OP1

On OQ.OrderID = OP1.OrderId

This query is gives you the desired solution (which is also available in that book at page no 494)

As this book is dealing SQL Server 2000 T-SQL, the author used this derived table way.

Using Derived Table for complex queries having some disadvantages. It reduces the readability of the query and as well as the debugging complex queries (using derived tables) is a big job.

SQL Server 2005 offers a novel way to handle these kinds of problems. Using CTE (Common Table Expressions) you can improve readability, and easily maintain a complex query by building such queries in several blocks.

CTE offers you something that derive tables doesn’t, are, Derived Tables are not referred multiple time in a query, and CTE can be used as recursive queries.

First I show you the CTE way solution to that problem

Query listing:2

With QTYSum as

(Select Ord.OrderId,Sum(OD.Qty) as PartQtySum

From Orders Ord

Join OrderDetails OD

on OD.OrderId = Ord.OrderId

Group By Ord.OrderId

), --Block 1

PaySum As

(Select O.OrderId,Sum([Value]) as PaySum

From Orders O

Join OrderPayments OP

On O.OrderId = OP.OrderId

Group By O.OrderId

) -- Block 2

Select Orders.OrderID, Orders.CustId, Orders.Odate, PartQtySum, PaySum

From Orders

Join Paysum on Orders.OrderId = Paysum.Orderid

Join QTYSum on Orders.ORderId = QTYSum.OrderID

Here, we used the same lines of code, but separating its appearance, which gives us better readability. The OQ alias Select Query Statement in the Query listing:1 was replaced by the Block 1 Query of the Query listing:2 and the OP1 alias Select Query Statement in the Query listing:1 was replaced by the Block 2 Query of the Query listing:2

But this CTE is not used multiple references and recursive queries. Let us see some other example for these options.

Suppose you want to compare each year’s Sales of a particular Employee with his or her previous year’s sales, the CTE can be used as follows.(NorthWind Database’s Orders Table is used for illustration)

With YearToDateSales as

( Select EmployeeId,Year(OrderDate) AS OrderYear,

Count(*) as [Sales]

From dbo.Orders

Where EmployeeId=1

Group by EmployeeId,Year(OrderDate)

)

Select Cur.OrderYear,Cur.Sales , Prv.Sales

From YearToDateSales as Cur

Left Outer Join YearToDateSales Prv

                       on Prv.OrderYear = Cur.OrderYear -1
 

In this example, a CTE named ‘YearToDateSales‘ was referred two times in the outer query with two different aliases(Cur, Prv). The Employee ID is hard-coded, can be replaced with an argument. Add the following two lines before the With caluse of the CTE.

Declare @empId as int;

set @empID=1;

And replace the ‘Where EmployeeId=1’ withWhere EmployeeId=@empID’.

Suppose a CTE is used in batch like the above argumented CTE, the previous statement of With should be terminated by semicolon.

Now for a recursive CTE, consider a query that has to fetch the hierarchy of organisation using its Employee table.

WITH DirectReports(ManagerID, EmployeeID, EmployeeLevel) AS

(

SELECT ManagerID, EmployeeID, 0 AS EmployeeLevel

FROM HumanResources.Employee

WHERE ManagerID IS NULL

UNION ALL

SELECT e.ManagerID, e.EmployeeID, EmployeeLevel + 1

FROM HumanResources.Employee e

INNER JOIN DirectReports d -- This line is calling the CTE

itself recursively

ON e.ManagerID = d.EmployeeID

)

SELECT ManagerID, EmployeeID, EmployeeLevel

FROM DirectReports

WHERE EmployeeLevel <= 2 ;

For further Studies

http://msdn2.microsoft.com/en-us/library/ms175972.aspx

http://www.dotnetheaven.com/Uploadfile/DipalChoksi/introcte_sql2005_dc08032006001855AM/introcte_sql2005_dc.aspx

http://www.4guysfromrolla.com/ASPScripts/PrintPage.asp?REF=%2Fwebtech%2F071906-1.shtml

Tuesday, February 27, 2007

A Date With DATETIME

Consider a simple(!?) query executed against the SQL Server 2005 sample database 'AdventureWorks'

SELECT soh.SalesPersonID, sum(sod.LineTotal) as amount
FROM sales.SalesOrderHeader soh
JOIN sales.SalesOrderDetail sod
ON sod.SalesOrderID = soh.SalesOrderID
WHERE soh.Status = 5 -- complete
and soh.OrderDate >= '20040101'
GROUP by soh.SalesPersonID

Does it work fine...? Definitely it will work fine because it is a well tested by an author and a corporate database developer and architect-LOUIS DAVIDSON's Query illustrated in the book Apress : Pro SQL Server 2005. LOUIS DAVIDSON is one among the group of authors and he wrote the chapter 3 where this query is used.

I am not going to dig into this query and how it is working. Let us look in to the bold characters in the above query soh.OrderDate >= '20040101'


In this query , soh.OrderDate is datetime column and it is compared with a string expression '20040101' .

On focusing this portion, we have to recollect some simple things that we have already aware of, how SQL Server Stores a date value in available datetime datatypes.

  • datetime : by consuming 8 byte integers for a single value, stores the number of days before or after the base date 1st January 1900 in the first four bytes, and the time of the day measured in 3 1/3 millisecond units after the midnight in the last four bytes.
  • smalldatetime: by consuming 2 two bytes integers for a single value, stores the number of days after the base date 1st January 1900 in the first two bytes, and the time measured in minutes after midnight in the last two bytes.

Right!?

In the above query does not worry about the time. But for us in many cases as we are as a developer or a DBA it is a major factor.

Here in this query, the string expression '20040101' is implicitly converted to datetime, because according to the hierarchy of the precedence datetime has the higher precedence than the character string. This expression does not include time part, and the data in the table is also not time aware. What will happen to a query (which includes this type of string literals) that is so conscious about time part stored in the column?

Normally a datetime string literal which includes a milliseconds and when it is implicitly converted to datetime, it will be rounded to the nearest three hundredth milliseconds. If the same is converted to smalldatetime it will rounded to nearest minute.

The Author Itzik Ben-Gan, of the book 'MSPress: Inside Microsoft SQL Server 2005: T-SQL Programming' advise to explicitly convert the string to datetime using datetime functions supported by SQL Server (which perform correct calculations by taking leap year and other calendar calculations in account) whenever possible.

One more thing we have to consider in this regard is the Language Settings. SQL Server's datetime functions are depend on the language setting either the default setting of the Server or the Session. Session's setting has the higher precedence.

For further studies :

Monday, February 26, 2007

SQL Server 2005.... will change your thinking..!

Hello All.....,

It is my great pleasure to express my views on SQL Server 2005.

From the very first day I started working on the New Version of SQL Server 2005, I feel in heaven, by seeing its new features like... CTE, New Server side Paging functions and much more...., I started shouting on standing on my table.

It changed my views on using SQL Server and may be yours too...!

From that day I thought of starting a blog like this not only for sharing my views but also gaining best of other SQL Server Experts like you.

It will be an open thought discussion blog. You are always welcome to post your views , questions and your day to day problems you are facing using SQL Server 2005 or SQL Server 2000.

I, as a creator of this blog, assure you that your frequent visit to this blog will help you as well as me to get to know more what we knew about this Software and its usage..

In near future I planned to include some kind of learning issues to this blog.

Hope your appreciation by participating in this blog.

Suresh Barathan