Thursday, March 22, 2007

Answer lies beneath the way of Query Processing

The Question I asked in my previous post is why we should not use column alias used in SELECT list in Group BY Clause and we used that in ORDER By Clause.

The answer lies in the way how a SELECT Statement is processed.

SQL is differ from other programming language in an aspect how it is processing the Code. Normally the a code in a programming language is processed top down method, meaning, first written statement will be processed first. But SQL wont.

A 'SELECT' Statement is processed by SQL Server as follows.

FROM Clause is processed first.
On condition of Join Clause is Processed Second.
Join condition is Processed third.
Where is Processed fourth.
Group By is Processed fifth.
With {Cube} | {Rollup} is processed sixth.
Having is Processed seventh.
SELECT is processed eighth.
Distinct is processed 9th
ORDER By is processed 10th.
TOP Clause is processed 11th.

Suppose if you omitted any optional clauses that steps will be omitted in the processing.

Now look at the query.

Group By is Processed before the Select so we can not use the alias created in the SELECT in Group By.

And Order By is Processed after SELECT so we can use that in ORDER BY Clause.

For further reading about how queries are processed read Chapter 1 and 2 of the book.

Inside Microsoft® SQL Server™ 2005 T-SQL Querying
By Itzik Ben-Gan - (Solid Quality Learning), Lubor Kollar, Dejan Sarka

Thursday, March 15, 2007

A Little Question...!

In my last post, I have explained how CTE can be used in SQL Server 2005. Just take one query used in that post leads to the another topic.

Select EmployeeId,Year(OrderDate) AS OrderYear,
Count(*) as [Sales]
From dbo.Orders
Where EmployeeId=1
Group by EmployeeId,Year(OrderDate)
Order By OrderYear


In the above query the
column Year(OrderDate) is having a alias OrderYear. But this alias is not used in the Group By Clause why? Again the alias is used in Order By Clause. Why don't we use the column alias in the select list can not be used in Group By as in Order By?

Post your comments...!

Saturday, March 3, 2007

The Virtue of the Virtual Views


The following SQL Puzzle is found in the book APRESS: Advanced Transact –SQL for SQL Server 2000 in page no 71 of Chapter 2.

You are provided with these tables: Orders, OrderDetails, and OrderPayments. Each order in the Orders table can have one or more order parts in the OrderDetails table and zero or more payments in the OrderPayments table. Your task is to write a query that produces order information along with the sum of its order parts and order payments.

The Orders table will look like this:

Ordered custid odate

1 1001 2001−01−18 00:00:00.000

2 1002 2001−02−12 00:00:00.000

The OrderDetails table will look like this:

orderid partno qty

1 101 5

1 102 10

2 101 8

2 102 2

The OrderPayments table will look like this:

orderid paymentno value

1 1 75

1 2 75

2 1 50

2 2 50

The expected output is shown in the following table:

orderid custid odate sum_of_qty sum_of_value

1 1001 2001−01−18 00:00:00.000 15 150

2 1002 2001−02−12 00:00:00.000 10 100

How can you solve this?

To solve this problem you need to write a query with two derived table in SQL Server 2000.

Derived tables are Virtual Tables in SQL Server that are similar to views but differs from views by not storing the definition of it in the Metadata. The Derived Table is a result set of a Query that is used in the FROM Clause of an outer Query and it will be no longer available after the execution outer query. Derived Tables can be nested but may not be co-related.

Let’s go to the solution.

Query listing:1

Select OQ.OrderId,OQ.CustId,OQ.ODate , QtySum,Paysum

From (

Select Orders.OrderId, Orders.CustId,Orders.Odate,

Sum(OD.QTY) as QtySum

From Orders Join OrderDetails OD

On Orders.OrderID = OD.OrderId

Group By Orders.OrderId,Orders.CustId, Orders.Odate

) OQ

Join

( Select O1.OrderID, Sum(op.[value]) as PaySum

from Orders O1

Join OrderPayments OP

on O1.OrderId = Op.Orderid

Group By O1.OrderId ) as OP1

On OQ.OrderID = OP1.OrderId

This query is gives you the desired solution (which is also available in that book at page no 494)

As this book is dealing SQL Server 2000 T-SQL, the author used this derived table way.

Using Derived Table for complex queries having some disadvantages. It reduces the readability of the query and as well as the debugging complex queries (using derived tables) is a big job.

SQL Server 2005 offers a novel way to handle these kinds of problems. Using CTE (Common Table Expressions) you can improve readability, and easily maintain a complex query by building such queries in several blocks.

CTE offers you something that derive tables doesn’t, are, Derived Tables are not referred multiple time in a query, and CTE can be used as recursive queries.

First I show you the CTE way solution to that problem

Query listing:2

With QTYSum as

(Select Ord.OrderId,Sum(OD.Qty) as PartQtySum

From Orders Ord

Join OrderDetails OD

on OD.OrderId = Ord.OrderId

Group By Ord.OrderId

), --Block 1

PaySum As

(Select O.OrderId,Sum([Value]) as PaySum

From Orders O

Join OrderPayments OP

On O.OrderId = OP.OrderId

Group By O.OrderId

) -- Block 2

Select Orders.OrderID, Orders.CustId, Orders.Odate, PartQtySum, PaySum

From Orders

Join Paysum on Orders.OrderId = Paysum.Orderid

Join QTYSum on Orders.ORderId = QTYSum.OrderID

Here, we used the same lines of code, but separating its appearance, which gives us better readability. The OQ alias Select Query Statement in the Query listing:1 was replaced by the Block 1 Query of the Query listing:2 and the OP1 alias Select Query Statement in the Query listing:1 was replaced by the Block 2 Query of the Query listing:2

But this CTE is not used multiple references and recursive queries. Let us see some other example for these options.

Suppose you want to compare each year’s Sales of a particular Employee with his or her previous year’s sales, the CTE can be used as follows.(NorthWind Database’s Orders Table is used for illustration)

With YearToDateSales as

( Select EmployeeId,Year(OrderDate) AS OrderYear,

Count(*) as [Sales]

From dbo.Orders

Where EmployeeId=1

Group by EmployeeId,Year(OrderDate)

)

Select Cur.OrderYear,Cur.Sales , Prv.Sales

From YearToDateSales as Cur

Left Outer Join YearToDateSales Prv

                       on Prv.OrderYear = Cur.OrderYear -1
 

In this example, a CTE named ‘YearToDateSales‘ was referred two times in the outer query with two different aliases(Cur, Prv). The Employee ID is hard-coded, can be replaced with an argument. Add the following two lines before the With caluse of the CTE.

Declare @empId as int;

set @empID=1;

And replace the ‘Where EmployeeId=1’ withWhere EmployeeId=@empID’.

Suppose a CTE is used in batch like the above argumented CTE, the previous statement of With should be terminated by semicolon.

Now for a recursive CTE, consider a query that has to fetch the hierarchy of organisation using its Employee table.

WITH DirectReports(ManagerID, EmployeeID, EmployeeLevel) AS

(

SELECT ManagerID, EmployeeID, 0 AS EmployeeLevel

FROM HumanResources.Employee

WHERE ManagerID IS NULL

UNION ALL

SELECT e.ManagerID, e.EmployeeID, EmployeeLevel + 1

FROM HumanResources.Employee e

INNER JOIN DirectReports d -- This line is calling the CTE

itself recursively

ON e.ManagerID = d.EmployeeID

)

SELECT ManagerID, EmployeeID, EmployeeLevel

FROM DirectReports

WHERE EmployeeLevel <= 2 ;

For further Studies

http://msdn2.microsoft.com/en-us/library/ms175972.aspx

http://www.dotnetheaven.com/Uploadfile/DipalChoksi/introcte_sql2005_dc08032006001855AM/introcte_sql2005_dc.aspx

http://www.4guysfromrolla.com/ASPScripts/PrintPage.asp?REF=%2Fwebtech%2F071906-1.shtml