Database Management Systems II - Prof. Holowczak

Zicklin School of Business - Baruch College
City University of New York



Query Processing and Optimization - Connolly and Begg Textbook Notation

The following notes make use of the notations from the Connolly and Begg textbook.

What You'll Learn This Week


Query Processing

Elmasri/Navathe (3rd Edition): Chapter 18
Connolly and Begg (3rd Edition): Chapter 20

A query is processed in four general steps:

  1. Scanning and Parsing
  2. Query Optimization or planning the execution strategy
  3. Query Code Generator (interpreted or compiled)
  4. Execution in the runtime database processor

1. Scanning and Parsing

2. Query Optimization or Planning the Execution Strategy

3. Query Code Generator (interpreted or compiled)

4. Execution in the runtime database processor

Query Optimization

Query optimization is divided into two types: Heuristic (sometimes called Rule-based) and Systematic (Cost-based).

Heuristic Query Optimization

SELECT PNUMBER, DNUM, LNAME
FROM   PROJECT, DEPARTMENT, EMPLOYEE
WHERE  DNUM = DNUMBER
AND    MGRSSN = SSN
AND    PLOCATION = 'Stafford';

Or, in relational algebra:

π PNUMBER, DNUM, LNAME ( σ PLOCATION='Stafford' (PROJECT) ⋈ DNUM=DNUMBER DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE )

on the following schema:
EMPLOYEE TABLE:
FNAME    MI LNAME    SSN        BDATE      ADDRESS                   S  SALARY  SUPERSSN   DNO
-------- -- -------- ---------  ---------  ------------------------  -  ------  ---------  ---
JOHN     B  SMITH    123456789  09-JAN-55  731 FONDREN, HOUSTON, TX  M  30000   333445555  5
FRANKLIN T  WONG     333445555  08-DEC-45  638 VOSS, HOUSTON, TX     M  40000   888665555  5
ALICIA   J  ZELAYA   999887777  19-JUL-58  3321 CASTLE, SPRING, TX   F  25000   987654321  4
JENNIFER S  WALLACE  987654321  20-JUN-31  291 BERRY, BELLAIRE, TX   F  43000   888665555  4
RAMESH   K  NARAYAN  666884444  15-SEP-52  975 FIRE OAK, HUMBLE, TX  M  38000   333445555  5
JOYCE    A  ENGLISH  453453453  31-JUL-62  5631 RICE, HOUSTON, TX    F  25000   333445555  5
AHMAD    V  JABBAR   987987987  29-MAR-59  980 DALLAS, HOUSTON, TX   M  25000   987654321  4
JAMES    E  BORG     888665555  10-NOV-27  450 STONE, HOUSTON, TX    M  55000   null       1

DEPARTMENT TABLE:
DNAME           DNUMBER  MGRSSN     MGRSTARTD
--------------- -------  ---------  ---------
HEADQUARTERS    1        888665555  19-JUN-71
ADMINISTRATION  4        987654321  01-JAN-85
RESEARCH        5        333445555  22-MAY-78

PROJECT TABLE:
PNAME            PNUMBER  PLOCATION  DNUM
---------------- -------  ---------  ----
ProductX         1        Bellaire   5
ProductY         2        Sugarland  5
ProductZ         3        Houston    5
Computerization  10       Stafford   4
Reorganization   20       Houston    1
NewBenefits      30       Stafford   4

WORKS_ON TABLE:
ESSN       PNO  HOURS
---------  ---  -----
123456789  1    32.5
123456789  2    7.5
666884444  3    40.0
453453453  1    20.0
453453453  2    20.0
333445555  2    10.0
333445555  3    10.0
333445555  10   10.0
333445555  20   10.0
999887777  30   30.0
999887777  10   10.0
987987987  10   35.0
987987987  30   5.0
987654321  30   20.0
987654321  20   15.0
888665555  20   null


The left hand tree is evaluated in steps as follows:

The right hand tree is evaluated in steps as follows:

Example of Heuristic Query Optimization

1. Original Query Tree
2. Use Rule 1 to Break up Cascading Selections
3. Commute Selection with Cross Product
4. Combine Cross Product and Selection to form Joins

Systematic (Cost based) Query Optimization

Note: The following notes are based upon materials presented in the Connolly/Begg 3rd edition. The notations differ between textbooks.

Metadata stored in the system catalog

  1. Blocking factor bFactor(R) - the number of data records (tuples) stored per block. The total number of tuples in R is given as nTuples(R).
  2. Indexes on tables - primary, secondary, etc. The number of index levels for index I on attribute A is given as
    nLevelsA(I).
  3. The number of blocks that make up the first (leaf) level of the index, given as:
    nLfBlocksA(I)
  4. Number of distinct values of an indexing attribute (such as a key)
    nDistinctA(R)
    - used for estimating selectivity
  5. Selection cardinality SCA(R) of an attribute A is the average number of records that will satisfy an equality selection condition on that attribute.
    For non-key attributes, we can estimate:
    SCA(R) = nTuples(R) / nDistinctA(R)
    (assuming the values of A are uniformly distributed across its nDistinctA(R) distinct values)
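As a quick sketch of how these catalog statistics combine, the selection-cardinality estimate can be computed directly. This Python sketch uses the EMPLOYEE/DNO figures that appear in the worked examples later in these notes:

```python
# Catalog statistics for EMPLOYEE (E) and its DNO attribute,
# as used in the worked examples later in these notes.
n_tuples_e = 10_000      # nTuples(E)
n_distinct_dno = 125     # nDistinctDNO(E)

# SC_DNO(E) = nTuples(E) / nDistinctDNO(E), assuming DNO values are
# uniformly distributed across the 125 distinct values.
sc_dno = n_tuples_e / n_distinct_dno
print(sc_dno)  # 80.0
```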

Basic variables and notations used

Variable                                            Connolly/Begg notation
Number of records (tuples) in a table (relation)    nTuples(R)
Record (tuple) size in bytes                        RecordSize(R)
Blocking factor (records per block)                 bFactor(R) = BlockSize / RecordSize(R)
                                                    (apply a floor function to this result
                                                    for unspanned records)
Number of blocks required to store the table        nBlocks(R) = nTuples(R) / bFactor(R)
                                                    (apply a ceiling function to this result)
Number of levels of an index                        nLevelsA(I)
Number of blocks in the first (leaf) index level    nLfBlocksA(I)
Number of distinct values of an attribute           nDistinctA(R)

Selection cardinality (average number of records satisfying an equality condition,
assuming a uniform distribution of distinct values):
  For a non-key:  SCA(R) = nTuples(R) / nDistinctA(R)
  For a key:      SCA(R) = 1

Selection cardinality for non-equality conditions
(assuming a uniform distribution of values):
  For an inequality (A > c):
    SCA(R) = nTuples(R) * ( (maxA(R) - c) / (maxA(R) - minA(R)) )
  For an inequality (A < c), according to the textbook:
    SCA(R) = nTuples(R) * ( (c - maxA(R)) / (maxA(R) - minA(R)) )
  Perhaps this is better:
    SCA(R) = nTuples(R) * ( (c - minA(R)) / (maxA(R) - minA(R)) )
  For a conjunctive condition (A ∧ B):
    SCA∧B(R) = SCA(R) * SCB(R)
  For a disjunctive condition (A ∨ B):
    SCA∨B(R) = SCA(R) + SCB(R) - SCA(R) * SCB(R)
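These estimation rules are easy to express as code. The Python sketch below is my own rendering (the function names are not from the textbook); it uses the (c - min) form for A < c, and applies the conjunctive/disjunctive rules to selectivities expressed as fractions of nTuples(R), with the conditions assumed independent:

```python
def sc_equality(n_tuples, n_distinct):
    # SC_A(R) for A = c on a non-key attribute (uniform distribution).
    return n_tuples / n_distinct

def sc_greater_than(n_tuples, c, min_a, max_a):
    # SC_A(R) for A > c: the fraction of the value range above c.
    return n_tuples * (max_a - c) / (max_a - min_a)

def sc_less_than(n_tuples, c, min_a, max_a):
    # SC_A(R) for A < c, using the (c - min) form suggested above
    # rather than the textbook's (c - max), which can go negative.
    return n_tuples * (c - min_a) / (max_a - min_a)

def sel_and(p_a, p_b):
    # Combined selectivity for A AND B, with p_a and p_b expressed as
    # fractions of nTuples(R) and the two conditions assumed independent.
    return p_a * p_b

def sel_or(p_a, p_b):
    # Combined selectivity for A OR B (inclusion-exclusion).
    return p_a + p_b - p_a * p_b

# EMPLOYEE.DNO example: 10,000 tuples, DNO ranging from 1 to 125.
print(sc_equality(10_000, 125))            # 80.0
print(sc_greater_than(10_000, 5, 1, 125))  # ~9677.4
```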

Cost functions for operations

Condition / access method                                      Connolly/Begg notation
Non-equality condition, or equality on a non-key:
  full table scan                                              nBlocks(R)
Equality condition on a key in unordered data:
  linear search (average)                                      nBlocks(R) / 2
Equality condition on a key in ordered data:
  binary search                                                log2( nBlocks(R) )
Equality condition on a key using an index                     nLevelsA(I) + 1
Equality condition on a non-key, or non-equality condition,
retrieving multiple records using a multi-level index
(assume 1/2 of the records match the condition)                nLevelsA(I) + nBlocks(R) / 2
Equality condition on a non-key using a clustering index       nLevelsA(I) + SCA(R) / bFactor(R)
Equality condition on a non-key using a secondary index
(in the worst case, each matching record may reside
on a different block)                                          nLevelsA(I) + SCA(R)
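A sketch of these cost formulas as Python functions (the names are mine, not the textbook's; all costs are in block accesses):

```python
from math import ceil, log2

def cost_full_scan(n_blocks):
    # Non-equality condition, or equality on a non-key: read every block.
    return n_blocks

def cost_linear_key(n_blocks):
    # Equality on a key, unordered data: on average, half the blocks.
    return n_blocks / 2

def cost_binary_search(n_blocks):
    # Equality on a key, ordered data.
    return ceil(log2(n_blocks))

def cost_index_key(n_levels):
    # Equality on a key via an index: walk the levels, then one data block.
    return n_levels + 1

def cost_clustering_index(n_levels, sc, b_factor):
    # Equality on a non-key via a clustering index: matches are contiguous.
    return n_levels + ceil(sc / b_factor)

def cost_secondary_index(n_levels, sc):
    # Equality on a non-key via a secondary index: worst case, each
    # matching record is on a different block.
    return n_levels + sc

# EMPLOYEE example used below: 2000 blocks, 4-level SSN index,
# 2-level DNO index with SC_DNO(E) = 80.
print(cost_index_key(4))            # 5
print(cost_secondary_index(2, 80))  # 82
```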

Single Table Query Examples

Description                                                    Connolly/Begg notation
EMPLOYEE table has 10,000 records with a blocking factor of 5  nTuples(E) = 10000
                                                               bFactor(E) = 5
                                                               nBlocks(E) = 2000
Secondary index on the key attribute SSN                       nLevelsSSN(E.SSN) = 4
                                                               SCSSN(E) = 1
Secondary index on the non-key attribute DNO                   nLevelsDNO(E.DNO) = 2
                                                               nLfBlocksDNO(E.DNO) = 4
There are 125 distinct values of DNO                           nDistinctDNO(E) = 125
Selection cardinality of DNO                                   SCDNO(E) = nTuples(E) / nDistinctDNO(E) = 80

Query 1: SELECT * FROM employee WHERE SSN=123456789

   1. Cost of a Full table Scan nBlocks(E) = 2000 blocks
   2. Average cost with no index (linear search): nBlocks(E) / 2 = 1000 blocks
   3. Cost when using a secondary index on SSN nLevelsSSN(E.SSN) + 1 = 5 blocks

Query 2: SELECT * FROM employee WHERE DNO = 5

   1. Cost of a Full table Scan nBlocks(E) = 2000 blocks
   2. Rough estimate using the secondary index nLevelsDNO(E.DNO) + SCDNO(E) = 82

Query 3: SELECT * FROM employee WHERE DNO > 5

   1. Cost of a Full table Scan nBlocks(E) = 2000 blocks
   2. Rough estimate using the secondary index nLevelsDNO(E.DNO) + SCDNO(E) = 9,680 blocks
Where SCDNO(E) is estimated using the formula for inequality (A > c):
SCDNO(E) = nTuples(E) * ( (maxDNO(E) - c) / (maxDNO(E) - minDNO(E)) )
   = 10,000 * ( (125 - 5) / (125 - 1) ) = 9677.419, rounded up to 9,678 records

Query 4: SELECT * FROM employee WHERE SALARY > 30000 AND DNO > 4

This needs to be done in two steps: σ salary > 30000 ( σ dno > 4 (E) )   or
TEMP <- σ dno > 4 (E)
σ salary > 30000 (TEMP)

   1. Cost of two full table scans: nBlocks(E) + nBlocks(TEMP) <= 4000 blocks
   2. Rough estimate using the secondary index followed by a full table scan of TEMP:
( nLevelsDNO(E.DNO) + SCDNO(E) ) + nBlocks(TEMP) <= 2082 blocks
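The four single-table estimates above can be reproduced with a short calculation. This Python sketch plugs in the catalog statistics given for EMPLOYEE:

```python
from math import ceil

n_blocks_e = 2000         # nBlocks(E)
n_tuples_e = 10_000       # nTuples(E)

# Query 1: key equality via the 4-level SSN secondary index.
q1 = 4 + 1                                              # 5 blocks

# Query 2: non-key equality via the 2-level DNO index, SC_DNO(E) = 80.
q2 = 2 + 80                                             # 82 blocks

# Query 3: DNO > 5; SC estimated from the inequality formula,
# with DNO values ranging from 1 to 125.
sc_dno_gt5 = ceil(n_tuples_e * (125 - 5) / (125 - 1))   # 9678 records
q3 = 2 + sc_dno_gt5                                     # 9680 blocks

# Query 4: index retrieval for DNO > 4 followed by a full scan of TEMP,
# bounding nBlocks(TEMP) by nBlocks(E) = 2000 as the notes do.
q4 = (2 + 80) + n_blocks_e                              # <= 2082 blocks

print(q1, q2, q3, q4)  # 5 82 9680 2082
```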


Cost function estimates for Join Operations

Variable Connolly/Begg notation
Number of blocks for each relation nBlocks(R)
nBlocks(S)
Number of tuples for each relation nTuples(R)
nTuples(S)
Blocking factor of the joined relation T bFactor(T) = BlockSize / (RecordSize(R) + RecordSize(S))
Size of cartesian product of R and S nTuples(R) * nTuples(S)
nBlocks(R) * nBlocks(S)
Join Cardinalities (Connolly/Begg)
Join R and S resulting in T
R ⋈ R.A = S.B S
If A is a key of R then nTuples(T) <= nTuples(S)
If B is a key of S then nTuples(T) <= nTuples(R)

Otherwise, estimate as
nTuples(T) = SCA(R) * nTuples(S)       or
nTuples(T) = SCB(S) * nTuples(R)

Join type Connolly/Begg
Nested loop join nBlocks(R) + ( nBlocks(R) * nBlocks(S) ) + [nTuples(T) / bFactor(T)]
Indexed Join
Assuming an index on S.B
nBlocks(R) + [nTuples(R) * (nLevelsB(I) + SCB(S))] + [nTuples(T) / bFactor(T)]
Sort-Merge Join (assuming tables
are already sorted on the same attributes)
nBlocks(R) + nBlocks(S) + [nTuples(T) / bFactor(T)]
Cost to sort relation R
Cost to sort relation S
nBlocks(R) * [log2nBlocks(R)]
nBlocks(S) * [log2nBlocks(S)]
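These join cost formulas can be sketched as Python functions (the names are mine; the [x] brackets in the table are rendered here as ceiling):

```python
from math import ceil, log2

def result_blocks(nt_t, bf_t):
    # Blocks written for the joined result T.
    return ceil(nt_t / bf_t)

def nested_loop_cost(nb_outer, nb_inner, nt_t, bf_t):
    # Outer relation read once; inner relation read once per outer block.
    return nb_outer + nb_outer * nb_inner + result_blocks(nt_t, bf_t)

def indexed_join_cost(nb_r, nt_r, n_levels_b, sc_b, nt_t, bf_t):
    # Probe the index on S.B once per tuple of R.
    return nb_r + nt_r * (n_levels_b + sc_b) + result_blocks(nt_t, bf_t)

def sort_merge_cost(nb_r, nb_s, nt_t, bf_t, presorted=True):
    # Merge cost; add the two sort costs if the inputs are not yet sorted.
    cost = nb_r + nb_s + result_blocks(nt_t, bf_t)
    if not presorted:
        cost += nb_r * ceil(log2(nb_r)) + nb_s * ceil(log2(nb_s))
    return cost
```

With the EMPLOYEE/DEPARTMENT statistics used below, these functions reproduce the figures in the example (for instance, a nested loop with EMPLOYEE outside costs 30,500 blocks).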

Example Joins

Description                                                    Connolly/Begg notation
EMPLOYEE table has 10,000 records with a blocking factor of 5  nTuples(E) = 10000
                                                               bFactor(E) = 5
                                                               nBlocks(E) = 2000
Secondary index on the key attribute SSN                       nLevelsSSN(E.SSN) = 4
                                                               SCSSN(E) = 1
Secondary index on the non-key attribute DNO                   nLevelsDNO(E.DNO) = 2
                                                               nLfBlocksDNO(E.DNO) = 4
There are 125 distinct values of DNO                           nDistinctDNO(E) = 125
Selection cardinality of DNO                                   SCDNO(E) = nTuples(E) / nDistinctDNO(E) = 80
DEPARTMENT table has 125 records with a blocking factor of 10  nTuples(D) = 125
                                                               bFactor(D) = 10
                                                               nBlocks(D) = 13
Single-level primary index on DNUMBER                          nLevelsDNUMBER(D.DNUMBER) = 1
Secondary index on MGRSSN                                      nLevelsMGRSSN(D.MGRSSN) = 2
                                                               SCMGRSSN(D) = 1
Join cardinality estimate (DNUMBER is a key of DEPARTMENT)     nTuples(T) <= nTuples(E) = 10000
Blocking factor for the resulting joined table                 bFactor(ED) = 4

Join query:

SELECT employee.*, department.*
FROM   employee, department
WHERE  employee.dno = department.dnumber
Join Access Routine                Formulas and Notations
Use nested loop with EMPLOYEE on the outside:
nTuples(T) = 10,000
bFactor(T) = bFactor(ED) = 4
nBlocks(E) + (nBlocks(E) * nBlocks(D)) + [nTuples(T) / bFactor(T)]
= 2000 + (2000 * 13) + [ 10000 / 4 ]
= 30,500
Use nested loop with DEPARTMENT on the outside:
nBlocks(D) + (nBlocks(E) * nBlocks(D) ) + [nTuples(T) / bFactor(T)]
= 13 + (2000 * 13) + [ 10000 / 4 ]
= 28,513
Use index structure on DNUMBER with EMPLOYEE on the outside:
nBlocks(E) + (nTuples(E) * (nLevelsDNUMBER(D.DNUMBER) + 1)) + [nTuples(T) / bFactor(T)]
= 2000 + (10000 * 2) + [ 10000 / 4 ]
= 24,500
Use index structure on DNO with DEPARTMENT on the outside:
nBlocks(D) + (nTuples(D) * (nLevelsDNO(E.DNO) + SCDNO(E))) + [nTuples(T) / bFactor(T)]
= 13 + (125 * (2 + 80)) + [ 10000 / 4 ]
= 12,763
Sort-Merge Join:
Sort: [nBlocks(E) * log2nBlocks(E)] + [nBlocks(D) * log2nBlocks(D)]
= 2000 * 11 + 13 * 4
= 22,052

Join: nBlocks(E) + nBlocks(D) + [nTuples(T) / bFactor(T)] = 2000 + 13 + [ 10000 / 4 ] = 4,513

Total cost: 22052 + 4513 = 26,565
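Putting the five strategies side by side makes the optimizer's choice explicit. This Python sketch uses the statistics above and picks the cheapest plan:

```python
from math import ceil, log2

spill = ceil(10_000 / 4)   # write the joined result T: 2500 blocks

costs = {
    "nested loop, EMPLOYEE outer":   2000 + 2000 * 13 + spill,
    "nested loop, DEPARTMENT outer": 13 + 2000 * 13 + spill,
    "DNUMBER index, EMPLOYEE outer": 2000 + 10_000 * (1 + 1) + spill,
    "DNO index, DEPARTMENT outer":   13 + 125 * (2 + 80) + spill,
    "sort-merge (incl. sorting)":    (2000 * ceil(log2(2000))
                                      + 13 * ceil(log2(13)))
                                     + 2000 + 13 + spill,
}
best = min(costs, key=costs.get)
print(best, costs[best])   # DNO index, DEPARTMENT outer 12763
```

Using the index on DNO with the much smaller DEPARTMENT table on the outside is cheapest by a wide margin.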

  • Another example:
    SELECT department.*, employee.*
    FROM   department, employee
    WHERE  department.mgrssn = employee.ssn
    
    Assume an employee may manage at most one department. So join cardinality = 125 (nTuples(T))
    1. Use nested loop with DEPARTMENT on the outside:

      nBlocks(D) + (nBlocks(D) * nBlocks(E)) + [nTuples(T) / bFactor(T)]
      = 13 + (13 * 2000) + [ 125 / 4 ]
      = 13 + 26000 + [125 / 4]
      = 26,045

    Query Optimization Examples

    Table Name  # of Records     Record Size  Blocking Factor  # of Blocks     Indexes             Selection Card.
    EMPLOYEE    nTuples(E) = 8   100 bytes    bFactor(E) = 2   nBlocks(E) = 4  nLevelsssn(E) = 2   SCssn(E) = 1
    PROJECT     nTuples(P) = 6   55 bytes     bFactor(P) = 3   nBlocks(P) = 2  nLevelsdnum(P) = 1  SCdnum(P) = nTuples(P) / nDistinctdnum(P) = 6/3 = 2
    WORKS_ON    nTuples(W) = 16  35 bytes     bFactor(W) = 5   nBlocks(W) = 4  no indexes          SChours(W) = nTuples(W) / nDistincthours(W) = 16/9 = 2

    Ordering #1: Selection 1, Selection 2, Join 1, Join 2, Sort, Projection

    1. Selection 1: Select only those records where PROJECT.DNUM = 5
      • Method a: Equality condition on a non-key: Full table scan
        Cost: nBlocks(P) = 2 blocks
      • Method b: Using clustering index
        Algorithm: Multiple records using a clustering index: nLevelsdnum + ( SCdnum / bFactor(P) )
        We need to know the selection cardinality of dnum. We can estimate: SCdnum(P) = nTuples(P) / nDistinctdnum(P)
        From the data we can see that there are 3 distinct values of dnum
        SCdnum = ( 6 / 3 ) = 2
        Remember, this is only an estimate of SCdnum
        Cost: nLevelsdnum + ( SCdnum / bFactor(P) ) = 1 + (2 / 3) = 1.67, rounded up to 2 blocks
      So in this step, it does not matter if we use the index or not, the cost will be the same: Cost is 2 blocks (Step 1)

      The result of this step is a temporary table we will call P1 with the following characteristics.
      Note the record size remains the same, only the number of records has been reduced according to our estimate of SCdnum(P)

      Table Name  # of Records      Record Size  Blocking Factor   # of Blocks
      P1          nTuples(P1) = 2   55 bytes     bFactor(P1) = 3   nBlocks(P1) = 1

      Temporary table P1 has no indexes associated with it (as is the case with all temporary tables).

    2. Selection 2: Select only those records where WORKS_ON.HOURS > 10
      Since there are no indexes on the HOURS column, and since the condition is non-equality, we are left with only full table scan.
      • Method a: Full table scan
        Algorithm: nBlocks(W) Cost: 4 blocks
      Thus the cost for this step is: Cost = 4 blocks (Step 2)

      The result of this step is a temporary table we will call W1 with the following characteristics.
      Note the record size remains the same, only the number of records has been reduced.
      Our rough estimate is that 1/2 of the WORKS_ON records will have HOURS > 10:

      Table Name  # of Records      Record Size  Blocking Factor   # of Blocks
      W1          nTuples(W1) = 8   35 bytes     bFactor(W1) = 5   nBlocks(W1) = 2

      Temporary table W1 has no indexes associated with it.

      At this point, the two temporary tables W1 and P1 would look like the following (note the DBMS does not actually know this at the planning stage; this is just for illustrative purposes only):
      P1 TEMP TABLE:
      PNAME     PNUMBER  PLOCATION  DNUM
      ProductX  1        Bellaire   5
      ProductY  2        Sugarland  5
      ProductZ  3        Houston    5

      W1 TEMP TABLE:
      ESSN       PNO  HOURS
      123456789  1    32.5
      666884444  3    40.0
      453453453  1    20.0
      453453453  2    20.0
      999887777  30   30.0
      987987987  10   35.0
      987654321  30   20.0
      987654321  20   15.0

    3. Join 1: Join WORKS_ON and PROJECT where WORKS_ON.PNO = PROJECT.PNUMBER
      Since we have already filtered out those PROJECT records with DNUM=5, we can use the P1 table in place of the full PROJECT table.
      Since we have already filtered out those WORKS_ON records with HOURS > 10, we can use the W1 table in place of the full WORKS_ON table.

      We need to know what the blocking factor of the resulting table will be.
      Since we are joining these two tables, the resulting temporary table (we will call W1P1) will have all of the columns from both tables, so we need to add up the sizes of their respective records:
      P1 record size is 55 bytes and W1 record size is 35 bytes, so the W1P1 record size will be 90 bytes.
      Thus, given a block size of 200 bytes, bFactor(W1P1) = floor(200 / 90) = 2

      We also need to know join selection cardinality for the new joined table.
      Since PROJECT.PNUMBER is the key of the PROJECT table, we can make use of the following estimation:
      If PNUMBER is a key of P1 then nTuples(W1P1) <= nTuples(W1)
      We therefore estimate nTuples(W1P1) = nTuples(W1) = 8.
      This is an upper bound, not an exact figure.

      • Method a: Nested loop (W1 on the outside)
        Algorithm: nBlocks(W1) + ( nBlocks(W1) * nBlocks(P1) ) + [ nTuples(W1P1) / bFactor(W1P1) ]
        Cost: 2 + (2 * 1) + [ 8 / 2] = 4 + [8/2] = 8 blocks (method a)

      • Method b: Nested loop (P1 on the outside)
        Algorithm: nBlocks(P1) + ( nBlocks(W1) * nBlocks(P1) ) + [ nTuples(W1P1) / bFactor(W1P1) ]
        Cost: 1 + (2 * 1) + [ 8 / 2] = 3 + [8/2] = 7 blocks (method b)

      In this step, it matters which table we put on the outside loop.
      Putting P1 on the outside loop (method b) gives the best performance: Cost is 7 blocks (Step 3)
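      The outer-table comparison above can be checked with a quick calculation (a Python sketch using the block counts from this step):

```python
from math import ceil

nb_w1, nb_p1 = 2, 1        # nBlocks(W1), nBlocks(P1)
result = ceil(8 / 2)       # nTuples(W1P1) / bFactor(W1P1) = 4 blocks

outer_w1 = nb_w1 + nb_w1 * nb_p1 + result   # 2 + 2 + 4 = 8 blocks
outer_p1 = nb_p1 + nb_w1 * nb_p1 + result   # 1 + 2 + 4 = 7 blocks
print(outer_w1, outer_p1)  # 8 7
```

The cross-product term is the same either way; only the cost of reading the outer table once differs, so the smaller table belongs on the outside.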

      The result of this step is a temporary table we will call W1P1 with the following characteristics:

      Table Name  # of Records (upper bound)      Record Size  Blocking Factor     # of Blocks
      W1P1        nTuples(W1P1) = 8 (estimate)    90 bytes     bFactor(W1P1) = 2   nBlocks(W1P1) = 4

      This W1P1 table would actually look like:

      ESSN        PNO  HOURS  PNAME     PNUMBER    PLOCATION   DNUM
      123456789   1    32.5   ProductX     1       Bellaire    5
      666884444   3    40.0   ProductZ     3       Houston     5
      453453453   1    20.0   ProductX     1       Bellaire    5
      453453453   2    20.0   ProductY     2       Sugarland   5
      

    4. Join 2: Join EMPLOYEE and WORKS_ON where EMPLOYEE.SSN = WORKS_ON.ESSN
      Once again, since we have already joined WORKS_ON with PROJECT (actually reduced versions of the WORKS_ON and PROJECT tables), we can make use of the W1P1 temporary table.

      We need to know what the blocking factor of the resulting table will be.
      Since we are joining these two tables, the resulting temporary table (E1W1P1) will have all of the columns from both tables, so we need to add up the sizes of their respective records:
      W1P1 record size is 90 bytes and E record size is 100 bytes, so the E1W1P1 record size will be 190 bytes.
      Thus, given a block size of 200 bytes, bFactor(E1W1P1) = floor(200 / 190) = 1

      We also need to know join selection cardinality for the resulting table.
      Since EMPLOYEE.SSN is the key of the EMPLOYEE table, we can make use of the following estimation:
      If SSN is a key of EMPLOYEE then nTuples(E1W1P1) <= nTuples(W1P1)
      We therefore estimate nTuples(E1W1P1) = nTuples(W1P1) = 8.
      This is an upper bound, not an exact figure.

      • Method a: Nested loop (EMPLOYEE on the outside)
        Algorithm: nBlocks(E) + (nBlocks(E) * nBlocks(W1P1)) + [ nTuples(E1W1P1) / bFactor(E1W1P1) ]
        Cost: 4 + (4 * 4) + [ 8 / 1] = 4 + 16 + [8/1] = 28 blocks (method a)

      • Method b: Nested loop (W1P1 on the outside)
        Algorithm: nBlocks(W1P1) + (nBlocks(E) * nBlocks(W1P1)) + [ nTuples(E1W1P1) / bFactor(E1W1P1) ]
        Cost: 4 + (4 * 4) + [ 8 / 1] = 4 + 16 + [8/1] = 28 blocks (method b)

      • Method c: Single loop making use of index on SSN (W1P1 on the outside)
        Algorithm: nBlocks(W1P1) + [ nTuples(W1P1) * (nLevelsssn(E) + SCssn(E)) ] + [ nTuples(E1W1P1) / bFactor(E1W1P1) ]
        Cost: 4 + [ 8 * (2 + 1) ] + [ 8 / 1] = 4 + 24 + [8/1] = 36 blocks (method c)

      Therefore, the lowest cost method is to use nested loop (either way) (method a or b) which gives: Cost = 28 blocks (Step 4)

      As before, we have a new temporary table called E1W1P1 with the following characteristics:

      Table Name  # of Records           Record Size  Blocking Factor       # of Blocks
      E1W1P1      nTuples(E1W1P1) = 8    190 bytes    bFactor(E1W1P1) = 1   nBlocks(E1W1P1) = 8

      The E1W1P1 table would look like this (all of the columns in EMPLOYEE would also be in there):

      ESSN        PNO  HOURS  PNAME     PNUMBER    PLOCATION   DNUM  FNAME     MINIT LNAME    SSN        ...
      123456789   1    32.5   ProductX     1       Bellaire    5     John      B     Smith    123456789
      666884444   3    40.0   ProductZ     3       Houston     5     Ramesh    K     Narayan  666884444
      453453453   1    20.0   ProductX     1       Bellaire    5     Joyce     A     English  453453453
      453453453   2    20.0   ProductY     2       Sugarland   5     Joyce     A     English  453453453
      

    5. Sort the final temp table E1W1P1 on the LNAME column.
      The cost to sort a relation is: nBlocks(R) * [log2nBlocks(R)]
      Cost: nBlocks(E1W1P1) * [log2nBlocks(E1W1P1)] = 8 * 3 = 24 blocks
      Cost is 24 blocks (Step 5)

    6. Project the FNAME, LNAME, PNAME columns from the final temp table
      We may consider doing this as part of the previous join operation in which case the cost of this step is 0.
    In summary, we have completed the 6 main steps for this query and each step produced the following costs:
    Selection 1: Select only those records where PROJECT.DNUM = 5            2 blocks
    Selection 2: Select only those records where WORKS_ON.HOURS > 10         4 blocks
    Join 1: Join WORKS_ON and PROJECT where WORKS_ON.PNO = PROJECT.PNUMBER   7 blocks
    Join 2: Join EMPLOYEE and W1P1 where EMPLOYEE.SSN = WORKS_ON.ESSN       28 blocks
    Sort: Sort the final temp table on the LNAME column                     24 blocks
    Projection: Project the FNAME, LNAME and PNAME columns                   0 blocks
      Total estimated cost for the query plan (using steps in this order)   65 blocks
    
    This represents the total cost, in blocks, given this specific order of operations: Selection 1, Selection 2, Join 1, Join 2, Sort, Projection.
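    The per-step costs for this ordering can be totaled in a few lines (a Python sketch; the sort term re-derives the 8 * log2(8) = 24 figure):

```python
from math import ceil, log2

step_costs = {
    "Selection 1: PROJECT.DNUM = 5":       2,
    "Selection 2: WORKS_ON.HOURS > 10":    4,
    "Join 1: W1 and P1 (P1 outer)":        7,
    "Join 2: EMPLOYEE and W1P1":           28,
    "Sort E1W1P1 on LNAME":                8 * ceil(log2(8)),  # 24
    "Projection (folded into the join)":   0,
}
print(sum(step_costs.values()))  # 65
```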

    Thus the above steps should then be repeated given other orders of operations. In fact, all of the possible plans (ordering of the operations) should be tried until the ordering of steps that produces the least total cost is found.

    In practice, it is generally most efficient if all of the simple selections are done first, followed by the joins in later steps.

    Now look at the optimization done using the Heuristic (rule based) approach:
    Original Query Tree
    Cascading the Selections
    Commuting Selection and Cartesian product
    Combining Selection and cartesian product to form joins



    Using Hints in Oracle SQL

    Using Hints in MS SQL Server

    Note: This material is from the MS SQL Server 7.0 Books Online.

    In MS SQL Server, query hints can be added using an OPTION clause at the end of the statement:

    SELECT   select_list
    FROM     table_source
    WHERE    search_condition
    GROUP BY group_by_expression
    HAVING   search_condition
    ORDER BY order_expression
    OPTION   (query_hints)

    Some of the query options available are:

    Oracle EXPLAIN PLAN

    DB2 Explain


    MS SQL Server Query Analyzer

    MS SQL Server 7.0 comes with a Query Analyzer that will display the execution plan for a given query.

    Assume the following query on the Northwind example database:

    SELECT   CompanyName, orders.OrderId, ProductName,
             SUM([order details].unitprice * [order details].quantity)
    FROM     [order details], orders, customers, products
    WHERE    [order details].orderid = orders.orderid
    AND      orders.customerid = customers.customerid
    AND      [order details].productid = products.productid
    GROUP BY CompanyName, orders.OrderId, ProductName

    The following diagram is produced automatically by the Query Analyzer:


    File: qp_index.html Date: 11:12 AM 10/8/2004
    All materials Copyright, 1997-2004 Richard Holowczak