Query Processing Architecture When the user submits a query, it is first parsed by the DBMS, which verifies the syntax and type correctness of the query. Being a declarative language, SQL does not suggest concrete ways to evaluate its queries . Therefore, a parsed query has to be converted into a relational algebra expression, which can be evaluated directly using the algorithms presented in Chapter 10 . A typical SQL query such as SELECT FROM WHERE

DISTINCT TargetList

REL1 V1, . . . , RELn Vn

Condition

is normally translated into the following relational algebraic expression: 7TargetList(O~Condition'(REL, X . . . X RELn))

where Condition' is Condition converted from SQL syntax to relational algebra form . Section 10.1 has an example of such a transformation from SQL to the relational algebra .

410

CHAPTER 11

An Overview of Query Optimization

While the above algebraic expressions are straightforward and easy to produce, it might take ages to evaluate them . For one thing, they contain Cartesian products, so a join of four 100-block relations produces a 108-block intermediate rela tion, which, with a disk speed of 10 ms/page, takes about 50 hours just to write out . Even if we manage to convert the Cartesian product into equi-joins (as explained in Section 11 .2), we might still have to grapple with the long turnaround time (dozens of minutes) for the above query. It is the job of the query optimizer to bring this time down to seconds (or, for very complex queries, a few minutes) . A typical rule-based query optimizer uses a set of rules (e.g., an access path based on an index is better than a table scan) to construct a query execution plan. A cost-based query optimizer estimates the cost of query plans based on statistics maintained by the DBMS and uses this information, in addition to the rules, to choose a plan. The two main components of a cost-based query optimizer are the query execution plan generator and the plan cost estimator . A query execution plan can be thought of as a relational expression with concrete evaluation methods (or access paths, as we called them in Chapter 10) attached to each occurrence of a relational operator in the expression . Thus, the main job of the optimizer is to propose a single plan that can evaluate the given relational expression at a "reasonably cheap" cost according to the cost estimator. This plan is then passed to the query plan interpreter, a software component directly responsible for query evaluation according to the given plan. The overall architecture of query processing is depicted in Figure 11 .1 .

i Relational Algebra Expression 1 Query Optimizer ------------- -----------Query Plan Cost

Generator I I Estimator ____ I '~------------Query Execution Plan

FIGURE 11 .1 Typical architecture for

DBMS

query processing .

11 .2

Heuristic Optimization Based on Algebraic Equivalences

Heuristic Optimization Based on Algebraic Equivalences

The heuristics used in relational query evaluation are (for the most part) based on simple observations, such as that joining smaller relations is better than joining large ones, that performing an equi-join is better than computing a Cartesian product, and that computing several operations in just one relation scan is better than doing so in several scans. Most of these heuristics can be expressed in the form of relational algebra transformations, which take one expression and produce a different but equivalent expression . Not all transformations are optimizations by themselves. Sometimes they yield less efficient expressions . However, relational transformations are designed to work with other transformations to produce expressions that are better overall . We now present a number of heuristic transformations used by the query optimizers . Selection and projection-based transformations .

This transformation is known as cascading of selections. It is not an optimization per se, but it is useful in conjunction with other transformations (see the discussion on page 412 of pushing selections and projections through joins). acond, (7cond2 (R)) - acond2 (Clcond, (R)) . This transformation is called commutativity of selection . Like cascading, it is useful in conjunction with other transformations. 7att,(R) --_ natt,(,7attr,(R)), if attr c attr' and attr' is a subset of the attributes of R. This equivalence is known as cascading of projections and is used primarily with other transformations . '-Tcond1ncortd2 (R) =(Tcond,( 6cond z (R)) .

- acond(7 rattr(R)), if attr includes all attributes used in cond. This equivalence is known as the commutativity of selection and projection. It is usually used as a preparation step for pushing a selection or a projection through the join operator. nattr((Tcond(R))

The transformations used for cross products and joins are the usual commutativity and associativity rules for these operators . Cross product and join transformations . R>aS - S>< S) r>< T . The associativity and commutativity rules can be used to transform the latter expression into the former. In fact, the commutativity and associativity rules are largely responsible for the many alternative evaluation plans that might exist for the same query. A query that involves the join of N relations can have T(N) x N! query plans just to handle the join, where T(N) is the number of different binary trees with N leaf nodes. (N! is the number of permutations of N relations, and T(N) is the number of ways a particular permutation can be parenthesized .) This number grows very rapidly and is huge even for very small N. 1 A similar result holds for other commutative and associative operations (e.g., union), but our main focus is on join because it is the most expensive operation to compute . The job of the query optimizer is to estimate the cost of these plans (which can vary widely) and to choose one "good" plan. Because the number of plans is large, it can take longer to find a good plan than to evaluate the query by brute force. (It is faster to perform 106 I/Os than 15! in-memory operations .) To make query optimization practical, an optimizer typically looks at only a small subset of all possible plans, and its cost estimates are approximate at best . Therefore, query optimizers are very likely to miss the optimal plan and are actually designed only to find one that is "reasonable ." In other words, the "optimization" in "query optimizer" should always be taken with a grain of salt since it does not adequately describe what is being done by that component of the DBMS architecture . Pushing selections and projections through joins and Cartesian products . -cond(R x S) - R -co,,,, S . This rule is used when cond relates the attributes of both R and S. The basis for this heuristic is the belief that Cartesian products

should never be materialized. Instead, selections must always be combined with Cartesian products and the techniques for computing joins should be used. By applying the selection condition as soon as a row of R x S is created, we can save one scan and avoid storing a large intermediate relation . orcond(R x S)-- 6cond(R) x S, if the attributes used in cond all belong to R . This heuristic is based on the idea that if we absolutely must compute a Cartesian product, we should make the relations involved as small as possible . By pushing the selection down to R, we hope to reduce the size of R before it is used in the cross product.

1

When N = 4, T(4) is 5, and the number of all plans is 120 . When N = 5, T(5) = 14, and the number of all plans is 1680 .

11 .2

Heuristic Optimization Based on Algebraic Equivalences

ocond(R cond' S) _ ocond(R) >~corr& S, if the attributes in cond all belong to R . The rationale here is the same as for Cartesian products . Computing a join can be very expensive, and we must try to reduce the size of the relations involved . Note that if cond is a conjunction of comparison conditions, we can push each conjunct separately to either R or S as long as the attributes named in the conjunct belong to only one relation. 7rattr(R x S) - 7rattr(7rattr'(R) x S), if attributes(R) ;? attr' D (attr n attributes(R)), where attributes(R) denotes the set of all the attributes of R. The rationale for this rule is that, by pushing the projection inside the Cartesian product, we reduce the size of one of its operands. In Chapter 10, we saw that the I/O complexity of the join operation (of which x is a special case) is proportional to the number of pages in the relations involved . Thus, by applying the projection early we might reduce the number of page transfers needed to evaluate the cross product . ,7attr(R~cond S) _nattr(nattr'(R) -cond S), if attr' c attributes (R) is such that it contains all the attributes that R has in common with either attr or cond . The potential benefit here is the same as for the cross product . The important additional requirement is that attr' must include those attributes of R that are mentioned in cond . If some of these attributes are projected out, the expression nattr'(R) -cond S will not be syntactically correct . This requirement is unnecessary in the case of the Cartesian product since no join condition is involved. The rules for pushing selections and projections through joins and cross products are especially useful when combined with the rules for cascading o and n . For instance, consider the expression ff1,1C21C3(R x S), where cl involves the attributes of both R and S, C involves only the attributes of R, and c3 involves only the attributes of S . We can transform this expression into one that can be evaluated more efficiently by first cascading the selections, then pushing them down and finally eliminating the Cartesian product : ac i AC nc3 (R x S) - acl (oC2(oc3 (R x S))) - °ci (ac2 (R) x aC (S)) - oc2 (R) -c1

or,, (S)

We can optimize the expressions that involve projections in a similar way. Consider, for instance, natt,(R -c(,nd S). Suppose that attrl is a subset of the attributes in R such that attrl D attr n attributes(R) and such that attrl contains all the attributes in cond . Let attr2 be a similar set for S. Then nattr(R

cond S) - 7rattr( 17attr,(R -cond S)) = nattr(nattrl(R) -cond S) nattr(nattr2(7attrl(R) -co d S)) = nattr(7rattrl (R) -cond 7Fattr2 N)

The resulting expression can be more efficient because it joins smaller relations . Using the algebraic equivalence rules. Typically, the above algebraic rules are

used to transform queries expressed in relational algebra into expressions that are believed to be better than the original. The word "better" here should not be understood literally because the criteria used to guide the transformation are heuristic . In fact, in the next section we will see that following through with all the suggested

41 3

414

CHAPTER 11

An

Overview of Query Optimization

transformations might not yield the best result. Thus, the outcome of the algebraic transformation step should yield a set of candidate queries, which must then be further examined using cost-estimation techniques discussed in Section 11 .3 . Here is a typical heuristic algorithm for applying algebraic equivalences : 1. 2.

3.

4.

5. 6.

Use the cascading rule for selection to break up the conjunctions in selection conditions. The result is a single selection transformed into a sequence of selection operators, each of which can be applied separately. The previous step leads to greater freedom in pushing selections through joins and Cartesian products. We can now use the rules for commutativity of selection and for pushing selections through joins to propagate the selections as far inside the query as possible. Combine the Cartesian product operations with selections to form joins . As we saw in Chapter 10, there are efficient techniques for computing joins, but little can be done to improve the computation of a Cartesian product . Thus, converting these products into joins is a potential time and space saver. Use the associativity rules for joins and Cartesian products to rearrange the order of join operations . The purpose here is to come up with the order that produces the smallest intermediate relations . (Note that the size of the intermediate relations directly contributes to overhead, so reducing these sizes speeds up query processing .) Techniques for the estimation of the size of intermediate relations are discussed in Section 11.3. Use the rules for cascading projections and for pushing them into queries to propagate projections as far into the query as possible . This can potentially speed up the computation of joins by reducing the size of the operands . Identify the operations that can be processed in the same pass to save time writing the intermediate results to disk. This technique is called pipelining and is illustrated in Section 11 .3 .

Estimating the Cost of a Query Execution Plan

As defined earlier, a query execution plan is more or less a relational expression with concrete evaluation methods (access paths) attached to each operation . In this section, we take a closer look at this concept and discuss ways to evaluate the cost of a plan to compute query results. For discussion purposes, it is convenient to represent queries as trees. In a query tree each inner node is labeled with a relational operator and each leaf is labeled with a relation name. Unary relational operators have only one child ; binary operators have two . Figure 11 .2 presents four query trees corresponding to the following equivalent relational expressions, respectively: I Name (QDeptId='CS' ASemester='F1994'(PROFESSORt"aId-profIdTEACHING))

Name (O~DeptId='CS' (PROFESSOR) D Pipelining 6Deptld='CS' ^Semester=71994'

>< " Id=ProfId

PROFESSOR

--Block-nested loops

Id=ProfId

B+ tree index / on Dept ID

\

--Block-nested loops B+ tree index OnSemester

6Deptld ='CS'

6Semester= 71994'

PROFESSOR

TEACHING

TEACHING (a)

(b)

Name-~_

__ --7CName

---, Pipelining- ; - ; --6Semester ='F1994' `-

6Deptld ='CS'

-Index-nested loops using the hash index on Prof Id

PROFESSOR

TEACHING (c)

--Sort-merge join

PROFESSOR

TEACHING (d)

FIGURE 11 .3 Query execution plans for relational expressions (11 .2) through (11 .5).

is computed, the join will also be finished with no extra I/O . Thus, the total cost is 4 + 4 + 250 = 258 . Note that if aDeptId=' CS' (PROFESSOR) were too big to fit in the buffer, then it would not be feasible to compute the join without writing d"Semester=71994' (TEACHING) on disk. Indeed, scanning of C'DeptId='CS ' (PROFESSOR) and the initial scan of QSemester='F1994' (TEACHING) could still be done through pipelining but now gSemester='F1994' (TEACHING) would have to be scanned multiple times, once for each chunk of aDeptId='CS' (PROFESSOR) . To enable this, aSemester='F1994' (TEACHING) would have to be written out on disk after the first scan. TEACHING

Case c: selection pushed to the PROFESSOR relation . For the query tree in Fig-

ure 11 .2(c), a query execution plan can be constructed as follows . First, compute

419

420

CHAPTER 11

An Overview of Query

Optimization

using the B+ tree index on PROFESSOR .DeptId . As in case b, this prevents us from further using the hash index on PROFESSOR .Id in the subsequent join computation . Unlike case b, however, the relation TEACHING remains untouched, so we can still use index-nested loops (utilizing the index on TEACHING .ProfId) to compute the join . Other possibilities are block-nested loops and sort-merge join . Finally, we can pipe the output of the join to the selection operator QDeptId='F1994' and apply the projection during the same scan . The above query execution plan is depicted in Figure 11 .3(c). We now estimate the cost of this plan . O~DeptId='CS'(PROFESSOR)

There are 50 departments and 1000 professors. Thus, of this selection will contain about 20 tuples, or 4 pages. Since the the result index on PROFESSOR .DeptId is clustered, retrieval of these tuples should take about 4 I/Os. Index search will take an additional 2 I/Os for a 2-level B+ tree index. Because we intend to pipe the result of the selection into the join step that follows, there is no output cost. Index-nested loops join . We use the result of the previous selection and pipe it directly as input to the join. An important consideration here is that, because we chose index-nested loops utilizing the hash index on TEACHING .Prof Id, the result of the selection does not need to be saved on disk even if this result is large. Once selection on PROFESSOR produces enough tuples to fill the buffers, we can immediately join these tuples with the matching TEACHING tupleS, using the hash index, and output the joined rows . Then we can resume the selection and fill the buffers again . As before, each PROFESSOR tuple matches about 10 TEACHING tuples, which are going to be stored in one bucket . So, to find the matches for 20 tuples, we have to search the index 20 times at the cost of 1 .2 I/Os per search . Another 200 I/Os are needed to actually fetch the matching tuples from disk since the index is unclustered . All in all, this should take 1.2 * 20 + 200 = 224 I/Os. Combined cost . Since the result of the join is piped into the subsequent selection and projection, these latter operations do not cost anything in terms of I/O . Thus, the total cost is: 4 + 2 + 224 = 230 I/Os. 6DeptId=ACS' (PROFESSOR) .

Case d : selection pushed to the TEACHING relation . This case is similar to case

c, except that selection is now applied to TEACHING rather than to PROFESSOR . Since the indices on TEACHING are lost after applying the selection, we cannot use this relation in the inner loop of the index-nested loops join. However, we can use it in the outer loop of the index-nested loops join that utilizes the hash index on PROFESSOR .Id in the inner loop. This join can also be computed using blocknested loops and sort-merge . For this example, we select sort-merge . The subsequent application of selection and projection to the result can be done using pipelining, as in earlier examples. The resulting query plan is depicted in Figure 11 .3(d). Join: the sorting stage. The first step is to sort PROFESSOR on Id and (-Semester='F1994 (TEACHING)

On

Prof Id .

11 .3

"

To sort

Estimating the Cost of a Query Execution Plan

PROFESSOR, we must first PROFESSOR fits in 200 blocks, there

scan it and create sorted runs . Since will be F200/501 = 4 sorted runs. Thus, creation of the 4 sorted runs and storing them back on disk takes 2 x 200 = 400 I/Os. These runs can then be merged in just one more pass, but we postpone this merge and combine it with the merging stage of the sortmerge join. (See below.) " To sort QSemester='F1994'(TEACHING), we must first compute this relation . Since TEACHING holds information for about 4 semesters, the size of the selection is about 10,000/4 = 2500 tuples . The index is clustered, so the tuples are stored consecutively in the file in 250 blocks. The cost of the selection is therefore about 252 1/O operations, which includes 2 I/O operations for searching the index . The result of the selection is not written to disk. Instead, each time the 50-page buffer in main memory is filled, it is immediately sorted to create a run and then written to disk. In this way we create F250/50] = 5 sorted runs. This takes 250 I/Os . The 5 sorted runs of QSemester='F1994' (TEACHING) can be merged in one pass. However, instead of doing this separately, we combine this step with the merging step of the join (and the merging step of sorting PROFESSOR, which was postponed earlier). Join : the merging stage. Rather than merging the 4 sorted runs of PROFESSOR and the sorted runs of QSemester='F1994' (TEACHING) into two sorted relations, the runs are piped directly into the merge stage of the sort-merge join without writing the intermediate sorted results on disk. In this way, we combine the final merge steps in sorting these relations with the merge step of the join. The combined merge uses 4 input buffers for each of the sorted runs of PROFESSOR, 5 input buffers for each sorted run of (TSemester='F1994' (TEACHING), and one output buffer for the result of the join. The tuple p with the lowest value of p.Id among the heads of the 4 PROFESSOR's runs is selected and matched against the tuple t with the lowest value of t.ProfId among the tuples in the head of the 5 runs corresponding to QSemester='F1994' (TEACHING) . Ifp.Id=t .ProfId, t is removed from the corresponding run and the joined tuple is placed in the output buffer (we remove t and not p because the same PROFESSOR tuple can match several TEACHING tuples) . If p .Id< t .Prof Id, p is discarded ; otherwise, t is discarded . The process then repeats itself until all the input runs are exhausted . The combined merge can be done at a cost of reading the sorted runs of the two relations : 200 I/Os for the runs of PROFESSOR and 250 I/Os for aSemester='F1994' (TEACHING), respectively. The rest . The result of the join is then piped directly to the subsequent selection (on DeptId) and projection (on Name) operators . Since no intermediate results are written to disk, the I/O cost of these stages is zero. Combined cost. Summing up the costs of the individual operations, we get : 400 + 252 + 250 + 200 + 250 = 1352 .

421

422

CHAPTER 11

An Overview of Query Optimization

Tallying up the results, we can see that the best plan (among those considered-only a small portion of all possible plans) is plan (c) from Figure 11.3 . The interesting observation here is that this plan is better than plan (b), even though plan (b) joins smaller relations (because the selections are fully pushed) . The reason for this apparent paradox is the loss of an index when selection is pushed down to the TEACHING relation . This illustrates once again that the heuristic rules of Section 11 .2 are just that-heuristics . While they are likely to lead to better query execution plans, they must be evaluated within a more general cost model. And the winner is . . . .

Estimating the Size of the Output

The examples in Section 11 .3 illustrate the importance of accurate estimates of the output size of various relational expressions . The result of one expression serves as input to the next, and the input size has a direct effect on the cost of the computation. To give a better idea of how such estimates can be done, we present a simple technique based on the assumption that all values have an equal chance of occurring in a relation . The system catalog can contain the following set of statistics for each relation name R: The number of blocks occupied by the instance of table R The number of tuples in the instance of R Values(R .A) . The number of distinct values of attribute A in the instance of R MaxVal(R .A) . The maximum value of attribute A in the instance of R MinValIRA). The minimum value of attribute A in the instance of R. Blocks(R) . Tuples(R).

Earlier we introduced the notion of attribute weight and used it to estimate sizes of selection and equi-join. We now define a more general notion, the reduction factor. Consider the following general query: SELECT FROM WHERE

TargetList

RI VI, . . . , R,1 V,z

Condition

The reduction factor of this query is the ratio Blocks(the result set)

Blocks(R l) x . . . x Blocks(R,,)

At first, this definition seems cyclic : to find out the size of the result we need to know the reduction factor, but for this we need to know the size of the result set. However, the reduction factor can be estimated by induction on the query structure without knowing the size of the query result .

11 .4

Estimating the Size of the Output

We assume that reduction factors associated with different parts of the query are independent of each other. Thus, reduction(Query) = reduction(TargetList) x reduction(Condition)

where reduction(TargetList) is the size reduction due to projection of rows on the attributes in the SELECT clause and reduction(Condition) is the size reduction due to the elimination of rows that do not satisfy Condition. We also assume that if Condition = Condition, AND Condition2 then reduction (Condition) = reduction (Condition,) x reduction (Condition 2)

and if Condition = Condition, OR Condition2, then reduction(Condition) = min(1, reduction(Condition,) + reduction (Condition2))

Thus, the size reduction due to a complex condition can be estimated in terms of the size reduction due to the components of that condition . It remains to estimate the reduction factors due to projection in the SELECT clause and due to atomic conditions in the WHERE clause . We ignore nested subqueries and aggregates in this discussion . reduction(Ri .A = value) = Value(Ri .A>, where Ri is a relation name and A is an attribute in Ri. This estimate is based on the uniformity assumption-all values

are equally probable.

where Ri and Rj are relations relations and A and B are attributes . Using the uniformity assumption, we can decompose Ri (respectively, Rj) into subsets with the property that all elements of a subset have the same value of Ri.A (respectively, RIB) . If we assume that there are NR, tuples in Ri and NR. tuples in Rj and that every element of Ri matches an element of Ri, then we can conclude that the number of tuples that satisfy the condition is Values(Ri.A) x (NR./Values(Ri.A)) x (NR, /Values(Ri .B)) . In general, the reduction factor is calculated assuming (unrealistically) that each value in the smaller range always matches a value in the larger range. Assuming that RiA is the smaller range, and dividing this expression by NR, x NR., yields the above reduction factor. MaxVal(Ri .A)-value reduction(Ri .A > value) = MaxVal(R ; .A)-MinVal(Ri The reduction factor for Ri.A < .A) value is defined similarly. These estimates are also based on the assumption that all values are distributed uniformly. reduction(Ri.A = Ri .B) =

max(Values (R; .A>, Palues(R;.A)) I

number-of-attributes(TargetList) . Ei number-of-attributes(R ;)

Here, for simplicity, we assume that all attributes contribute equally to the tuple size . reduction (TargetList)

=

The weight of an attribute, which we used in Section 11.3, can now be estimated using the notion of reduction factor : weight(Ri .A) = Tuples(Ri) x reduction(Ri.A = value)

423

424

CHAPTER 11

An Overview of Query Optimization

For instance, the reduction factor of the query PROFESSOR .DeptId = value is 1/50, since there are 50 departments. As the number of tuples in PROFESSOR is 1000, the weight of the attribute DeptId in PROFESSOR is 20.

Choosing a Plan

In Section 11 .3 we looked at some query execution plans and showed how to estimate their cost. However, we did not discuss how to produce candidate plans. Unfortunately, the number of possible plans can be quite large, so we need an efficient way of choosing a relatively small, promising subset . We can then estimate the cost of each and choose the best. There are at least three major issues involved in this process : 1. 2. 3.

Choosing a logical plan Reducing the search space Choosing a heuristic search algorithm

We discuss each of these issues in turn. Choosing a logical plan . We defined a query execution plan as a query tree with the

relational implementation methods attached to each inner node. Thus, constructing such a plan involves two tasks : choosing a tree and choosing the implementation methods . Choosing the right tree is the more difficult job because of the number of trees involved, which, in turn, is caused by the fact that the binary associative and commutative operators, such as join, cross product, union, and the like, can be processed in so many different ways . We mentioned in Section 11.2 that the subtree of a query tree in which N relations are combined by such an operator can be formed in T(N) x N! ways. We want to deal with this kind of exponential complexity separately, so we first focus on logical query execution plans, which avoid the problem by grouping consecutive binary operators of the same kind into one node, as shown in Figure 11 .4. The different logical query execution plans are created from the "master plan" (as in Figure 11 .2(a) on page 415) by pushing selections and projections down and by combining selections and Cartesian products into joins . Only a few of all possible logical plans are retained for further consideration . Typically, the ones selected are fully pushed trees (because they are expected to produce the smallest intermediate results) plus all the "nearly" fully pushed trees. The reason for the latter should be clear from the discussion and examples in Section 11 .3 : Pushing selection or projection down to a leaf node of a query tree might eliminate the option of using an index in the join computation . According to this heuristic, the query tree in Figure 11.2(a) will not be selected since nothing has been pushed . The remaining trees include the one in Figure 11 .3(c), which has the least estimated cost and which is superior to the fully pushed query plan in Figure 113(b) . In this example, all joins are binary, so the transformation shown in Figure 11 .4 does not pertain .

11 .5

A

B

Choosing a Plan

C

FIGURE 11 .4 Transforming a query tree into a logical query execution plan . Reducing the search space . Having selected candidate logical query execution

plans, the query optimizer must decide how to evaluate the expressions that involve the commutative and associative operators . For instance, Figure 11 .5 shows several alternative but equivalent ways of converting a commutative and associative node of a logical plan that combines multiple relations (a) into query trees (b), (c), and (d) . The space of all possible equivalent query (sub)trees that correspond to a node in a logical query plan is two-dimensional . First, we must choose the desired shape of the tree (by ignoring the labels on the nodes) . For instance, the trees in Fig ure 11 .5 have different shapes, with (d) being the simplest. Trees of such a shape are called left-deep query trees . A tree shape corresponds to a particular parenthesizing of a relational subexpression that involves an associative and commutative operator. Thus, the logical query execution plan in Figure 11 .5 (a) corresponds to the expression A oa B oa C oa D, while the query trees (b), (c), and (d) correspond to the expressions (A ~>a B) >a (C >a D), A >a ((B >a C) >a D), and ((A >a B) >a C) >a D, respectively. A left-deep query tree always corresponds to an algebraic expression of the form ( . . . ((E;1 ca E,2) oa E, 3 ) >a . . .) >a EL S, . Query optimizers usually settle on one particular shape for the query tree : leftdeep. This is because, even with a fixed tree shape, query optimizers have plenty of work to do. Indeed, given the left-deep tree of Figure 11 .5(d), there are still 4! possible ways to order the joins . For instance, ((B >< D) oa C) >< A is another ordering of the join of Figure 11.5(d) that leads to a different left-deep query execution plan . So, if computing cost estimates for 4! query execution plans does not sound like a lot, think of what it would take to estimate the cost of 10! or 12! or 16! plans. Incidentally, all commercial query optimizers give up at around 16 joins . Apart from the general need to reduce the search space, there is another good reason to choose left-deep trees over the trees of the form Figure 11 .5(b) : pipelining. For instance, in Figure 11.5 (d) we can first compute A >a B and pipe the result to the next join with C. The result of this second join can also be piped up the tree

425

426

CHAPTER

11

An Overview of Query Optimization

(a) Logical Query Plan

(c) Another Equivalent Query Tree FIGURE 11 .5

(b) An Equivalent Query Tree

(d) Yet Another Equivalent Tree:

Left-deep Query Tree

Logical plan and three equivalent query trees .

without materializing the intermediate relation on disk. The ability to do this is very important for large relations because the intermediate output from a join can be very large. For instance, if the size of the relations A, B, and C is 1000 pages, the intermediate relation can reach 109 pages just to shrink back to a few pages after joining with D. The overhead of storing such intermediate results on disk can be huge. Note that the tree of Figure 11 .5(b) does not lend itself to pipelining. For example, an attempt to pipe the result of A oa B and the result of C >< D to a module, M, that will join the two does not work. A row, t, of A >< B must be compared to every row of C oa D in order to compute the total join . That means that when t arrives, M must have received and stored all of C oa D, which is exactly what we are trying to avoid . The alternative of storing one piece of C >a D at a time is unsatisfactory since

11 .5

Choosing a Plan

FIGURE 11 .6 Heuristic search of the query execution plan space . Input: A logical plan El o< . . . o< EN Output: A "good" left-deep plan (. . . ((Eil

E

i)

o< . . .) >