11. Table Operations Implementation

Realization of DBS 11. Table Operations – Implementation Theo Härder www.haerder.de Goals - Systematic development of relational processing concept...
Author: Oswin Arnold
0 downloads 0 Views 463KB Size
Realization of DBS

11. Table Operations – Implementation Theo Härder www.haerder.de

Goals

- Systematic development of relational processing concepts for a single table or for several tables - Realization of plan operators

Main reference: Theo Härder, Erhard Rahm: Datenbanksysteme – Konzepte und Techniken der Implementierung, Springer, 2001, Chapter 11. Goetz Graefe: Query Evaluation Techniques for Large Databases, ACM Computing Surveys 25:2, June 1993, pp. 73-170. Realization of Database Systems – SS 2011 © 2011 AG DBIS

Realization of DBS

Table Operations - Implementation 

Operations of the relational algebra -

Table operations

Unary operations: Binary operations:

,  , , , , , –

Plan operators

Set operations



T

operations of the relational algebra. They are further transformed into access plans. So-called plan operators implement these logical operations



Plan operators on a single table



Selection



Operators across several tables



Join algorithms -

 © 2011 AG DBIS



R

 SQL queries contain logical expressions which can be mapped to the

Hash join

Distributed joins

 S

Nested-loops & sort/merge join

Joins on typespanning paths

Sort



Nested-loops join, Sort-merge join Hash join (classic hashing, simple hash join, hybrid hash join) Exploitation of type-crossing access paths Distributed join algorithms

Further binary operations (set operations) 11-2

Realization of DBS

Plan Operators on a Single Table 

Selection – general ways of evaluation • Direct access via a given TID, via a hash method or a one- resp. multidimensional index structure • Sequential search in a table • Search via an index structure (index table, bitlist) • Selection using several pointer lists where more than a single index structure can be exploited • Search via a multi-dimensional index structure

Table operations

Plan operators Nested-loops & sort/merge join Hash join

Joins on typespanning paths



Projection



Modification

Distributed joins

is typically performed in combination with sorting, selection, or join

• Updates are set-oriented in SQL, but restricted to a single table • INSERT, DELETE and UPDATE are directly mapped to the corresponding operations of the storage structures • “Automatic” execution of maintenance operations

Set operations

- to update access paths, - to guarantee clustering and reorganization etc.

• Provisions for logging and recovery etc. 11-3

© 2011 AG DBIS

Realization of DBS

Plan Operators for the Selection 

Use of Scan Operators •

Table operations Plan operators

• 

Plan operators 1. Table scan (relation scan) - Always possible - SCAN operator implements selection operation

Nested-loops & sort/merge join

2. Index scan - Selection of most cost-effective index - Specification of search range (start-, stop condition)

Hash join

3. k-d scan

Joins on typespanning paths

- Evaluation of multi-dimensional search criteria - Use of differing evaluation directions by navigation

4. TID algorithm

Distributed joins

-

Set operations



© 2011 AG DBIS

Definition of start- and stop condition Definition of simple search arguments

Evaluation of all “useable" useable index structures Location of TID lists of variable lengths Boolean connection of the lists Access to the records according to the hit list (result list)

Further plan operators in combination with selection • Sorting • Grouping (see sort operator) • Special operators e.g. in Data-Warehouse applications for grouping and aggregation (CUBE operator)

11-4

Realization of DBS

Operators Across Several Tables 

SQL allows complex queries across k tables • One-variable expressions: describe conditions for the selection of elements from a table • Two-variable expressions: describe conditions for the combination of elements from two tables • Typically, k-variable expressions are decomposed into one- and two-variable expressions i and d evaluated l t d by b corresponding di plan l operators t

Table operations

Plan operators Nested-loops & sort/merge join



Plan operators across several tables • General ways for the evaluation: - Nested iteration for each element of outer table To traversal of inner table Ti

Hash join

Joins on typespanning paths

• •

Distributed joins

-

Set operations

Merge method iterating traversals through T1, T2 • • •

-

© 2011 AG DBIS

Realization of DBS

O(N1 + N2) additional sort costs, if necessary important application: merging join

Hashing Partitioning of inner table Ti and partition-wise loading in HT in memory. “Probing” by outer table To or its resp. partitions using HT:  O(p · No + Ni)

11-5

Operators Across Several Tables (2) 

n-way joins • • • • •

Table operations Plan operators

Nested-loops & sort/merge join Hash join

Joins on typespanning paths

O(No · Ni + No) important application: nested-loops join



2

Decomposition into n-1 two-way joins Number of possible join sequences is dependent on the join attributes chosen Maximal n! different sequences possible Use of pipelining techniques O i l evaluation Optimal l i sequence dependent d d on - Plan operators - “Fitting” sort orders for join attributes - Size of operands etc.

Some join sequences using two-way joins (n=5) result

result

Distributed joins

result T5

Set operations

T2

T4

T5

T3 T1

T2

left-deep tree  © 2011 AG DBIS

2.

T4 T5

T1

T2 T3

T4

bushy tree

T3

T1

right-deep tree

Analogous proceeding in case of set operations Practicality test (Guy Lohman test for join techniques): Does a new technique apply to joining three inputs without interrupting data flow between the join operators?

11-6

Realization of DBS

Plan Operators for the Join 

Join • • • •

Table operations

Plan operators



Nested-loops & sort/merge join

Record-type-spanning operation: usually very expensive Frequent use: important optimization candidate Typical application: equi-join General Θ-join infrequent

Implementation p of the jjoin operation p

can process, at the same time, selections (and projections) on the participating tables R and S

SELECT FROM WHERE

Hash join

Joins on typespanning paths

• •

* R, S R.JA Θ S.JA AND PR AND PS

JA: join attribute PR and PS: predicates defined on selection attributes (SA) of R and S

Distributed joins



Possible access p paths

Set operations

• Scans over R and S

(always)

• Scans over IR(JA), IS(JA)  deliver sort sequence according to JA

(if present)

• Scans over IR(SA), IS(SA)  if necessary, fast selection for PR and PS

(if present)

• Scans over other index structures  if necessary, faster location of all records

(if present)

© 2011 AG DBIS

Realization of DBS

Nested-Loops Join 

 Plan operators

Nested-loops & sort/merge join

Joins on typespanning paths

Records in R and S are not ordered according to join attributes Index structures IR(JA) and IS(JA) do not exist

Algorithm for Θ-join

Scan over S, for each record s, if PS: scan over R, R for each record r, if PR AND (r.JA Θ s.JA): execute join, i.e., write combined record (r, s) into the result set.



Complexity: O(N*M) Nested-loops join using index access



Nested-block join



Distributed joins

Set operations

© 2011 AG DBIS

Assumptions • •

Table operations

Hash join

11-7

Scan over S, for each record s, if PS: determine via access to IR(JA) all TIDs for records satisfying r.JA = s.JA, for each TID: fetch record r, if PR: write combined b d record d (r, ( s ) into the h result l set. Scan over S, for each page (resp. set of contiguous pages) of S: scan over R, for each page (resp. set of contiguous pages) of R: for each record s of the S-page, if PS: for each record r of the R-page, if PR AND (r.JA Θ s.JA): write combined record (r, s) into the result set.

11-8

Realization of DBS

Sort-Merge Join 

Algorithm consists of 2 phases • Phase 1:

Table operations

• Phase 2:

Plan operators Nested-loops & sort/merge join

Sorting of R and S w.r.t R(JA) and S(JA) (if not already present); in doing so, early elimination of records not needed ( PR, PS) Iterating scans over sorted R- and S-records where join is performed in case of r.JA = s.JA



Complexity: O(N log N)



Special case If either IR(JA) and IS(JA) or GAPS over R(JA) and S(JA) (join index) is present:

Hash join

 exploitation of index structures on join attributes Joins on typespanning paths

Iterating scans over IR(JA) and IS(JA): for each with two keys from IR(JA) and IS(JA), if r.JA = s.JA: fetch the records using the related TIDs, if PR and PS: write combined record (r, (r s) into the result set

Distributed joins

Set operations

11-9

© 2011 AG DBIS

Realization of DBS

Hash Join 

• Step 1: Partitioned read of (smaller) table R and construction of a hash table using hH(r(JA)) w.r.t. values of R(JA) of partitions Ri (1  i  p): each partition fits into the available memory and each record satisfies PR

Table operations Plan operators

• Step 2: Probing for records of S using PS; if successful, execution of join • Step 3: Repeat steps 1 and 2 as long as R is exhausted

Nested-loops & sort/merge join Hash join

Simplest case (classic hashing)



Construction of hash tables and probing Scan over R; building hash tables Hi (1  i  p) one at a time in memory

Joins on typespanning paths

R

H1 Scan over S with probing of H1

S

Distributed joins

... Set operations

R S  

© 2011 AG DBIS

Hp

Scan over S with probing of HP

Complexity: O(p · N) Special case R fits into memory: one partition (p = 1)  a single scan over S is sufficient!

11-10

Realization of DBS

Hash Join (2) 

Table operations

Partitioning of R with hp(r(JA)) #records / JA-value

Plan operators Nested-loops & sort/merge join

0

100

#records / JA’-value

Hash join

JA

h p (r(JA))

Joins on typespanning paths Distributed joins

0.33

0 R1

Set operations

0.66 R2

JA’

1 R3

11-11

© 2011 AG DBIS

Realization of DBS

Hash Join (3) 

Table operations Plan operators

Partitioning • Partitioning of R in subsets R1, R2, ..., Rp: a record r of R is in Ri, if h(r) is in Hi R

Nested-loops & sort/merge join Hash join

Joins on typespanning paths

Distributed joins

... H1

H2

Hp

 Why is this partitioning a critical operation?

Set operations

Which auxiliary operations may be required? Is the use of a hash function needed for partitioning? • Table S is partitioned with same function hP while evaluating PS © 2011 AG DBIS

11-12

Realization of DBS

Hash Join (4)  

Table operations

Variants of hash join are primarily distinguished by the kind of partitioning Partitioning technique in case of simple hash join shown for construction and probing of H1 R

step 1:

Plan operators

H1

1. iteration

Nested-loops & sort/merge join

Rrest Srest

Hash join

step 2:

Joins on typespanning paths Distributed joins



S

Simple hash join • Step p 1: Execute scan on R ((smaller table), ), evaluate PR and apply pp y hash function hP to each qualified record r. Is hP(r(JA)) in the chosen range, insert record into H1. Otherwise, write r in an output buffer for a file Rrest for “pretermitted” r-records.

Set operations

• Step 2: Execute scan on S, evaluate PS and apply hash function hP to each qualified record s. Is hP(s(JA)) in the chosen range, search a join counterpart (probing) in H1. If successful, form a join record and put it to the result. Otherwise, write s to an output buffer for a file Srest for “pretermitted” s-records. • Step 3: Repeat step 1 and 2 using the so far “pretermitted” records on Hi as long as Rrest is exhausted. Here, evaluation of PR and PS is not required anymore. 11-13

© 2011 AG DBIS

Realization of DBS

Hash Join (5) 

Table operations Plan operators

Grace join (grace join) • Partitioning of R and S takes place before join starts • Partitions Ri and Si are stored in temporary files on disk • Construction of Hi (having M pages) in memory with Ri and probing with Si

Nested-loops & sort/merge join

R1

Hash join

S1

Joins on typespanning paths

H1

Scan over S1 with probing of H1 ...

RP

HP

Distributed joins

SP Set operations

Scan over SP with probing of HP

 What is the minimal memory size required?

© 2011 AG DBIS

11-14

Realization of DBS

Hash Join (6) Hybrid hash join



• Optimization such that construction and probing of H1 is done in parallel to partitioning

Table operations

Plan operators

Scan

1)

R

Nested-loops & sort/merge join

R2

Hash join

a)

memory

Joins on typespanning paths

b)

Distributed joins

R3

RP

R1 constructed in H1

memory area: 1 page each S2

immediate probing of S1-records

S

S3

SP

Scan

Set operations

2)

H2

R2 S2

as in case of Grace join

Scan ...

3)

11-15

© 2011 AG DBIS

Realization of DBS

Hash Join - Example 

# r ec o r d s / J A - v a lu e

Partitioning a) Partitioning of R with hP(r(JA))

Table operations

JA 0

100

# r ec o r d s / J A ‘ - v a lu e

Plan operators

Nested-loops & sort/merge join

h p ( r (JA ) )

JA‘ 0 .3 3

0

b) Partitioning of S with hP(s(JA))

R1

0 .6 6 R2

1 R3

Hash join

Joins on typespanning paths



II. Join

1)

R1 S1

Distributed joins

Set operations

2) R2

H1

in memory with hH(r(JA)) JA’: 0.0 – 0.33 read, probing with of hH(s(JA))

JA’: 0.0 – 0.33

H2 JA’: 0.34 – 0.66

S2 3)

R3 S3

© 2011 AG DBIS

JA’: 0.34 – 0.66 H3 JA’: 0.67 – 1.0 JA’: 0.67 – 1.0

11-16

Realization of DBS

Use of Type-Spanning Access Paths 

Join via link structures •

Table operations

Use of hierarchical access paths for equi-join

Scan over R (Owner table), for each record r, if PR: Scan over related link structure LR-S(JA), for each record ss, if PS: write combined record (r, s ) into the result set.

Plan operators Nested-loops & sort/merge join Hash join



Further methods • Join indexes which are built for certain Θ-joins

Joins on typespanning paths Distributed joins

Set operations

VIR:

VIS:

R

S

R

S

TIDr2 TIDr1 TIDr2 TIDr2

TIDs4 TIDs3 TIDs2 TIDs6

TIDr1 TIDr2 TIDr2 TIDr2

TIDs3 TIDs2 TIDs4 TIDs6

Logical view

Index for TIDR

R

S

TIDs2 TIDs3 TIDs4 TIDs6

TIDr2 TIDr1 TIDr2 TIDr2

Index for TIDS

11-17

© 2011 AG DBIS

Realization of DBS

Use of Type-Spanning Access Paths (2) • Use of generalized access path structures (GAPS)

Table operations

K53 Plan operators

Nested-loops & sort/merge join

K25

K36

K47

K58

K78

K88

Hash join

...

Joins on typespanning paths

TIDs for Dept

Distributed joins

... TIDs for Mgr

  . . . K55 1 3 1 4 TID TID TID TID TID TID TID TID TID . . . 

Set operations

PRIOR

© 2011 AG DBIS

...

NEXT

TIDs for Emp

TIDs for Equipment

optional reference to overflow page

11-18

Realization of DBS

Join Algorithms - Comparison input stream 2 e21 e22 e23

inputt stream 1

Table operations

Plan operators Nested-loops & sort/merge join

e21 e22 e23

e11 e12

e11

e13 ...

e13 ...

e13 ...

e12

Hash partitions

Hash join

Joins on typespanning paths

e21 e22 e23

e11 e12

(a) Nested-loops join

(b) Merge join

element comparison 

Distributed joins 

Set operations



(c) Hash join

successful element comparison

Nested-loops join is always applicable, however, scanning of complete search space has to be taken into account. Merge join needs lowest search costs, requires, however, sorted input streams. Index structures on both join attributes satisfy this prerequisite. Otherwise, explicit sorting of both tables w.r.t. join attributes reduces cost advantage substantially. Nevertheless, sort-merge join can own additional advantages, if the result is required in sorted sequence and sorting of the large result is more expensive than sorting of two small result sets. Hash join partitions search space. Fig. c assumes that the same hash function h is applied to tables R and S. The partition size of the (smaller) table is given by the available buffer size in memory. A reduction of the partition size, to approximate case b, causes higher preparation costs and is therefore not recommendable. 11-19

© 2011 AG DBIS

Realization of DBS

Join Algorithms in Distributed DBS 

Problem statement •

Table operations Plan operators

• 

Determination of evaluation strategy •

Nested-loops & sort/merge join Hash join





Distributed joins



-

Minimal number of messages

-

Very high transfer volumes

Request for every join value in the first table related records from the second table



-

Large number of messages

-

Only relevant records are considered

Trade-off solution: Semi-join resp. extensions such as Bit-vector join (hash filter join)

Semi-join • • •

Shipping of a list of JA values of R to node of S Determination of join counterparts in S and returning them to node of R Then join processing at node of R

Bit-vector join •

© 2011 AG DBIS

Send participating tables completely to a node and compute join locally (“ship whole”)

(“fetch as needed“)

Joins on typespanning paths

Set operations

Query in node K, which requires a join between (sub-)table R at node KR and (sub-)table S at node KS Determination of processing node: K, KR or KS



Similar to Semi-join, only shipping of a bit vector (Bloom Filter) created using a hash function 11-20 Returning a superset of join counterparts in S

Realization of DBS

Semi-Join and Bit-Vector Join Dept Dno 47 39 64

Frankfurt Table operations

Plan operators

Loc

Dno 69 28 75 47 47 44

Dno D 47 39 64 find join counterparts

Dept DNo Loc Mgr 47 39 64 create bit vector by hashing

Frankfurt

Distributed joins

Set operations

1

0

0

1

1

Munich

0

0

return projections of join counterpart records

Name

Address

Phone

Hans Anna

Name

Address

Phone

return the potential join candidates

0

0

0

1

1

0

0

0

hashing of Dno values to find potential join candidates

© 2011 AG DBIS

Dno 47

check + join

ship bit vector

1

Name Hans Anna

Emp

ship the whole JA column

Munich

Hash join

Realization of DBS

Dno 47 47

join

Nested-loops & sort/merge join

Joins on typespanning paths

Mgr

Emp Dno 69 28 75 47 91 44

Name

Address

Phone

11-21

Set Operations3 

Table operations

Which set operations are needed? S R A

Plan operators

B

R, S A, B, C

C

union-compatible input streams element sets

Nested-loops & sort/merge join Hash join

Joins on typespanning paths

Distributed joins

operation result

matching in all attributes

A

difference (R-S)

anti-semi-join (S, R)

B

intersection

join, semi-join (S, R)

C

difference (S-R)

anti-semi-join (R, S)

A, B A, C

Set operations

left-sided outer join anti-difference

B, C A, B, C 

© 2011 AG DBIS

3.

matching in one or several attributes

anti-join right-sided outer join

union

symmetrical outer join

Which algorithms can be used for these set operations? • What has to be compared at a time? • How can a relationship to the join algorithms be found?

Graefe, G.: Query evaluation techniques for large databases, ACM Computing Surveys 25:2, 1993, pp. 73-170

11-22

Realization of DBS

Set Operations (2) 

Binary matching operations • •

Table operations

• Plan operators Nested-loops & sort/merge join

• 

Same logical proceeding • • •

Hash join

Joins on typespanning paths Distributed joins



Three element sets are formed from R and S: A, B, C Elements in B fit together ! How can these three element sets be formed? - Using nested iteration - Using merge method - Using hash method

Unified realization concept • • •

Set operations

Solve the same task, in principle: “one-to-one matching operations” An input element contributes to the output dependent of its “match” with another input element Operations repeatedly require the same steps and, therefore, can be implemented using the same algorithms p are closelyy connected! Set- and jjoin operations

Comparison of join- vs. vs primary-key attributes Commonality: records are grouped on the basis of attribute values Some unary operations are possible with special measures - Grouping and sorting enable simple duplicate elimination - In case of aggregation, an attribute value per group is determined - In case of join, grouping of potential join counterparts is cost-effective (either in partitions or a sort order) - Using set operations, the element sets A, B, C can be found; at the same time, duplicate elimination is possible

© 2011 AG DBIS

Realization of DBS

Summary 





Distributed joins

 Set operations

Use of the same algorithm classes, in principle Variation of executing comparisons

Extensibility infrastructure in object-relational DBMS • •

© 2011 AG DBIS

Nested-loops join Sort-merge join Hash join And variations

Set operations • •



Nested iteration Merge method Hashing

Many options for processing of join operations • • • •

Joins on typespanning paths

Existing access path types require tailor-made operations and efficient mapping Combination of various access paths possible (TID algorithm)

General classes of evaluation methods for binary operations • • •

Nested-loops & sort/merge join Hash join

Selection operations • •

Table operations Plan operators

11-23

Creation of user-defined functions and operators Generalization: user-defined table operators with n input tables and m output tables

11-24

Suggest Documents