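Search Trees
[Figure: binary search tree with keys 6, 2, 9, 1, 4, 8, showing the comparisons <, >, = made along a search path to key 4]
Algorithm Theory, Hiroaki Kobayashi, 12/20/02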

Outline
Dictionary ADT
  Log file
  Binary search
  Lookup table
Binary search tree
  Search
  Insertion
  Deletion
  Performance
AVL Tree
  Definition
  Insertion
  Restructuring
  Deletion
  Performance
(2,4) Tree
  Multi-way search tree
  (2,4) tree: Definition, Search, Insertion, Deletion
Comparison of dictionary implementations

Dictionary Abstract Data Type (ADT)
The dictionary ADT models a searchable collection of key-element items
The main operations of a dictionary are searching, inserting, and deleting items
Multiple items with the same key are allowed
Applications:
  address book
  credit card authorization
  mapping host names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)
Dictionary ADT methods:
  findElement(k): if the dictionary has an item with key k, returns its element; else returns the special element NO_SUCH_KEY
  insertItem(k, o): inserts item (k, o) into the dictionary
  removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element; else returns the special element NO_SUCH_KEY
  size(), isEmpty()
  keys(), elements()
  findAllElements(k), removeAllElements(k)

Log File
A log file is a dictionary implemented by means of an unsorted sequence
  We store the items of the dictionary in a sequence (based on a doubly-linked list or a circular array), in arbitrary order
Performance:
  insertItem takes O(1) time since we can insert the new item at the beginning or at the end of the sequence
  findElement and removeElement take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key
The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)
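The log-file idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `LogFile`, the snake_case method names and the `NO_SUCH_KEY` sentinel object are hypothetical stand-ins for the ADT names on the slide, and a Python list plays the role of the unsorted sequence.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the slide's NO_SUCH_KEY


class LogFile:
    """Dictionary kept as an unsorted list of (key, element) pairs."""

    def __init__(self):
        self._items = []

    def insert_item(self, k, o):
        # Appending at the end needs no search: O(1) amortized.
        self._items.append((k, o))

    def find_element(self, k):
        # Linear scan: O(n) in the worst case (key absent).
        for key, elem in self._items:
            if key == k:
                return elem
        return NO_SUCH_KEY

    def remove_element(self, k):
        # Linear scan, then removal of the first match.
        for i, (key, elem) in enumerate(self._items):
            if key == k:
                del self._items[i]
                return elem
        return NO_SUCH_KEY


log = LogFile()
log.insert_item("alice", "09:00")
log.insert_item("bob", "09:05")
```

Insertion never looks at the existing items, which is why this layout suits workloads dominated by inserts.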

Ordered Dictionaries
If searching a dictionary is the most common operation, elements should be sorted by their keys.
Keys are assumed to come from a total order.
New operations:
  closestKeyBefore(k): return the key of the item with largest key less than or equal to k
  closestElemBefore(k): return the element for the item with largest key less than or equal to k
  closestKeyAfter(k): return the key of the item with smallest key greater than or equal to k
  closestElemAfter(k): return the element for the item with smallest key greater than or equal to k
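As a rough illustration of the two key-returning operations, the sketch below works on a plain sorted Python list using the standard `bisect` module. The function names and the choice of returning `None` when no such key exists are assumptions made for this example.

```python
import bisect


def closest_key_before(keys, k):
    """Largest key <= k in the sorted list `keys`, or None if there is none."""
    i = bisect.bisect_right(keys, k)  # index of the first key > k
    return keys[i - 1] if i > 0 else None


def closest_key_after(keys, k):
    """Smallest key >= k in the sorted list `keys`, or None if there is none."""
    i = bisect.bisect_left(keys, k)  # index of the first key >= k
    return keys[i] if i < len(keys) else None


keys = [1, 3, 4, 5, 7, 8, 9]
before = closest_key_before(keys, 6)
after = closest_key_after(keys, 6)
```

Both functions run in O(log n) time because `bisect` performs a binary search over the sorted keys.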

Lookup Table
A lookup table is a dictionary implemented by means of a sorted sequence
  We store the items of the dictionary in an array-based sequence, sorted by key
  We use an external comparator for the keys
Performance:
  findElement takes O(log n) time, using binary search
  insertItem takes O(n) time since in the worst case we have to shift n/2 items to make room for the new item
  removeElement takes O(n) time since in the worst case we have to shift n/2 items to compact the items after the removal
The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)

Binary Search
Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key
  similar to the high-low game
  at each step, the number of candidate items is halved
  terminates after a logarithmic number of steps
Example: findElement(7)
[Figure: the sorted array 0 1 3 4 5 7 8 9 11 14 16 18 19 drawn at each step with markers l (low), m (middle) and h (high); the search for 7 ends when l = m = h]
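The steps of the example can be reproduced with a short iterative sketch. The function name `find_index` and the recorded `(low, mid, high)` triples are illustrative additions; the array is the one from the example.

```python
def find_index(seq, k):
    """Binary search on a sorted list.

    Returns (index or None, list of (low, mid, high) triples visited).
    """
    low, high = 0, len(seq) - 1
    steps = []
    while low <= high:
        mid = (low + high) // 2
        steps.append((low, mid, high))
        if k == seq[mid]:
            return mid, steps
        if k < seq[mid]:
            high = mid - 1  # discard the upper half
        else:
            low = mid + 1   # discard the lower half
    return None, steps


table = [0, 1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
index, steps = find_index(table, 7)
# steps == [(0, 6, 12), (0, 2, 5), (3, 4, 5), (5, 5, 5)]
```

The last triple has low = mid = high, which corresponds to the final row of the example.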

Binary Search Tree
A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property:
  Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
External nodes do not store items (they are null, or references to a NULL object)
An inorder traversal of a binary search tree visits the keys in increasing order
[Figure: binary search tree with root 6, children 2 and 9, and keys 1, 4, 8 below them]

Search
To search for a key k, we trace a downward path starting at the root
The next node visited depends on the outcome of the comparison of k with the key of the current node
If we reach a leaf, the key is not found and we return NO_SUCH_KEY
Example: findElement(4)

Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))

[Figure: search path for key 4 in the tree with keys 6, 2, 9, 1, 4, 8: 4 < 6, 4 > 2, 4 = 4]
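A direct Python transcription of the search is sketched below. It assumes a hypothetical `Node` class and represents external nodes by `None`; the element strings are made up for the example, while the keys are those of the example tree.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the slide's NO_SUCH_KEY


class Node:
    """Internal node of a binary search tree; None plays the external node."""

    def __init__(self, key, elem, left=None, right=None):
        self.key, self.elem = key, elem
        self.left, self.right = left, right


def find_element(k, v):
    if v is None:                      # reached an external node
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    if k == v.key:
        return v.elem
    return find_element(k, v.right)    # k > v.key


root = Node(6, "six",
            Node(2, "two", Node(1, "one"), Node(4, "four")),
            Node(9, "nine", Node(8, "eight")))
found = find_element(4, root)
missing = find_element(5, root)
```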

Analysis of Binary Tree Searching
  time per level: O(1)
  height of tree T: h
  Total time: O(h)

Insertion
To perform operation insertItem(k, o), we search for key k
Assume k is not already in the tree, and let w be the leaf reached by the search
We insert k at node w and expand w into an internal node
Example: insert 5
[Figure: the tree with keys 6, 2, 9, 1, 4, 8 before and after inserting 5; the search follows 5 < 6, 5 > 2, 5 > 4 and reaches leaf w, which becomes the node storing 5]
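The insertion can be sketched as a recursive function that returns the (possibly new) root of each subtree. The `Node` class, the use of `None` for external nodes and the helper `inorder_keys` are assumptions made for this illustration; duplicate keys are simply sent to the right.

```python
class Node:
    """Internal node of a binary search tree; None plays the external node."""

    def __init__(self, key, elem):
        self.key, self.elem = key, elem
        self.left = self.right = None


def insert_item(v, k, o):
    """Insert (k, o) below v and return the root of the resulting subtree."""
    if v is None:                 # the leaf w reached by the search
        return Node(k, o)         # expand it into an internal node
    if k < v.key:
        v.left = insert_item(v.left, k, o)
    else:
        v.right = insert_item(v.right, k, o)
    return v


def inorder_keys(v):
    if v is None:
        return []
    return inorder_keys(v.left) + [v.key] + inorder_keys(v.right)


root = None
for key in [6, 2, 9, 1, 4, 8]:    # builds the tree of the example
    root = insert_item(root, key, str(key))
root = insert_item(root, 5, "5")  # 5 becomes the right child of 4
```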

Deletion
To perform operation removeElement(k), we search for key k
Assume key k is in the tree, and let v be the node storing k
If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
Example: remove 4
[Figure: the tree with keys 6, 2, 9, 1, 4, 8, 5 before and after removing 4; v is the node storing 4 and w is its leaf child]

… for h > 2, an AVL tree of height h with the fewest internal nodes contains the root node, one AVL subtree of height h − 1 and another of height h − 2. Writing n(h) for that minimum number of internal nodes: $n(h) = 1 + n(h-1) + n(h-2)$
Knowing $n(h-1) > n(h-2)$, we get $n(h) > 2\,n(h-2)$. So $n(h) > 2\,n(h-2)$, $n(h) > 4\,n(h-4)$, $n(h) > 8\,n(h-6)$, … and, by induction, $n(h) > 2^{i}\,n(h-2i)$
Solving at the base case, where $h - 2i = 2$ (so $i = h/2 - 1$) and $n(2) \ge 1$, we get: $n(h) > 2^{h/2-1}$
Taking logarithms: $h < 2\log_2 n(h) + 2$
Thus the height of an AVL tree is O(log n)
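The recurrence can be checked numerically. The sketch below assumes the base cases n(1) = 1 and n(2) = 2, which are not stated in the surviving text, and verifies the bounds $n(h) > 2^{h/2-1}$ and $h < 2\log_2 n(h) + 2$ for small heights. The function name is illustrative.

```python
import math


def min_internal_nodes(h):
    """n(h) = 1 + n(h-1) + n(h-2), with assumed base cases n(1)=1, n(2)=2."""
    if h == 1:
        return 1
    prev, cur = 1, 2              # n(1), n(2)
    for _ in range(h - 2):
        prev, cur = cur, 1 + cur + prev
    return cur


table = {h: min_internal_nodes(h) for h in range(1, 11)}
# table[3] == 4, table[4] == 7, table[5] == 12

bound_holds = all(n > 2 ** (h / 2 - 1) for h, n in table.items())
height_bound_holds = all(h < 2 * math.log2(n) + 2 for h, n in table.items())
```

The values grow like Fibonacci numbers, which is why the height stays logarithmic in the number of nodes.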

Insertion in an AVL Tree
Insertion is as in a binary search tree
Always done by expanding an external node.
Example:
[Figure: AVL tree with keys 44, 17, 78, 32, 50, 88, 48, 62 before insertion (balanced) and after inserting 54 at node w (unbalanced), with labels c=z at 78, a=y at 50 and b=x at 62]

Trinode Restructuring
let (a,b,c) be an inorder listing of x, y, z
perform the rotations needed to make b the topmost node of the three
  case 1: single rotation (a left rotation about a)
  case 2: double rotation (a right rotation about c, then a left rotation about a)
  (other two cases are symmetrical)
[Figure: case 1 with a=z, b=y, c=x and case 2 with a=z, b=x, c=y, each drawn before and after restructuring with subtrees T0, T1, T2, T3; afterwards b is on top with children a and c]
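One way to sketch the restructuring in Python is to name a, b, c and T0..T3 for each of the four configurations and then relink them in one step. The `Node` class and the signature `restructure(z, y, x)` are assumptions for this example; height bookkeeping and reattaching b to z's former parent are left out to keep the sketch short.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def restructure(z, y, x):
    """Relink x, its parent y and grandparent z so that b ends up on top.

    Returns b, the new root of the subtree that was rooted at z.
    """
    if y is z.right and x is y.right:      # case 1: single left rotation
        a, b, c = z, y, x
        t0, t1, t2, t3 = z.left, y.left, x.left, x.right
    elif y is z.left and x is y.left:      # mirror of case 1
        a, b, c = x, y, z
        t0, t1, t2, t3 = x.left, x.right, y.right, z.right
    elif y is z.right and x is y.left:     # case 2: double rotation
        a, b, c = z, x, y
        t0, t1, t2, t3 = z.left, x.left, x.right, y.right
    else:                                  # mirror of case 2
        a, b, c = y, x, z
        t0, t1, t2, t3 = y.left, x.left, x.right, z.right
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    b.left, b.right = a, c
    return b


# Subtree rooted at 78 after inserting 54: z = 78, y = 50, x = 62.
x = Node(62, Node(54))
y = Node(50, Node(48), x)
z = Node(78, y, Node(88))
b = restructure(z, y, x)
```

Here (a, b, c) = (50, 62, 78), so 62 becomes the top node with children 50 and 78.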

Insertion Example, continued
[Figure: the tree after inserting 54, unbalanced at z = 78 with y = 50, x = 62 and subtrees T0..T3, and the balanced tree after restructuring, where 62 is the root of the restructured subtree; node heights are annotated]

Restructuring (as Single Rotations)
Single Rotations:
[Figure: single rotation with a=z, b=y, c=x and its mirror image with c=z, b=y, a=x, each drawn before and after with subtrees T0..T3 and depths h, h+1, h+2 from the root; b=y becomes the topmost node]

Restructuring (as Double Rotations)
Double Rotations:
[Figure: double rotation with a=z, c=y, b=x and its mirror image with c=z, a=y, b=x, each drawn before and after with subtrees T0..T3 and depths h, h+1, h+2 from the root; b=x becomes the topmost node]

Removal in an AVL Tree
Removal begins as in a binary search tree, which means the node removed will become an empty external node. Its parent, w, may cause an imbalance.
Example:
[Figure: AVL tree with keys 44, 17, 32, 62, 50, 78, 48, 54, 88 before deletion of 32 (balanced) and after deletion (unbalanced)]

Rebalancing after a Removal
Let z be the first unbalanced node encountered while travelling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height.
We perform restructure(x) to restore balance at z.
As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached
[Figure: the tree after deleting 32, with w = 17, a=z = 44, b=y = 62, c=x = 78, and the tree after restructuring, where 62 is the root with children 44 and 78]

Running Times for AVL Trees
a single restructure is O(1)
  using a linked-structure binary tree
find is O(log n)
  height of tree is O(log n), no restructures needed
insert is O(log n)
  initial find is O(log n)
  restructuring up the tree, maintaining heights is O(log n)
remove is O(log n)
  initial find is O(log n)
  restructuring up the tree, maintaining heights is O(log n)
[Figure: AVL tree T of height O(log n) with O(1) time per level in the down phase and in the up phase; worst-case time O(log n)]

(2,4) Trees: Bounded-Depth Search Trees
[Figure: (2,4) tree with root 9 and children (2 5 7) and (10 14)]

Multi-Way Search Tree
A multi-way search tree is an ordered tree such that
  Each internal node has at least two children and stores d − 1 key-element items (k_i, o_i), where d is the number of children
  For a node with children v_1 v_2 … v_d storing keys k_1 k_2 … k_{d−1}
    keys in the subtree of v_1 are less than k_1
    keys in the subtree of v_i are between k_{i−1} and k_i (i = 2, …, d − 1)
    keys in the subtree of v_d are greater than k_{d−1}
  The leaves store no items and serve as placeholders
[Figure: multi-way search tree with root (11 24), children (2 6 8), (15) and (27 32), and node (30) below (27 32)]

Multi-Way Inorder Traversal
We can extend the notion of inorder traversal from binary trees to multi-way search trees
Namely, we visit item (k_i, o_i) of node v between the recursive traversals of the subtrees of v rooted at children v_i and v_{i+1}
An inorder traversal of a multi-way search tree visits the keys in increasing order
[Figure: the tree with root (11 24), children (2 6 8), (15), (27 32) and node (30), with each key and leaf numbered in the order the traversal visits it]
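The traversal rule can be sketched as follows. The class name `MWNode` is hypothetical, and the leaf placeholders are represented by `None`; the tree built at the end is the one from the multi-way search tree example.

```python
class MWNode:
    """Multi-way node with d - 1 sorted keys and d children (None = leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or [None] * (len(keys) + 1)


def inorder(v):
    """Return the keys below v in inorder."""
    if v is None:
        return []
    out = []
    for i, k in enumerate(v.keys):
        out += inorder(v.children[i])   # subtree of v_i comes before k_i
        out.append(k)
    out += inorder(v.children[-1])      # subtree of the last child v_d
    return out


tree = MWNode([11, 24], [
    MWNode([2, 6, 8]),
    MWNode([15]),
    MWNode([27, 32], [None, MWNode([30]), None]),
])
keys_in_order = inorder(tree)
```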

Multi-Way Searching
Similar to search in a binary search tree
At each internal node with children v_1 v_2 … v_d and keys k_1 k_2 … k_{d−1}
  k = k_i (i = 1, …, d − 1): the search terminates successfully
  k < k_1: we continue the search in child v_1
  k_{i−1} < k < k_i (i = 2, …, d − 1): we continue the search in child v_i
  k > k_{d−1}: we continue the search in child v_d
Reaching an external node terminates the search unsuccessfully
Example: search for 30
[Figure: search path for 30 in the tree with root (11 24), children (2 6 8), (15), (27 32) and node (30)]
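The case analysis above maps onto a binary search within each node. The following sketch assumes a hypothetical `MWNode` class with `None` for external nodes and uses `bisect_left` to pick either the matching key or the child to descend into; it also records the nodes visited.

```python
import bisect


class MWNode:
    """Multi-way node with d - 1 sorted keys and d children (None = leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or [None] * (len(keys) + 1)


def search(v, k):
    """Return (found, path), where path lists the key lists of visited nodes."""
    path = []
    while v is not None:
        path.append(v.keys)
        i = bisect.bisect_left(v.keys, k)    # first index with keys[i] >= k
        if i < len(v.keys) and v.keys[i] == k:
            return True, path                # k = k_i: success
        v = v.children[i]                    # continue in the i-th child
    return False, path                       # reached an external node


tree = MWNode([11, 24], [
    MWNode([2, 6, 8]),
    MWNode([15]),
    MWNode([27, 32], [None, MWNode([30]), None]),
])
found, path = search(tree, 30)
```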

(2,4) Tree
A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties
  Node-Size Property: every internal node has at most four children
  Depth Property: all the external nodes have the same depth
Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node
[Figure: (2,4) tree with root (10 15 24) and children (2 8), (12), (18), (27 32); node (30) also appears in the source figure]

Height of a (2,4) Tree
Theorem: A (2,4) tree storing n items has height O(log n)
Proof:
  Let h be the height of a (2,4) tree with n items
  Since there are at least $2^i$ items at depth i = 0, …, h − 1 and no items at depth h, we have $n \ge 1 + 2 + 4 + \dots + 2^{h-1} = 2^h - 1$
  Thus, $h \le \log_2 (n + 1)$
Searching in a (2,4) tree with n items takes O(log n) time

  depth   items
  0       1
  1       2
  h−1     $2^{h-1}$
  h       0

Insertion
We insert a new item (k, o) at the parent v of the leaf reached by searching for k
  We preserve the depth property but
  We may cause an overflow (i.e., node v may become a 5-node)
Example: inserting key 30 causes an overflow
[Figure: before the insertion, root (10 15 24) with children (2 8), (12), (18) and v = (27 32 35); after it, v = (27 30 32 35), marked as an overflow]

Overflow and Split
We handle an overflow at a 5-node v with a split operation:
  let v_1 … v_5 be the children of v and k_1 … k_4 be the keys of v
  node v is replaced by nodes v' and v"
    v' is a 3-node with keys k_1 k_2 and children v_1 v_2 v_3
    v" is a 2-node with key k_4 and children v_4 v_5
  key k_3 is inserted into the parent u of v (a new root may be created)
The overflow may propagate to the parent node u
[Figure: before the split, u = (15 24) with children (12), (18) and v = (27 30 32 35) with children v1 … v5; after it, u = (15 24 32) with children (12), (18), v' = (27 30) with children v1 v2 v3, and v" = (35) with children v4 v5]

Exercise
Show the sequence of insertions into a (2,4) tree for the input {4, 6, 12, 15, 3, 5, 10}

Cascade Split
As a consequence of a split operation on node v, a new overflow may occur at the parent u of v.
[Figure: inserting 17 into the tree with root (5 10 12) and children (3 4), (6 8), (11), (13 14 15). The node (13 14 15 17) overflows and splits, sending 15 up; the root (5 10 12 15) then overflows and splits, sending 12 up to a new root. Final tree: root (12), children (5 10) and (15), with (3 4), (6 8), (11) below (5 10) and (13 14), (17) below (15)]

Analysis of Insertion
Algorithm insertItem(k, o)
  1. We search for key k to locate the insertion node v
  2. We add the new item (k, o) at node v
  3. while overflow(v)
       if isRoot(v)
         create a new empty root above v
       v ← split(v)

Let T be a (2,4) tree with n items
  Tree T has O(log n) height
  Step 1 takes O(log n) time because we visit O(log n) nodes
  Step 2 takes O(1) time
  Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
Thus, an insertion in a (2,4) tree takes O(log n) time
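The three steps of insertItem, including the split, can be sketched in Python as below. This is a simplified illustration under assumptions: the class names `Node24` and `Tree24` are hypothetical, only keys are stored (no elements), keys are assumed distinct, and external nodes are represented implicitly by an empty `children` list at the bottom level.

```python
import bisect


class Node24:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty at the bottom level
        self.parent = None


class Tree24:
    def __init__(self):
        self.root = Node24()

    def insert(self, k):
        v = self.root
        while v.children:                         # step 1: find insertion node
            v = v.children[bisect.bisect_right(v.keys, k)]
        bisect.insort(v.keys, k)                  # step 2: add the key at v
        while len(v.keys) > 3:                    # step 3: v is a 5-node
            v = self._split(v)

    def _split(self, v):
        k3 = v.keys[2]
        left = Node24(v.keys[:2], v.children[:3])   # v': keys k1 k2
        right = Node24(v.keys[3:], v.children[3:])  # v": key k4
        for node in (left, right):
            for child in node.children:
                child.parent = node
        u = v.parent
        if u is None:                             # v was the root
            u = self.root = Node24()
            u.children = [left, right]
        else:
            i = u.children.index(v)
            u.children[i:i + 1] = [left, right]
        left.parent = right.parent = u
        bisect.insort(u.keys, k3)                 # k3 moves up to the parent
        return u                                  # u may overflow in turn


t = Tree24()
for key in [10, 20, 30, 40, 25, 22]:
    t.insert(key)
# Inserting 40 splits the root; inserting 22 splits the node (10 20 22 25).
```

Returning the parent from `_split` lets the `while` loop in `insert` follow an overflow upward, which mirrors step 3 of the algorithm.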

Deletion
We reduce deletion of an item to the case where the item is at a node with leaf children
Otherwise, we replace the item with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter item
Example: to delete key 24, we replace it with 27 (inorder successor)
[Figure: before the deletion, root (10 15 24) with children (2 8), (12), (18), (27 32 35); after it, root (10 15 27) with children (2 8), (12), (18), (32 35)]

Underflow and Fusion
Deleting an item from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
To handle an underflow at node v with parent u, we consider two cases
Case 1: the adjacent siblings of v are 2-nodes
  Fusion operation: we merge v with an adjacent sibling w and move an item from u to the merged node v'
  After a fusion, the underflow may propagate to the parent u
[Figure: before the fusion, u = (9 14) with children (2 5 7), w = (10) and the empty node v; after it, u = (9) with children (2 5 7) and v' = (10 14)]

Underflow and Transfer
To handle an underflow at node v with parent u, we consider two cases
Case 2: an adjacent sibling w of v is a 3-node or a 4-node
  Transfer operation:
    1. we move a child of w to v
    2. we move an item from u to v
    3. we move an item from w to u
  After a transfer, no underflow occurs
[Figure: before the transfer, u = (4 9) with children (2), w = (6 8) and the empty node v; after it, u = (4 8) with children (2), w = (6) and v = (9)]

Exercise
Draw the sequence of removals {4, 12, 13} from the (2,4) tree below.
[Figure: (2,4) tree with root (12), children (5 10) and (15), with (4), (6 8), (11) below (5 10) and (13 14), (17) below (15)]

Propagating Sequence of Fusion
Draw the sequence of fusions in the case of removal of 14, 11
[Figure: (2,4) tree for the exercise; keys legible in the source: 6, 5, 8 10, 15, 14, 17; label "Remove 4"]

Analysis of Deletion
Let T be a (2,4) tree with n items
  Tree T has O(log n) height
In a deletion operation
  We visit O(log n) nodes to locate the node from which to delete the item
  We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
  Each fusion and transfer takes O(1) time
Thus, deleting an item from a (2,4) tree takes O(log n) time

Total Order Relation
Keys in a priority queue can be arbitrary objects on which an order is defined
Two distinct items in a priority queue can have the same key
Mathematical concept of total order relation ≤
  Reflexive property: x ≤ x
  Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
  Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z
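For a finite set of keys, the listed properties can be checked by brute force. The sketch below is illustrative only: the function name is made up, and it additionally checks that every pair is comparable, which is what makes the order total and is an addition not spelled out in the list above.

```python
from itertools import product


def is_total_order(items, leq):
    """Check the order properties of `leq` on a finite collection."""
    items = list(items)
    reflexive = all(leq(x, x) for x in items)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x, y in product(items, repeat=2))
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x, y, z in product(items, repeat=3))
    # Assumed extra condition: any two items can be compared.
    comparable = all(leq(x, y) or leq(y, x)
                     for x, y in product(items, repeat=2))
    return reflexive and antisymmetric and transitive and comparable


usual = is_total_order(range(5), lambda a, b: a <= b)
divides = is_total_order([1, 2, 3, 6], lambda a, b: b % a == 0)
```

Divisibility on {1, 2, 3, 6} is reflexive, antisymmetric and transitive, but 2 and 3 are not comparable, so it fails the check.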

Tree Terminology
Root: node without parent (A)
Internal node: node with at least one child (A, B, C, F)
External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
Ancestors of a node: parent, grandparent, grand-grandparent, etc.
Depth of a node: number of ancestors
Height of a tree: maximum depth of any node (3)
Descendant of a node: child, grandchild, grand-grandchild, etc.
Subtree: tree consisting of a node and its descendants
[Figure: tree with nodes A to K rooted at A, with one subtree outlined]

Tree ADT
We use positions to abstract nodes
Generic methods:
  integer size()
  boolean isEmpty()
  objectIterator elements()
  positionIterator positions()
Accessor methods:
  position root()
  position parent(p)
  positionIterator children(p)
Query methods:
  boolean isInternal(p)
  boolean isExternal(p)
  boolean isRoot(p)
Update methods:
  swapElements(p, q)
  object replaceElement(p, o)
Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal
A traversal visits the nodes of a tree in a systematic manner
In a preorder traversal, a node is visited before its descendants
Application: print a structured document

Algorithm preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)

[Figure: document tree numbered in preorder: 1 Make Money Fast!, 2 1. Motivations, 3 1.1 Greed, 4 1.2 Avidity, 5 2. Methods, 6 2.1 Stock Fraud, 7 2.2 Ponzi Scheme, 8 2.3 Bank Robbery, 9 References]
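The preorder algorithm can be tried on the document example with a small sketch. Representing each node as a `(title, children)` tuple is an assumption made for brevity.

```python
def pre_order(node, visit):
    """Visit a node first, then each of its children in order."""
    title, children = node
    visit(title)
    for child in children:
        pre_order(child, visit)


doc = ("Make Money Fast!", [
    ("1. Motivations", [("1.1 Greed", []), ("1.2 Avidity", [])]),
    ("2. Methods", [("2.1 Stock Fraud", []),
                    ("2.2 Ponzi Scheme", []),
                    ("2.3 Bank Robbery", [])]),
    ("References", []),
])

order = []
pre_order(doc, order.append)  # collects titles in table-of-contents order
```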

Postorder Traversal
In a postorder traversal, a node is visited after its descendants
Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)

[Figure: directory tree numbered in postorder: 1 h1c.doc 3K, 2 h1nc.doc 2K, 3 homeworks/, 4 DDR.java 10K, 5 Stocks.java 25K, 6 Robot.java 20K, 7 programs/, 8 todo.txt 1K, 9 cs16/]
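The disk-usage application can be sketched as a postorder computation in which each directory's total is known only after its children have been processed. The `(name, size_in_K, children)` tuples are an assumed representation, and directories are given size 0 here since the surviving text lists sizes for files only.

```python
def disk_usage(node, visited):
    """Return the total size below node, visiting children before the node."""
    name, size, children = node
    total = size
    for child in children:
        total += disk_usage(child, visited)
    visited.append(name)          # the node itself is visited last
    return total


cs16 = ("cs16/", 0, [
    ("homeworks/", 0, [("h1c.doc", 3, []), ("h1nc.doc", 2, [])]),
    ("programs/", 0, [("DDR.java", 10, []),
                      ("Stocks.java", 25, []),
                      ("Robot.java", 20, [])]),
    ("todo.txt", 1, []),
])

visited = []
total_k = disk_usage(cs16, visited)
```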

Properties of Binary Trees
Notation
  n number of nodes
  e number of external nodes
  i number of internal nodes
  h height
Properties (for a proper binary tree, in which every internal node has two children):
  e = i + 1
  n = 2e − 1
  h ≤ i
  h ≤ (n − 1)/2
  e ≤ $2^h$
  h ≥ $\log_2 e$
  h ≥ $\log_2 (n + 1) − 1$

Inorder Traversal
In an inorder traversal a node is visited after its left subtree and before its right subtree
Application: draw a binary tree
  x(v) = inorder rank of v
  y(v) = depth of v

Algorithm inOrder(v)
  if isInternal(v)
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v)
    inOrder(rightChild(v))

[Figure: binary tree with keys 1 to 9 drawn with x = inorder rank and y = depth: root 6, children 2 and 8, then 1, 4, 7, 9, with 3 and 5 below 4]
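The drawing application can be sketched by assigning each node the pair (inorder rank, depth). Nodes are written as `(key, left, right)` tuples with `None` for a missing child, which is an assumed representation; the tree is the one from the figure.

```python
def layout(v, depth, coords):
    """Fill coords[key] = (x, y) with x = inorder rank and y = depth."""
    if v is None:
        return
    key, left, right = v
    layout(left, depth + 1, coords)
    coords[key] = (len(coords), depth)   # rank = number of nodes visited so far
    layout(right, depth + 1, coords)


leaf = lambda k: (k, None, None)
tree = (6,
        (2, leaf(1), (4, leaf(3), leaf(5))),
        (8, leaf(7), leaf(9)))

coords = {}
layout(tree, 0, coords)
```

Because the keys of a binary search tree appear in increasing order during an inorder traversal, the x coordinates here simply follow the sorted keys.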
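
Operations to Expand and Remove an External Node
expandExternal(v)
[Figure: external node v below A is turned into an internal node with two empty (∅) external children]
removeAboveExternal(w)
[Figure: external node w and its parent v are removed from the tree with nodes A, B, C; the sibling of w takes the place of v]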