Topic
Binary Search Trees (Non-Linear Data Structures for Searching) CIS210
The Searching Problem
Searching for data: fundamental to a variety of computer problems!
[Figure: a search key is matched against the (key, data) pairs stored in a data structure.]
Search Trees
A Search Tree?
Tree structures are used to store data because their organization allows more efficient access to the data.
A search tree is a tree that maintains its data in some sorted order and supports efficient search operations by constraining the relative positions of the nodes in the tree.
Binary Search Trees (BST) as Non-linear Data Structures
A Binary Search Tree?
A binary tree + a search tree.
A special kind of binary tree with an ordering condition (the BST order property!):
• Between every node and the nodes in its left subtree.
• Between every node and the nodes in its right subtree.
The Order Condition of BST
BST property - For any node N:
• The key value in every node in N's left subtree is less than or equal to the key value K in N.
• The key value in every node in N's right subtree is greater than the key value K in N.
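The order property can be checked mechanically. The following is a minimal sketch, assuming integer keys and a bare `Node` struct; the names `Node`, `isBSTInRange` and `isBST` are hypothetical and not part of the course code. Each subtree is checked against the range of keys its ancestors allow, because comparing a node only with its direct children is not enough.

```cpp
#include <climits>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Checks that every key in the subtree rooted at n lies in [lo, hi].
// Keys in a left subtree must be <= the parent's key; keys in a right
// subtree must be strictly greater, matching the property stated above.
bool isBSTInRange(const Node* n, long long lo, long long hi) {
    if (n == nullptr) return true;                 // an empty tree is a BST
    if (n->key < lo || n->key > hi) return false;  // key outside allowed range
    return isBSTInRange(n->left, lo, n->key) &&
           isBSTInRange(n->right, static_cast<long long>(n->key) + 1, hi);
}

bool isBST(const Node* root) {
    return isBSTInRange(root, LLONG_MIN, LLONG_MAX);
}
```

A tree in which 5 sits in the left subtree of 4 fails the check even if 5 is a valid right child of its own parent 2.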
Logical Structure of BST
[Figure: a root node with a left subtree Tleft and a right subtree Tright.]
Binary Search Trees as ADTs
Operations on Binary Search Trees
• Create an empty binary search tree.
• Destroy a binary search tree.
• Insert a new item into the binary search tree.
• Delete the item with a given search key from a binary search tree.
• Search/Retrieve the item with a given search key from a binary search tree.
• Determine whether a binary search tree is empty.
• Traverse the items in a binary search tree in preorder, inorder or postorder.
• ...
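A Pointer-Based Representation using Template Class
template <class DataType> class BST;   // Forward declaration

template <class DataType>
class BSTnode {
public:
    BSTnode();
    BSTnode(DataType D, BSTnode<DataType>* l, BSTnode<DataType>* r)
        : data(D), LchildPtr(l), RchildPtr(r) { }
    friend class BST<DataType>;
private:
    DataType data;                   // Data item stored in the node
    BSTnode<DataType>* LchildPtr;    // Pointer to the left child
    BSTnode<DataType>* RchildPtr;    // Pointer to the right child
};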
A Pointer-Based Representation using Template Class
template <class DataType>
class BST {
public:
    BST();
    …
private:
    BSTnode<DataType>* rootBT;   // Pointer to the root node
    …
};
Binary Search Tree ADT
template < class DT, class KF >   // Forward dec. of the BSTree class
class BSTree;

template < class DT, class KF >
class BSTreeNode                  // Facilitator for the BSTree class
{
  private:
    // Constructor
    BSTreeNode ( const DT &nodeDataItem, BSTreeNode *leftPtr, BSTreeNode *rightPtr );

    // Data members
    KF searchKey;
    DT dataItem;           // Binary search tree data item
    BSTreeNode *left,      // Pointer to the left child
               *right;     // Pointer to the right child

  friend class BSTree<DT,KF>;
};
Binary Search Tree ADT
template < class DT, class KF >   // DT : tree data item
class BSTree                      // KF : key field
{
  public:
    // Constructor
    BSTree ();
    // Destructor
    ~BSTree ();
Binary Search Tree ADT
    // Binary search tree manipulation operations
    void insert ( KF searchKey, const DT &newDataItem );        // Insert data item
    bool retrieve ( KF searchKey, DT &searchDataItem ) const;   // Retrieve data item
    bool remove ( KF deleteKey );                               // Remove data item
    void writeKeys () const;                                    // Output keys
    void clear ();                                              // Clear tree
Binary Search Tree ADT
    // Binary search tree status operations
    bool isEmpty () const;                       // Tree is empty
    bool isFull () const;                        // Tree is full
    // Output the tree structure -- used in testing/debugging
    void showStructure () const;
    int getHeight () const;                      // Height of tree
    void writeLessThan ( KF searchKey ) const;   // Output keys < searchKey
Binary Search Tree ADT
  private:
    // Recursive partners of the public member functions -- insert
    // prototypes of these functions here.
    void insertSub ( BSTreeNode<DT,KF> *&p, KF searchKey, const DT &newDataItem );
    bool retrieveSub ( BSTreeNode<DT,KF> *p, KF searchKey, DT &searchDataItem ) const;
    bool removeSub ( BSTreeNode<DT,KF> *&p, KF deleteKey );
    void clearSub ( BSTreeNode<DT,KF> *p );
    void showSub ( BSTreeNode<DT,KF> *p, int level ) const;
    int getHeightSub ( BSTreeNode<DT,KF> *p ) const;
Binary Search Tree ADT
    // Data member
    BSTreeNode<DT,KF> *root;   // Pointer to the root node
};
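Example: Insertions of D, B, F, A, C and E
[Figure: the tree after each insertion. D becomes the root; B is inserted as D's left child; F as D's right child; A as B's left child; C as B's right child; E as F's left child.]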
Example: What order?
[Figure: a tree with root 4; 4's children are 2 and 6; 2's children are 1 and 3; 6's children are 5 and 7.]
Insertion Order: 4, 2, 6, 1, 3, 5 and 7
Example: What order?
[Figure: a chain 1, 2, 3, 4, 5, 6, 7 in which each node is the right child of the previous one.]
Insertion Order: 1, 2, 3, 4, 5, 6 and 7
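Example: What order?
[Figure: a zigzag chain with root 1; 7 is the right child of 1, 2 the left child of 7, 6 the right child of 2, 3 the left child of 6, 5 the right child of 3, and 4 the left child of 5.]
Insertion Order: 1, 7, 2, 6, 3, 5 and 4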
Insertion Operation - Recursive
Insert(BST, newitem):
  If BST == NULL (empty tree) then
    Create a new node; let BST point to this new node; copy newitem into the new node's data portion; set the pointers in the new node to NULL.
  else if newitem.Key < BST->Key then
    Insert(BST->LchildPtr, newitem)
  else
    Insert(BST->RchildPtr, newitem)
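The recursive insertion can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the course's class: keys are single characters, the `Node` struct and the helpers `preorder` and `destroy` are hypothetical, and the pointer is passed by reference so that the parent's link is updated in place.

```cpp
#include <string>

struct Node {
    char key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(char k) : key(k) {}
};

// p refers to the link (root pointer or a child pointer) that should end
// up pointing at the new node.
void insert(Node*& p, char key) {
    if (p == nullptr)      p = new Node(key);      // empty tree: create the node
    else if (key < p->key) insert(p->left, key);
    else                   insert(p->right, key);  // equal keys go right, as in the pseudocode
}

// Preorder listing of the keys, used here only to show the resulting shape.
std::string preorder(const Node* p) {
    if (p == nullptr) return "";
    return std::string(1, p->key) + preorder(p->left) + preorder(p->right);
}

// Frees every node of the tree.
void destroy(Node* p) {
    if (p == nullptr) return;
    destroy(p->left);
    destroy(p->right);
    delete p;
}
```

Inserting D, B, F, A, C and E in that order into an empty tree gives the preorder listing D B A C F E, which is the shape of the earlier insertion example.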
BST with the Same Data
Are several different binary search trees possible for the same data? Yes.
Insertion Order and Shape of BST
Insertion in search-key order produces a maximum-height binary search tree!
Insertion in random order produces a near-minimum-height binary search tree!
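The effect of insertion order on height can be measured directly. The sketch below assumes integer keys and counts height in nodes (an empty tree has height 0, a single node height 1); `heightAfterInserting` and the other names are hypothetical helpers.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

void insert(Node*& p, int key) {
    if (p == nullptr)      p = new Node(key);
    else if (key < p->key) insert(p->left, key);
    else                   insert(p->right, key);
}

// Height counted in nodes along the longest root-to-leaf path.
int height(const Node* p) {
    return p == nullptr ? 0 : 1 + std::max(height(p->left), height(p->right));
}

void destroy(Node* p) {
    if (p == nullptr) return;
    destroy(p->left);
    destroy(p->right);
    delete p;
}

// Builds a tree by inserting the keys in the given order and reports its height.
int heightAfterInserting(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

For the seven keys 1 to 7, the order 4, 2, 6, 1, 3, 5, 7 gives height 3, while both 1, 2, 3, 4, 5, 6, 7 and 1, 7, 2, 6, 3, 5, 4 give height 7.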
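Example: Search F (Successful)
[Figure: a tree with root B; A is B's left child and D its right child; D's children are C and G; F is G's left child and E is F's left child. The search for F follows the path B, D, G, F.]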
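Example: Search H (Unsuccessful)
[Figure: the same tree with root B; A is B's left child and D its right child; D's children are C and G; F is G's left child and E is F's left child. The search for H follows the path B, D, G and fails because G has no right subtree.]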
Search Operation - Recursive
Search(BST, SearchKey):
  If BST == NULL (empty tree) then
    Not Found (Unsuccessful search)
  else if SearchKey == BST->Key then
    Found (Successful search)
  else if SearchKey < BST->Key then
    Search(BST->LchildPtr, SearchKey)
  else
    Search(BST->RchildPtr, SearchKey)
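A minimal C++ sketch of the recursive search, assuming character keys and a plain `Node` struct (both hypothetical). It returns a pointer to the node holding the key, or `nullptr` for an unsuccessful search.

```cpp
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Returns the node containing key, or nullptr if the key is not in the tree.
const Node* search(const Node* p, char key) {
    if (p == nullptr)  return nullptr;               // empty tree: not found
    if (key == p->key) return p;                     // found
    if (key < p->key)  return search(p->left, key);  // continue in the left subtree
    return search(p->right, key);                    // continue in the right subtree
}
```

On a tree with root B, children A and D, D's children C and G, F under G and E under F, searching for F succeeds and searching for H returns `nullptr`.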
BST Search vs Binary Search
Searching for a key value V in a binary search tree is similar to performing a binary search in a sorted array.
• If V = the key data, then the search succeeds.
• If V < the key data, the search continues in the left subtree (binary search: in the left half of the current part of the array).
• If V > the key data, the search continues in the right subtree (binary search: in the right half of the current part of the array).
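For comparison, here is a sketch of binary search on a sorted array, assuming integer keys stored in a `std::vector`. The middle element plays the role of the subtree root in the tree search; the function name `binarySearch` is hypothetical.

```cpp
#include <vector>

// Returns the index of v in the sorted vector a, or -1 if v is absent.
int binarySearch(const std::vector<int>& a, int v) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // "root" of the current part of the array
        if (v == a[mid]) return mid;    // search succeeds
        if (v < a[mid]) hi = mid - 1;   // continue in the left half
        else            lo = mid + 1;   // continue in the right half
    }
    return -1;                          // current part is empty: unsuccessful
}
```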
Find Min (Smallest) Operation Iterative
FindMin(BST):
  Start at the root node BST.
  Follow the chain of left subtrees until we get to the node that has an empty left subtree.
  The key in that node is the smallest in the BST.
Find Min (Smallest) Operation Recursive
FindMin(BST):
  If BST == NULL then return NULL.
  If BST->LchildPtr == NULL (no left subtree) then
    Return BST
  else
    Return FindMin(BST->LchildPtr)
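Both versions of FindMin can be sketched in C++ as below, assuming character keys and a plain `Node` struct; the function names are hypothetical.

```cpp
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Iterative version: follow left links until a node has no left child.
const Node* findMinIterative(const Node* p) {
    if (p == nullptr) return nullptr;
    while (p->left != nullptr) p = p->left;
    return p;
}

// Recursive version: the minimum is this node if it has no left subtree,
// otherwise it is the minimum of the left subtree.
const Node* findMinRecursive(const Node* p) {
    if (p == nullptr) return nullptr;
    if (p->left == nullptr) return p;
    return findMinRecursive(p->left);
}
```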
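Example: FindMin
[Figure: FindMin illustrated on the example tree with keys B, J, K, L, M, N, P, Q, R and Z; BST marks the root of the (sub)tree being searched.]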
Find Max (Largest) Operation Iterative
FindMax(BST):
  Start at the root node BST.
  Follow the chain of right subtrees until we get to the node that has an empty right subtree.
  The key in that node is the largest in the BST.
Find Max (Largest) Operation Recursive
FindMax(BST):
  If BST == NULL then return NULL.
  If BST->RchildPtr == NULL (no right subtree) then
    Return BST
  else
    Return FindMax(BST->RchildPtr)
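FindMax mirrors FindMin with right links in place of left links. A minimal sketch under the same assumptions (character keys, hypothetical `Node` struct and function names):

```cpp
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Iterative version: follow right links until a node has no right child.
const Node* findMaxIterative(const Node* p) {
    if (p == nullptr) return nullptr;
    while (p->right != nullptr) p = p->right;
    return p;
}

// Recursive version: the maximum is this node if it has no right subtree,
// otherwise it is the maximum of the right subtree.
const Node* findMaxRecursive(const Node* p) {
    if (p == nullptr) return nullptr;
    if (p->right == nullptr) return p;
    return findMaxRecursive(p->right);
}
```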
Example: FindMax
[Figure: FindMax illustrated on the example tree with keys B, J, K, L, M, N, P, Q, R and Z; BST marks the root of the (sub)tree being searched.]
Traversal Operation on BST
• Preorder traversal
• Inorder traversal
• Postorder traversal
Inorder Traversal of BST
The inorder traversal of a binary search tree visits the nodes in sorted search-key order.
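A short sketch of the inorder traversal, assuming character keys and a hypothetical `Node` struct. The visited keys are appended to a string so the sorted order is easy to inspect.

```cpp
#include <string>

struct Node {
    char key;
    Node* left;
    Node* right;
};

// Inorder: left subtree, then the node itself, then the right subtree.
void inorder(const Node* p, std::string& out) {
    if (p == nullptr) return;
    inorder(p->left, out);
    out += p->key;            // "visit" the node
    inorder(p->right, out);
}
```

On the example tree with root J used in the later slides, this yields the keys in the order B J K L M N P Q R Z.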
Find Inorder Predecessor Operation
InorderPredecessor(BST):
  The immediate predecessor of the node in the inorder traversal, if it exists.
  If the node's left subtree is nonempty then
    The largest key in the node's left subtree: FindMax(BST->LchildPtr)
  else (the node's left subtree is empty)
    The lowest ancestor of the node whose right child is either the node itself or also an ancestor of the node.
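The two cases can be sketched as follows. This version assumes distinct character keys and nodes without parent pointers, so it locates the node by searching from the root and remembers the last ancestor from which the search turned right. The names `Node` and `inorderPredecessor` are hypothetical.

```cpp
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Returns the inorder predecessor of the node holding key, or nullptr if
// the key is absent or has no predecessor. Assumes distinct keys.
const Node* inorderPredecessor(const Node* root, char key) {
    const Node* lastRightTurn = nullptr;  // lowest ancestor left via a right link
    const Node* cur = root;
    while (cur != nullptr && cur->key != key) {
        if (key < cur->key) {
            cur = cur->left;
        } else {
            lastRightTurn = cur;
            cur = cur->right;
        }
    }
    if (cur == nullptr) return nullptr;   // key not in the tree
    if (cur->left != nullptr) {           // case 1: largest key in the left subtree
        const Node* m = cur->left;
        while (m->right != nullptr) m = m->right;
        return m;
    }
    return lastRightTurn;                 // case 2: may be nullptr
}
```

On the example tree with inorder B J K L M N P Q R Z, the predecessor of Q is P (case 1), while the predecessors of K and R are J and Q (case 2).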
Example: Inorder Predecessor of Q
[Figure: Q's left subtree (rooted at L) is nonempty, so the predecessor is found by FindMax on that subtree: P.]
Inorder: B J K L M N P Q R Z
Example: Inorder Predecessor of K
[Figure: K's left subtree is empty, so the predecessor is the lowest ancestor whose right child is K or an ancestor of K: J.]
Inorder: B J K L M N P Q R Z
Example: Inorder Predecessor of R
[Figure: R's left subtree is empty and R is the right child of Q, so the predecessor is Q.]
Inorder: B J K L M N P Q R Z
Find Inorder Successor Operation
InorderSuccessor(BST):
  The immediate successor of the node in the inorder traversal, if it exists.
  If the node's right subtree is nonempty then
    The smallest key in the node's right subtree: FindMin(BST->RchildPtr)
  else (the node's right subtree is empty)
    The lowest ancestor of the node whose left child is either the node itself or also an ancestor of the node.
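A matching sketch for the successor, under the same assumptions (distinct character keys, no parent pointers, hypothetical names). It remembers the last ancestor from which the search turned left.

```cpp
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Returns the inorder successor of the node holding key, or nullptr if
// the key is absent or has no successor. Assumes distinct keys.
const Node* inorderSuccessor(const Node* root, char key) {
    const Node* lastLeftTurn = nullptr;   // lowest ancestor left via a left link
    const Node* cur = root;
    while (cur != nullptr && cur->key != key) {
        if (key < cur->key) {
            lastLeftTurn = cur;
            cur = cur->left;
        } else {
            cur = cur->right;
        }
    }
    if (cur == nullptr) return nullptr;   // key not in the tree
    if (cur->right != nullptr) {          // case 1: smallest key in the right subtree
        const Node* m = cur->right;
        while (m->left != nullptr) m = m->left;
        return m;
    }
    return lastLeftTurn;                  // case 2: may be nullptr
}
```

On the example tree, the successor of Q is R (case 1), and the successors of P and K are Q and L (case 2).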
Example: Inorder Successor of Q
[Figure: Q's right subtree (rooted at R) is nonempty, so the successor is found by FindMin on that subtree: R.]
Inorder: B J K L M N P Q R Z
Example: Inorder Successor of P
[Figure: P's right subtree is empty, so the successor is the lowest ancestor whose left child is P or an ancestor of P: Q.]
Inorder: B J K L M N P Q R Z
Example: Inorder Successor of K
[Figure: K's right subtree is empty and K is the left child of L, so the successor is L.]
Inorder: B J K L M N P Q R Z
Deletion Operation
Delete(BST, SearchKey):
  If SearchKey < BST->Key then
    Delete(BST->LchildPtr, SearchKey)
  else if SearchKey > BST->Key then
    Delete(BST->RchildPtr, SearchKey)
  else (SearchKey == BST->Key)
    DeleteNode(BST)
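Example: Delete Z
[Figure: before and after deleting Z from the tree with root J, children B and Q, Q's children L and R, and R's right child Z. Z is a leaf, so it is simply removed.]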
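Example: Delete S
[Figure: before and after deleting S from the tree with root J, children B and Q, Q's children L and S, and S's right child Z. S has only a right child, so Z takes S's place as Q's right child.]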
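Example: Delete S
[Figure: before and after deleting S from the tree with root J, children B and Q, Q's children L and S, and S's left child R. S has only a left child, so R takes S's place as Q's right child.]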
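Example: Delete Q
[Figure: deleting Q from the tree with root J, children B and Q, Q's children L and R, and R's right child Z. Q has two children; which key should take its place (marked "?")?]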
Deletion Operation
• Delete by Merging
• Delete by Copying
Deletion by Merging
Observation!
Deletion by Copying
• By copying the inorder predecessor (IOP)
• By copying the inorder successor (IOS)
Deletion Operation
Delete(BST, SearchKey):
  If SearchKey < BST->Key then
    Delete(BST->LchildPtr, SearchKey)
  else if SearchKey > BST->Key then
    Delete(BST->RchildPtr, SearchKey)
  else (SearchKey == BST->Key)
    DeleteNode(BST)
DeleteNode
If N has two children then
  Find M, the node that contains N's inorder predecessor (or successor).
    Inorder predecessor (IOP) =
      • The rightmost node in N's left subtree
      • The largest key in N's left subtree
    Inorder successor (IOS) =
      • The leftmost node in N's right subtree
      • The smallest key in N's right subtree
  Copy the item from node M into node N.
  Delete(BST->LchildPtr (or RchildPtr), M)   // Remove M from the BST.
See Figure 6.32 (p. 251)
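Deletion by copying can be sketched as below. This is an illustration under stated assumptions rather than the course's implementation: keys are distinct characters, the inorder predecessor (IOP) is the one copied, and `Node`, `insert`, `deleteNode`, `removeKey` and `preorder` are hypothetical names. Unlike the pseudocode, `removeKey` also handles a key that is not in the tree.

```cpp
#include <string>

struct Node {
    char key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(char k) : key(k) {}
};

void insert(Node*& p, char key) {
    if (p == nullptr)      p = new Node(key);
    else if (key < p->key) insert(p->left, key);
    else                   insert(p->right, key);
}

// Removes the node that the link p refers to (p must not be null).
void deleteNode(Node*& p) {
    if (p->left != nullptr && p->right != nullptr) {
        // Two children: find the link to the IOP (rightmost node of the
        // left subtree), copy its key up, then remove the IOP node.
        Node** link = &p->left;
        while ((*link)->right != nullptr) link = &(*link)->right;
        p->key = (*link)->key;
        deleteNode(*link);   // the IOP has no right child: simple case below
    } else {
        // Zero or one child: splice the node out of the tree.
        Node* old = p;
        p = (p->left != nullptr) ? p->left : p->right;
        delete old;
    }
}

// Returns true if the key was found and removed, false otherwise.
bool removeKey(Node*& p, char key) {
    if (p == nullptr) return false;
    if (key < p->key) return removeKey(p->left, key);
    if (key > p->key) return removeKey(p->right, key);
    deleteNode(p);
    return true;
}

// Preorder listing, used only to inspect the resulting shape.
std::string preorder(const Node* p) {
    if (p == nullptr) return "";
    return std::string(1, p->key) + preorder(p->left) + preorder(p->right);
}
```

Building the example tree by inserting J, B, Q, L, K, N, M, P, R, Z and then removing Q copies P into Q's node, so the preorder listing becomes J B P L K N M R Z. Copying the inorder successor instead would be the mirror image: follow left links in the right subtree.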
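Example: Delete Q
[Figure: deleting Q from the tree with root J, children B and Q, Q's children L and R, and R's right child Z. Q has two children; which key should take its place (marked "?")?]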
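Example: Delete Q
[Figure: deleting Q by copying its inorder predecessor (IOP) L. L's key is copied into Q's node and the old L node is removed, leaving J with children B and L, L with right child R, and R with right child Z.]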
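Example: Delete Q
[Figure: deleting Q by copying its inorder successor (IOS) R. R's key is copied into Q's node and the old R node is removed, with Z taking its place, leaving J with children B and R, and R with children L and Z.]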
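Quiz: Delete Q
[Figure: deleting Q from the full example tree by copying its inorder predecessor (IOP) P. The result has root J with children B and P; P's children are L and R; L's children are K and N; N's left child is M; R's right child is Z.]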
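Quiz: Delete Q
[Figure: deleting Q from the full example tree by copying its inorder successor (IOS) R. The result has root J with children B and R; R's children are L and Z; L's children are K and N; N's children are M and P.]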
Delete by Merging Vs. Delete by Copying
Delete by Merging …
Delete by Copying …
Analysis of BST Operations
The number of comparisons for a search/retrieval, insertion or deletion is the level (depth) of the element in the binary search tree (counting the root as level 1).
The maximum number of comparisons for a retrieval, insertion or deletion is the height of the binary search tree!
Properties of Binary Search Trees
What is the minimum number of nodes that a binary search tree of height h can have? h
The minimum number of nodes that a binary search tree of height h can have is h.
The minimum number of nodes that a binary search tree of height h can have is h.
Proof (by induction on h):
Base case: h = 1: N = 1 = h.
Inductive hypothesis: The minimum number of nodes that a binary search tree of height h = some k ≥ 1 can have is k.
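The minimum number of nodes that a binary search tree of height h can have is h.
Consider h = k+1:
[Figure: a root with subtrees Tleft and Tright.]
A tree of height k+1 with the fewest nodes consists of a root and a single nonempty subtree of height k that itself has the minimum number of nodes, so
N = 1 + # of nodes in the subtree with height k
  = 1 + k
  = k + 1 = h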
Properties of Binary Search Trees
What is the maximum number of nodes that a binary search tree of height h can have? $2^h - 1$
The maximum number of nodes that a binary search tree of height h can have is $2^h - 1$.
The maximum number of nodes that a binary search tree of height h can have is $2^h - 1$.
Proof (by induction on h):
Base case: h = 1: $N = 1 = 2^1 - 1$.
Inductive hypothesis: The maximum number of nodes that a binary search tree of height h = some k ≥ 1 can have is $2^k - 1$.
The maximum number of nodes that a binary search tree of height h can have is $2^h - 1$.
Consider h = k+1:
[Figure: a root with subtrees Tleft and Tright, each of height k.]
N = 1 + # of nodes in the two subtrees with height k
  $= 1 + 2 \cdot (2^k - 1)$
  $= 1 + 2^{k+1} - 2$
  $= 2^{k+1} - 1$
  $= 2^h - 1$
Properties of Binary Search Trees
N = the number of nodes in a binary search tree.
h = the height of a binary search tree.
$h \le N \le 2^h - 1$
$\log_2 (N+1) \le h \le N$
Lower bound: $h = \Omega(\log N)$
Upper bound: $h = O(N)$
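The two bounds can be computed for a concrete N. The sketch below uses only integer arithmetic and counts height in nodes, as in the proofs above; the function names `minHeight` and `maxHeight` are hypothetical.

```cpp
// Smallest height h such that a binary tree of height h can hold n nodes,
// i.e. the smallest h with 2^h - 1 >= n. Assumes n < 2^63.
int minHeight(unsigned long long n) {
    int h = 0;
    while (((1ULL << h) - 1) < n) ++h;
    return h;
}

// Largest possible height: every node has at most one child (a chain).
unsigned long long maxHeight(unsigned long long n) {
    return n;
}
```

For N = 7 the height lies between 3 and 7; for N = 8 the minimum rises to 4 because a tree of height 3 holds at most 7 nodes.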
Analysis of Search/Retrieval Operation
Worst case: O(N)
Average case: O(log N)
Analysis of Insertion Operation
Worst case: O(N)
Average case: O(log N)
Analysis of Deletion Operation
Worst case: O(N)
Average case: O(log N)
Analysis of Traversal Operation
Worst case O(N)
Average Case O(N)
Quiz
What is the maximum number of nodes that a D-ary tree of height h can have? $(D^h - 1)/(D - 1)$
Prove by induction? ...
The maximum number of nodes that a D-ary tree of height h can have is $(D^h - 1)/(D - 1)$.
Proof (by induction on h):
Base case: h = 1: $N = (D^1 - 1)/(D - 1) = 1$.
Inductive hypothesis: The maximum number of nodes that a D-ary tree of height h = some k ≥ 1 can have is $(D^k - 1)/(D - 1)$.
The maximum number of nodes that a D-ary tree of height h can have is $(D^h - 1)/(D - 1)$.
Consider h = k+1:
[Figure: a root with D subtrees T1, T2, ..., TD, each of height k.]
N = 1 + # of nodes in the D subtrees with height k
  $= 1 + D \cdot (D^k - 1)/(D - 1)$
  $= 1 + (D^{k+1} - D)/(D - 1)$
  $= (D - 1 + D^{k+1} - D)/(D - 1)$
  $= (D^{k+1} - 1)/(D - 1)$
  $= (D^h - 1)/(D - 1)$.
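The closed form can be cross-checked by summing the nodes level by level: a full D-ary tree has 1 node on the first level, D on the second, D squared on the third, and so on. The sketch below assumes small D and h so the count fits in 64 bits; `maxNodes` is a hypothetical helper.

```cpp
// Number of nodes in a full D-ary tree of height h, summed level by level:
// 1 + D + D^2 + ... + D^(h-1). Overflow is not checked.
unsigned long long maxNodes(unsigned d, unsigned h) {
    unsigned long long total = 0;
    unsigned long long levelSize = 1;   // nodes on the current level
    for (unsigned level = 0; level < h; ++level) {
        total += levelSize;
        levelSize *= d;
    }
    return total;
}
```

For D = 2 and h = 3 the sum is 7, and for D = 3 and h = 3 it is 13, agreeing with (8 - 1)/1 and (27 - 1)/2 from the formula.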
Iterators for BST
Design Pattern: The Iterator Pattern
Container
Iterator
An iterator is an object of an iterator class!
Iterator Operations
• Inequality compare (!=)
• Dereference (*)
• Increment (++)
Overloaded operators!
Types of Iterators for BST
• Pre-order Traversal Iterator
• In-order Traversal Iterator
• Post-order Traversal Iterator
• Level-order Traversal Iterator
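Example: Iterators for BST
[Figure: the example tree with root J; J's children are B and Q; Q's children are L and R; L's children are K and N; N's children are M and P; R's right child is Z.]
Preorder: J B Q L K N M P R Z
Postorder: B K M P N L Z R Q J
Inorder: B J K L M N P Q R Z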
Iterators for BST
bst bst1;
bst::inorder_iterator p;
for (p = bst1.begin(); p != bst1.end(); ++p)
    // Process *p
    cout
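One way an in-order iterator with the three operations above could work is sketched below. It is not the `bst::inorder_iterator` class from the slide, whose definition is not shown here; `Node`, `InorderIterator` and `inorderKeys` are hypothetical names. The iterator keeps an explicit stack of nodes still to be visited, with the next node on top.

```cpp
#include <stack>
#include <string>

struct Node {
    char key;
    Node* left;
    Node* right;
};

class InorderIterator {
public:
    // A default-constructed iterator acts as the end marker.
    explicit InorderIterator(const Node* root = nullptr) { pushLeftChain(root); }

    // Dereference: key of the current node (must not be called at the end).
    char operator*() const { return pending.top()->key; }

    // Increment: leave the current node and move to its inorder successor.
    InorderIterator& operator++() {
        const Node* n = pending.top();
        pending.pop();
        pushLeftChain(n->right);
        return *this;
    }

    // Iterators are equal when both are exhausted or both sit on the same node.
    bool operator!=(const InorderIterator& other) const {
        if (pending.empty() || other.pending.empty())
            return pending.empty() != other.pending.empty();
        return pending.top() != other.pending.top();
    }

private:
    // Pushes n and all of its left descendants; the leftmost ends up on top.
    void pushLeftChain(const Node* n) {
        for (; n != nullptr; n = n->left) pending.push(n);
    }
    std::stack<const Node*> pending;
};

// Collects the keys in iteration order using !=, * and ++.
std::string inorderKeys(const Node* root) {
    std::string out;
    for (InorderIterator p(root), end; p != end; ++p) out += *p;
    return out;
}
```

The stack holds at most one node per level, so the extra space is proportional to the height of the tree rather than to the number of nodes.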