Topic Binary Search Trees (Non-Linear Data Structures for Searching)

Author: Bertha Rich
CIS210

The Searching Problem

• Fundamental to a variety of computer problems!

[Figure: Searching for Data. A (search) key is used to look up data in a data structure of key/data pairs.]

Search Trees


A Search Tree?

• Tree structures are used to store data because their organization allows more efficient access to the data.
• A search tree is a tree that maintains its data in some sorted order and supports efficient search operations.
  • By constraining the relative positions of the nodes in the tree.

Binary Search Trees (BST) as Non-linear Data Structures


A Binary Search Tree?

• A binary tree + a search tree.
• A special kind of binary tree with an ordering condition:
  • Between every node and the nodes in its left subtree.
  • Between every node and the nodes in its right subtree.
  • The BST order property!

The Order Condition of BST

BST property: for any node N with key value K,
• The key value in every node in N's left subtree is less than or equal to K.
• The key value in every node in N's right subtree is greater than K.
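The order property can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` type with integer keys whose field names mirror the `LchildPtr`/`RchildPtr` names used in the pseudocode of these slides; `satisfiesOrder` and `isBST` are illustrative names, not part of the course code.

```cpp
// Hypothetical minimal node type (an assumption for illustration only).
struct Node {
    int key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Checks the order property: every key in the subtree rooted at n must lie
// in the range (low, high]. A null bound means "unbounded" on that side.
bool satisfiesOrder(const Node* n, const int* low, const int* high) {
    if (n == nullptr) return true;                        // an empty tree is a BST
    if (low != nullptr && n->key <= *low) return false;   // must be > low
    if (high != nullptr && n->key > *high) return false;  // must be <= high
    // Left subtree keys are <= n->key; right subtree keys are > n->key.
    return satisfiesOrder(n->LchildPtr, low, &n->key) &&
           satisfiesOrder(n->RchildPtr, &n->key, high);
}

bool isBST(const Node* root) {
    return satisfiesOrder(root, nullptr, nullptr);
}
```

Passing the bounds down the recursion matters: comparing each node only with its own children would miss a key that is on the wrong side of a more distant ancestor.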

Logical Structure of BST

[Figure: a root node with a left subtree Tleft and a right subtree Tright.]

Binary Search Trees as ADTs


Operations on Binary Search Trees

• Create an empty binary search tree.
• Destroy a binary search tree.
• Insert a new item into the binary search tree.
• Delete the item with a given search key from a binary search tree.
• Search/Retrieve the item with a given search key from a binary search tree.
• Determine whether a binary search tree is empty.
• Traverse the items in a binary search tree in preorder, inorder or postorder.
• ...

A Pointer-Based Representation using Template Class

    template <class DataType> class BST;    // forward declaration

    template <class DataType>
    class BSTnode {
    public:
        BSTnode();
        BSTnode(DataType D, BSTnode* l, BSTnode* r)
            : data(D), LchildPtr(l), RchildPtr(r) { }
        friend class BST<DataType>;
    private:
        DataType data;
        BSTnode* LchildPtr;    // pointer to the left child
        BSTnode* RchildPtr;    // pointer to the right child
    };

A Pointer-Based Representation using Template Class

    template <class DataType>
    class BST {
    public:
        BST();
        …
    private:
        BSTnode<DataType>* rootBT;    // pointer to the root node
        …
    };

Binary Search Tree ADT

    template < class DT, class KF >    // Forward declaration of the BSTree class
    class BSTree;

    template < class DT, class KF >
    class BSTreeNode                   // Facilitator for the BSTree class
    {
      private:
        // Constructor
        BSTreeNode ( const DT &nodeDataItem,
                     BSTreeNode *leftPtr, BSTreeNode *rightPtr );

        // Data members
        KF searchKey;
        DT dataItem;                   // Binary search tree data item
        BSTreeNode *left,              // Pointer to the left child
                   *right;             // Pointer to the right child

        friend class BSTree<DT,KF>;
    };

Binary Search Tree ADT

    template < class DT, class KF >    // DT : tree data item
    class BSTree                       // KF : key field
    {
      public:
        // Constructor
        BSTree ();
        // Destructor
        ~BSTree ();

        // Binary search tree manipulation operations
        void insert ( KF searchKey, const DT &newDataItem );        // Insert data item
        bool retrieve ( KF searchKey, DT &searchDataItem ) const;   // Retrieve data item
        bool remove ( KF deleteKey );                               // Remove data item
        void writeKeys () const;                                    // Output keys
        void clear ();                                              // Clear tree

        // Binary search tree status operations
        bool isEmpty () const;                                      // Tree is empty
        bool isFull () const;                                       // Tree is full

        // Output the tree structure -- used in testing/debugging
        void showStructure () const;

        int getHeight () const;                                     // Height of tree
        void writeLessThan ( KF searchKey ) const;                  // Output keys < searchKey

      private:
        // Recursive partners of the public member functions -- insert
        // prototypes of these functions here.
        void insertSub ( BSTreeNode<DT,KF> *&p, KF searchKey, const DT &newDataItem );
        bool retrieveSub ( BSTreeNode<DT,KF> *p, KF searchKey, DT &searchDataItem ) const;
        bool removeSub ( BSTreeNode<DT,KF> *&p, KF deleteKey );
        void clearSub ( BSTreeNode<DT,KF> *p );
        void showSub ( BSTreeNode<DT,KF> *p, int level ) const;
        int getHeightSub ( BSTreeNode<DT,KF> *p ) const;

        // Data member
        BSTreeNode<DT,KF> *root;    // Pointer to the root node
    };

Example: Insertions of D, B, F, A, C and E

[Figure: the tree after each insertion. The final tree has root D; D's children are B and F; B's children are A and C; F's left child is E.]

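Example: What order?

[Figure: a tree of height 3 with root 4; 4's children are 2 and 6; the leaves are 1, 3, 5 and 7.]
Insertion order: 4, 2, 6, 1, 3, 5 and 7

Example: What order?

[Figure: a chain 1, 2, 3, 4, 5, 6, 7 in which every node has only a right child.]
Insertion order: 1, 2, 3, 4, 5, 6 and 7

Example: What order?

[Figure: a zigzag chain 1, 7, 2, 6, 3, 5, 4 that alternates between right and left children.]
Insertion order: 1, 7, 2, 6, 3, 5 and 4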

Insertion Operation - Recursive

    Insert(BST, newitem):
        if BST == NULL (empty tree) then
            Create a new node; let BST point to this new node;
            copy newitem into the new node's data portion;
            set the pointers in the new node to NULL.
        else if newitem.Key < BST->Key then
            Insert(BST->LchildPtr, newitem)
        else
            Insert(BST->RchildPtr, newitem)
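The recursive insertion can be sketched in C++ as follows. This is an illustration under stated assumptions, not the course implementation: the `Node` type, the `char` keys and the helper names `preorder` and `destroy` are hypothetical. As in the pseudocode, a key that is not smaller than the current key goes to the right, so the sketch assumes distinct keys.

```cpp
#include <string>

// Hypothetical node type; LchildPtr/RchildPtr follow the naming in the pseudocode.
struct Node {
    char key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(char k) : key(k) {}
};

// Recursive insertion. The pointer is passed by reference so that assigning
// to it links the new node into its parent (or into the root pointer).
void insert(Node*& bst, char newKey) {
    if (bst == nullptr) {
        bst = new Node(newKey);              // empty tree: create the node here
    } else if (newKey < bst->key) {
        insert(bst->LchildPtr, newKey);      // smaller keys go left
    } else {
        insert(bst->RchildPtr, newKey);      // all other keys go right
    }
}

// Preorder listing of the keys, used to inspect the resulting shape.
std::string preorder(const Node* bst) {
    if (bst == nullptr) return "";
    return std::string(1, bst->key) + preorder(bst->LchildPtr) +
           preorder(bst->RchildPtr);
}

// Frees every node (postorder) and resets the pointer to null.
void destroy(Node*& bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
    bst = nullptr;
}
```

Starting from `Node* root = nullptr;` and inserting D, B, F, A, C, E in that order yields a tree whose preorder listing is `DBACFE`.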

BST with the Same Data

• Are several different binary search trees possible for the same data?
  • Yes.

Insertion Order and Shape of BST

• Insertion in search-key order produces a maximum-height binary search tree!
• Insertion in random order produces, on average, a near-minimum-height binary search tree!
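The effect of insertion order on height can be measured directly. The sketch below assumes a hypothetical `Node` type with integer keys and counts height as these slides do (an empty tree has height 0, a single node has height 1); `heightAfterInserting` is an illustrative helper name.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical node type with integer keys (an assumption for illustration).
struct Node {
    int key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(int k) : key(k) {}
};

// Plain recursive BST insertion; keys are assumed to be distinct.
void insert(Node*& bst, int newKey) {
    if (bst == nullptr) bst = new Node(newKey);
    else if (newKey < bst->key) insert(bst->LchildPtr, newKey);
    else insert(bst->RchildPtr, newKey);
}

// Height with the convention: empty tree = 0, single node = 1.
int height(const Node* bst) {
    if (bst == nullptr) return 0;
    return 1 + std::max(height(bst->LchildPtr), height(bst->RchildPtr));
}

// Frees every node of the tree.
void destroy(Node* bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
}

// Builds a tree from the given insertion order and returns its height.
int heightAfterInserting(const std::vector<int>& order) {
    Node* root = nullptr;
    for (int k : order) insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

For the same seven keys, the order 4, 2, 6, 1, 3, 5, 7 gives height 3, while the sorted order 1, 2, 3, 4, 5, 6, 7 gives height 7.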

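Example: Search F (Successful)

[Figure: search path for F in a BST with root B that holds the keys A, B, C, D, E, F and G.]

Example: Search H (Unsuccessful)

[Figure: search path for H in the same BST; the search ends at an empty subtree.]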

Search Operation - Recursive

    Search(BST, SearchKey):
        if BST == NULL (empty tree) then
            Not Found (unsuccessful search)
        else if SearchKey == BST->Key then
            Found (successful search)
        else if SearchKey < BST->Key then
            Search(BST->LchildPtr, SearchKey)
        else
            Search(BST->RchildPtr, SearchKey)
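A direct C++ rendering of the search pseudocode might look like the sketch below. The `Node` type and the choice to return a pointer (null for an unsuccessful search) are assumptions made for illustration.

```cpp
// Hypothetical node type; field names follow the pseudocode.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Recursive search: returns the node holding searchKey, or nullptr when the
// search is unsuccessful.
const Node* search(const Node* bst, char searchKey) {
    if (bst == nullptr) return nullptr;          // empty tree: not found
    if (searchKey == bst->key) return bst;       // found
    if (searchKey < bst->key)
        return search(bst->LchildPtr, searchKey);    // continue in the left subtree
    return search(bst->RchildPtr, searchKey);        // continue in the right subtree
}
```

Each call descends one level, so the number of key comparisons is bounded by the height of the tree.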

BST Search vs Binary Search

• Searching for a key value V in a binary search tree is similar to performing a binary search in a sorted array.
  • If V = the key data, then the search succeeds.
  • If V < the key data, the search continues
    • in the left subtree (binary search tree).
    • in the left half of the current part of the array (binary search).
  • If V > the key data, the search continues
    • in the right subtree (binary search tree).
    • in the right half of the current part of the array (binary search).
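For comparison, the array side of this analogy can be sketched as an iterative binary search. The function name and the convention of returning -1 for an unsuccessful search are assumptions for illustration.

```cpp
#include <vector>

// Binary search in a sorted array: returns the index of v, or -1 if v is absent.
int binarySearch(const std::vector<int>& a, int v) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;     // plays the role of the subtree root
        if (v == a[mid]) return mid;      // the search succeeds
        if (v < a[mid]) hi = mid - 1;     // continue in the left half
        else lo = mid + 1;                // continue in the right half
    }
    return -1;                            // unsuccessful search
}
```

The middle element of the current part of the array corresponds to the root of the current subtree. The array always halves, whereas a BST subtree shrinks by half only when the tree is well balanced.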

Find Min (Smallest) Operation - Iterative

FindMin(BST):
• Start at the root node BST.
• Follow the chain of left subtrees until we get to the node that has an empty left subtree.
• The key in that node is the smallest in the BST.

Find Min (Smallest) Operation - Recursive

    FindMin(BST):
        if BST == NULL then
            return NULL
        if BST->LchildPtr == NULL (no left subtree) then
            return BST
        else
            return FindMin(BST->LchildPtr)
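Both versions of FindMin can be sketched in a few lines of C++. The `Node` type is a hypothetical stand-in, and the function names are illustrative. FindMax is the mirror image: follow `RchildPtr` instead of `LchildPtr`.

```cpp
// Hypothetical node type; field names follow the pseudocode.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Iterative version: follow left links until a node has no left subtree.
const Node* findMinIterative(const Node* bst) {
    if (bst == nullptr) return nullptr;            // empty tree has no minimum
    while (bst->LchildPtr != nullptr) bst = bst->LchildPtr;
    return bst;
}

// Recursive version, one case per line of the pseudocode.
const Node* findMinRecursive(const Node* bst) {
    if (bst == nullptr) return nullptr;            // empty tree
    if (bst->LchildPtr == nullptr) return bst;     // no left subtree: this is the minimum
    return findMinRecursive(bst->LchildPtr);       // otherwise keep going left
}
```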

Example: FindMin

[Figure: FindMin on the example BST with keys B, J, K, L, M, N, P, Q, R and Z, following left links from the node marked BST.]

Find Max (Largest) Operation - Iterative

FindMax(BST):
• Start at the root node BST.
• Follow the chain of right subtrees until we get to the node that has an empty right subtree.
• The key in that node is the largest in the BST.

Find Max (Largest) Operation - Recursive

    FindMax(BST):
        if BST == NULL then
            return NULL
        if BST->RchildPtr == NULL (no right subtree) then
            return BST
        else
            return FindMax(BST->RchildPtr)

Example: FindMax

[Figure: FindMax on the example BST with keys B, J, K, L, M, N, P, Q, R and Z, following right links from the node marked BST.]

Traversal Operation on BST

• Preorder traversal
• Inorder traversal
• Postorder traversal

Inorder Traversal of BST

The inorder traversal of a binary search tree visits the nodes in sorted search-key order.
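A short sketch makes the sorted-order claim concrete. The `Node` type and the decision to collect the keys into a `std::string` are assumptions for illustration.

```cpp
#include <string>

// Hypothetical node type; field names follow the pseudocode.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Inorder traversal: left subtree, then the node itself, then the right subtree.
// The visited keys are appended to out.
void inorder(const Node* bst, std::string& out) {
    if (bst == nullptr) return;
    inorder(bst->LchildPtr, out);   // every key here is <= bst->key
    out += bst->key;                // visit the node
    inorder(bst->RchildPtr, out);   // every key here is > bst->key
}
```

Because of the order property, everything appended before a node's key comes from its left subtree or from smaller ancestors, so the keys come out in ascending order.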


Find Inorder Predecessor Operation

InorderPredecessor(BST):
• The immediate predecessor of the node in the inorder traversal, if it exists.
• If the node's left subtree is nonempty then
  • The largest key in the node's left subtree: FindMax(BST->LchildPtr)
• else (the node's left subtree is empty)
  • The lowest ancestor whose right child is either the node itself or another ancestor of the node.
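The two cases can be sketched without parent pointers by walking down from the root and remembering the last node at which the path turned right; that node is the lowest ancestor described in the second case. The `Node` type, the function names and the assumption of distinct keys are illustrative choices, not part of the course code.

```cpp
// Hypothetical node type; field names follow the pseudocode.
struct Node {
    char key;
    Node* LchildPtr;
    Node* RchildPtr;
};

// Largest key of a nonempty subtree: follow right links to the end.
const Node* findMax(const Node* bst) {
    while (bst != nullptr && bst->RchildPtr != nullptr) bst = bst->RchildPtr;
    return bst;
}

// Returns the inorder predecessor of the node holding key, or nullptr if the
// key is absent or has no predecessor. Assumes distinct keys.
const Node* inorderPredecessor(const Node* root, char key) {
    const Node* lastRightTurn = nullptr;   // lowest ancestor left via its right child
    const Node* cur = root;
    while (cur != nullptr && cur->key != key) {
        if (key < cur->key) {
            cur = cur->LchildPtr;
        } else {
            lastRightTurn = cur;           // the target lies in cur's right subtree
            cur = cur->RchildPtr;
        }
    }
    if (cur == nullptr) return nullptr;    // key not in the tree
    if (cur->LchildPtr != nullptr)
        return findMax(cur->LchildPtr);    // case 1: nonempty left subtree
    return lastRightTurn;                  // case 2: may be nullptr (no predecessor)
}
```

The inorder successor is symmetric: use the smallest key of the right subtree, or else the last node at which the path from the root turned left.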

Example: Inorder Predecessor of Q

[Figure: Q's left subtree is nonempty, so FindMax on the subtree rooted at L gives P.]
Inorder: B J K L M N P Q R Z

Example: Inorder Predecessor of K

[Figure: K's left subtree is empty; the lowest ancestor reached through a right child link is J.]
Inorder: B J K L M N P Q R Z

Example: Inorder Predecessor of R

[Figure: R's left subtree is empty; R is the right child of Q, so the predecessor is Q.]
Inorder: B J K L M N P Q R Z

Find Inorder Successor Operation

InorderSuccessor(BST):
• The immediate successor of the node in the inorder traversal, if it exists.
• If the node's right subtree is nonempty then
  • The smallest key in the node's right subtree: FindMin(BST->RchildPtr)
• else (the node's right subtree is empty)
  • The lowest ancestor whose left child is either the node itself or another ancestor of the node.

Example: Inorder Successor of Q

[Figure: Q's right subtree is nonempty, so FindMin on the subtree rooted at R gives R.]
Inorder: B J K L M N P Q R Z

Example: Inorder Successor of P

[Figure: P's right subtree is empty; the lowest ancestor reached through a left child link is Q.]
Inorder: B J K L M N P Q R Z

Example: Inorder Successor of K

[Figure: K's right subtree is empty; K is the left child of L, so the successor is L.]
Inorder: B J K L M N P Q R Z

Deletion Operation

    Delete(BST, SearchKey):
        if SearchKey < BST->Key then
            Delete(BST->LchildPtr, SearchKey)
        else if SearchKey > BST->Key then
            Delete(BST->RchildPtr, SearchKey)
        else (SearchKey == BST->Key)
            DeleteNode(BST)

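Example: Delete Z

[Figure: the tree before and after deleting Z, a leaf node.]

Example: Delete S

[Figure: the tree before and after deleting S, a node with one child (Z).]

Example: Delete S

[Figure: the tree before and after deleting S, a node with one child (R).]

Example: Delete Q

[Figure: deleting Q, a node with two children (L and R). Which node should take Q's place?]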

Deletion Operation

• Delete by Merging
• Delete by Copying

Deletion by Merging

Observation!




Deletion by Copying

• By copying the inorder predecessor (IOP)
• By copying the inorder successor (IOS)


DeleteNode

• If N has two children then
  • Find M, the node that contains N's inorder predecessor (or successor).
    • Inorder predecessor (IOP) =
      • The rightmost node in N's left subtree
      • The largest key in N's left subtree
    • Inorder successor (IOS) =
      • The leftmost node in N's right subtree
      • The smallest key in N's right subtree
  • Copy the item from node M into node N.
  • Delete(BST->LchildPtr (or RchildPtr), M) // Remove M from the BST.
• See Figure 6.32 (p. 251)
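Deletion by copying can be sketched as below, here using the inorder successor (IOS). This is an illustration under assumptions: the `Node` type, `char` keys, the names `deleteKey`, `deleteNode`, `preorder` and `destroy`, the null check for a missing key, and the handling of the leaf and one-child cases are not spelled out in the pseudocode and are filled in with the usual approach. Keys are assumed to be distinct.

```cpp
#include <string>

// Hypothetical node type; field names follow the pseudocode.
struct Node {
    char key;
    Node* LchildPtr = nullptr;
    Node* RchildPtr = nullptr;
    explicit Node(char k) : key(k) {}
};

void insert(Node*& bst, char newKey) {
    if (bst == nullptr) bst = new Node(newKey);
    else if (newKey < bst->key) insert(bst->LchildPtr, newKey);
    else insert(bst->RchildPtr, newKey);
}

bool deleteKey(Node*& bst, char searchKey);   // defined below

// Removes the node that bst points to.
void deleteNode(Node*& bst) {
    Node* old = bst;
    if (bst->LchildPtr == nullptr) {          // leaf, or right child only
        bst = bst->RchildPtr;
        delete old;
    } else if (bst->RchildPtr == nullptr) {   // left child only
        bst = bst->LchildPtr;
        delete old;
    } else {                                  // two children: copy the IOS
        Node* m = bst->RchildPtr;
        while (m->LchildPtr != nullptr) m = m->LchildPtr;   // leftmost node on the right
        bst->key = m->key;                    // copy M's item into N
        deleteKey(bst->RchildPtr, m->key);    // remove M from the right subtree
    }
}

// Returns true if the key was found and removed.
bool deleteKey(Node*& bst, char searchKey) {
    if (bst == nullptr) return false;         // key not in the tree (added check)
    if (searchKey < bst->key) return deleteKey(bst->LchildPtr, searchKey);
    if (searchKey > bst->key) return deleteKey(bst->RchildPtr, searchKey);
    deleteNode(bst);
    return true;
}

// Preorder listing of the keys, used to inspect the resulting shape.
std::string preorder(const Node* bst) {
    if (bst == nullptr) return "";
    return std::string(1, bst->key) + preorder(bst->LchildPtr) +
           preorder(bst->RchildPtr);
}

// Frees every node and resets the pointer to null.
void destroy(Node*& bst) {
    if (bst == nullptr) return;
    destroy(bst->LchildPtr);
    destroy(bst->RchildPtr);
    delete bst;
    bst = nullptr;
}
```

Because M is the leftmost node of the right subtree, it has no left child, so removing it always falls into the first, simple case. Copying the IOP instead works the same way with left and right swapped.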


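Example: Delete Q

[Figure: deletion by copying the IOP. Q's inorder predecessor L is copied into Q's node, and the original L node is then deleted.]

Example: Delete Q

[Figure: deletion by copying the IOS. Q's inorder successor R is copied into Q's node, and the original R node is then deleted.]

Quiz: Delete Q

[Figure: deletion by copying the IOP on the example BST (inorder: B J K L M N P Q R Z). P is copied into Q's node, and the original P node is then deleted.]

Quiz: Delete Q

[Figure: deletion by copying the IOS on the example BST (inorder: B J K L M N P Q R Z). R is copied into Q's node, and the original R node is then deleted.]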

Delete by Merging vs. Delete by Copying

• Delete by Merging …
• Delete by Copying …

Analysis of BST Operations

• The number of comparisons for a search/retrieval, insertion or deletion is
  • the level (depth) of the element in the binary search tree.
• The maximum number of comparisons for a retrieval, insertion or deletion is
  • the height of the binary search tree!

Properties of Binary Search Trees

• What is the minimum number of nodes that a binary search tree of height h can have?
  • h
• The minimum number of nodes that a binary search tree of height h can have is h.

The minimum number of nodes that a binary search tree of height h can have is h.

Proof (by induction on h):
• Base case: $h = 1$:
  • $N = 1 = h$
• Inductive hypothesis:
  • The minimum number of nodes that a binary search tree of height $h = k$, for some $k \ge 1$, can have is $k$.
• Consider $h = k + 1$:
  [Figure: a root with a single subtree of height k, either Tleft or Tright.]
  • $N = 1 + (\text{number of nodes in the subtree with height } k)$
  • $\phantom{N} = 1 + k$
  • $\phantom{N} = k + 1 = h$

Properties of Binary Search Trees

• What is the maximum number of nodes that a binary search tree of height h can have?
  • $2^h - 1$
• The maximum number of nodes that a binary search tree of height h can have is $2^h - 1$.

The maximum number of nodes that a binary search tree of height h can have is $2^h - 1$.

Proof (by induction on h):
• Base case: $h = 1$:
  • $N = 1 = 2^1 - 1$
• Inductive hypothesis:
  • The maximum number of nodes that a binary search tree of height $h = k$, for some $k \ge 1$, can have is $2^k - 1$.
• Consider $h = k + 1$:
  [Figure: a root with subtrees Tleft and Tright, each of height k.]
  • $N = 1 + (\text{number of nodes in the two subtrees with height } k)$
  • $\phantom{N} = 1 + 2(2^k - 1)$
  • $\phantom{N} = 1 + 2^{k+1} - 2$
  • $\phantom{N} = 2^{k+1} - 1$
  • $\phantom{N} = 2^h - 1$

Properties of Binary Search Trees

• N = the number of nodes in a binary search tree.
• h = the height of a binary search tree.

$h \le N \le 2^h - 1$

$\log_2 (N+1) \le h \le N$

• Lower bound: $h = \Omega(\log N)$
• Upper bound: $h = O(N)$

[Figure: plot of h against N, bounded below by log N and above by N.]
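The step from the node-count bounds to the height bounds is short; the following derivation is a sketch that only rearranges the two inequalities stated above, taking logarithms base 2.

```latex
\begin{align*}
N \le 2^h - 1 \;&\Longleftrightarrow\; N + 1 \le 2^h \;\Longleftrightarrow\; \log_2 (N+1) \le h, \\
h \le N \;&\text{ because a tree of height } h \text{ has at least } h \text{ nodes}, \\
\text{hence}\quad & \log_2 (N+1) \le h \le N .
\end{align*}
```

For example, with $N = 7$ the bounds give $3 \le h \le 7$: height 3 is reached by the insertion order 4, 2, 6, 1, 3, 5, 7 and height 7 by the sorted order 1, 2, 3, 4, 5, 6, 7.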

Analysis of Search/Retrieval Operation

• Worst case: O(N)
• Average case: O(log N)

Analysis of Insertion Operation

• Worst case: O(N)
• Average case: O(log N)

Analysis of Deletion Operation

• Worst case: O(N)
• Average case: O(log N)

Analysis of Traversal Operation

• Worst case: O(N)
• Average case: O(N)

Quiz

• What is the maximum number of nodes that a D-ary tree of height h can have?
  • $(D^h - 1)/(D - 1)$
• Prove by induction?
  • ...

The maximum number of nodes that a D-ary tree of height h can have is $(D^h - 1)/(D - 1)$.

Proof (by induction on h):
• Base case: $h = 1$:
  • $N = (D^1 - 1)/(D - 1) = 1$
• Inductive hypothesis:
  • The maximum number of nodes that a D-ary tree of height $h = k$, for some $k \ge 1$, can have is $(D^k - 1)/(D - 1)$.
• Consider $h = k + 1$:
  [Figure: a root with D subtrees T1, T2, ..., TD, each of height k.]
  • $N = 1 + (\text{number of nodes in the } D \text{ subtrees with height } k)$
  • $\phantom{N} = 1 + D \cdot (D^k - 1)/(D - 1)$
  • $\phantom{N} = 1 + (D^{k+1} - D)/(D - 1)$
  • $\phantom{N} = (D - 1 + D^{k+1} - D)/(D - 1)$
  • $\phantom{N} = (D^{k+1} - 1)/(D - 1)$
  • $\phantom{N} = (D^h - 1)/(D - 1)$
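The closed form can be sanity-checked numerically against a level-by-level count, since level i of a full D-ary tree holds D^(i-1) nodes. The function names below are hypothetical, and the sketch assumes D >= 2 and values small enough to fit in a 64-bit integer.

```cpp
// Counts the nodes of a full D-ary tree of height h level by level:
// level 1 has 1 node, and each level has D times as many nodes as the one above.
long long maxNodesByLevels(int D, int h) {
    long long total = 0;
    long long levelSize = 1;
    for (int level = 1; level <= h; ++level) {
        total += levelSize;
        levelSize *= D;
    }
    return total;
}

// Closed form (D^h - 1) / (D - 1); requires D >= 2.
long long maxNodesClosedForm(int D, int h) {
    long long power = 1;
    for (int i = 0; i < h; ++i) power *= D;   // power = D^h
    return (power - 1) / (D - 1);
}
```

With D = 2 the closed form reduces to 2^h - 1, the binary-tree bound proved earlier in these slides.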

Iterators for BST


Design Pattern: The Iterator Pattern

[Figure: a Container and an Iterator.]

• An iterator is an object of an iterator class!

Iterator Operations

• Inequality compare (!=)
• Dereference (*)
• Increment (++)

Overloaded operators!

Types of Iterators for BST

• Pre-order Traversal Iterator
• In-order Traversal Iterator
• Post-order Traversal Iterator
• Level-order Traversal Iterator

Example: Iterators for BST

              J
            /   \
           B     Q
               /   \
              L     R
             / \     \
            K   N     Z
               / \
              M   P

Preorder: J B Q L K N M P R Z
Inorder: B J K L M N P Q R Z
Postorder: B K M P N L Z R Q J

Iterators for BST

    bst bst1;
    bst::inorder_iterator p;
    for (p = bst1.begin(); p != bst1.end(); ++p)
        // Process *p
        cout …