CS32 Week 9: Binary Search Tree, Hash Table

Author: Joan Sherman
Yuchen Liu
Email: [email protected]
Office Hours: Wednesday 12:30 ~ 13:30 BH2432, Thursday 11:30 ~ 13:30
http://www.cs.ucla.edu/~yliu/cs32/

Outline
- Binary Search Tree
- Hash Table

Binary Search Tree
- Definition
  - For any node, all nodes in its left subtree have values smaller than the node's value.
  - For any node, all nodes in its right subtree have values larger than the node's value.

Binary Search Tree
- For the same set of data, there exist multiple valid binary search trees.

Binary Search Tree
- A data structure for storing data
  - Pros and cons
  - Operations:
    - Search
    - Insert
    - Delete
    - Traversal

Binary Search Tree
- Search for a value
  - Starting from the root, compare the current node's value with the desired value:
    - If they are equal, the value is found.
    - If the desired value is larger than the current node's value, go check the right child.
    - If the desired value is smaller than the current node's value, go check the left child.
  - If a NULL child is reached, the value is not in the tree.
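The search steps can be sketched in C++ roughly as follows. This is a minimal illustration, assuming the Node layout (m_value, m_left, m_right) that these slides introduce later; the function name search is a hypothetical choice, not something the slides define.

```cpp
#include <cstddef>

// Node layout assumed to match the struct shown later in the slides.
struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// Hypothetical helper: returns the node holding target, or NULL if absent.
Node* search(Node* root, int target) {
    Node* cur = root;
    while (cur != NULL) {
        if (target == cur->m_value)
            return cur;              // equal: found the value
        if (target > cur->m_value)
            cur = cur->m_right;      // larger: go check the right child
        else
            cur = cur->m_left;       // smaller: go check the left child
    }
    return NULL;                     // reached a NULL child: not found
}
```

Each loop iteration moves one level down, so the number of comparisons is bounded by the height of the tree.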

Binary Search Tree
- Search
  - Examples: search for 6, search for 3, search for 9
  - Complexity
    - For a full binary search tree: O(log n)
  - What affects the performance of search?
    - Tree height!

Binary Search Tree
- Search
  - For a fixed-size, sorted set of data:
    - Use binary search on a sorted array
    - Complexity?
      - O(log n)
      - Same as a full binary search tree
  - Why use a binary search tree?
    - More flexibility for modification (insertion/deletion)
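For comparison, binary search on a sorted array might look like the sketch below. The name binarySearch and the convention of returning -1 for "not found" are assumptions made for this example.

```cpp
// Hypothetical helper: returns the index of target in the sorted array,
// or -1 if it is not present.
int binarySearch(const int array[], int size, int target) {
    int lo = 0;
    int hi = size - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // middle of the remaining range
        if (array[mid] == target)
            return mid;
        if (array[mid] < target)
            lo = mid + 1;               // discard the left half
        else
            hi = mid - 1;               // discard the right half
    }
    return -1;
}
```

The lookup halves the range at every step, like descending a full BST, but inserting into the array would require shifting elements, which is where the tree is more flexible.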

Binary Search Tree
- Insertion
  - Put the new value in the correct position
  - First do a search
    - Until we hit a NULL pointer
  - Create a new Node
    - Put it at the position of the NULL pointer
    - Fix the link from its parent

Binary Search Tree
- Insertion
  - Examples: insert 7, insert 0
  - Complexity?
    - Search + insert
    - O(log n) + O(1) = O(log n) for a balanced tree; in general the search part costs O(height)
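A possible iterative version of insertion is sketched below. It assumes the Node layout used elsewhere in these slides; the function name insert and the choice to ignore duplicate values are assumptions for this example.

```cpp
#include <cstddef>

// Node layout assumed to match the struct shown later in the slides.
struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// Hypothetical helper: inserts value and returns the root of the tree.
// Duplicates are ignored (an assumption; the slides do not say).
Node* insert(Node* root, int value) {
    if (root == NULL)
        return new Node(value);              // empty tree: new node is the root
    Node* cur = root;
    while (true) {
        if (value == cur->m_value)
            return root;                     // already present
        if (value < cur->m_value) {
            if (cur->m_left == NULL) {       // hit the NULL pointer: attach here
                cur->m_left = new Node(value);
                return root;
            }
            cur = cur->m_left;
        } else {
            if (cur->m_right == NULL) {
                cur->m_right = new Node(value);
                return root;
            }
            cur = cur->m_right;
        }
    }
}
```

The walk down the tree is the "search" part and the single pointer assignment is the O(1) "insert" part.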

Binary Search Tree
- Insertion
  - Given a sorted array of integers, how do we construct a BST?
    - Insertion in linear order?
    - Insertion in random order?
    - Pick the middle one as the root?

Binary Search Tree
- Challenge:
  - Given a sorted array, build the best BST.
  - Node* buildBST(int array[], int size);
  - Hint: use recursion
  - Hint: design your own recursive helper function

    struct Node
    {
        Node(int val)
        {
            m_value = val;
            m_left = NULL;
            m_right = NULL;
        }
        int m_value;
        Node* m_left;
        Node* m_right;
    };

Binary Search Tree
- Challenge:
  - Given a sorted array, build the best BST.

    // The three-argument helper must be declared before this wrapper.
    Node* buildBST(int array[], int size)
    {
        return buildBST(array, 0, size - 1);
    }

  - Helper: Node* buildBST(int array[], int start, int end)

Binary Search Tree
- Challenge:
  - Given a sorted array, build the best BST.
  - Node* buildBST(int array[], int start, int end)
  - Recursion
    - How to break down the problem?
      - Use the first half to construct the left subtree
      - Use the second half to construct the right subtree
    - How to merge results?
      - Use the middle one to create a node, then link the left and right subtrees
    - Base case?
      - When start > end.

Binary Search Tree
- Challenge:
  - Given a sorted array, build the best BST.

    Node* buildBST(int array[], int start, int end)
    {
        if (start > end)
            return NULL;                      // empty range: no subtree
        int mid = (start + end) / 2;          // middle element becomes the root
        Node* root = new Node(array[mid]);
        Node* left = buildBST(array, start, mid - 1);   // first half
        Node* right = buildBST(array, mid + 1, end);    // second half
        root->m_left = left;
        root->m_right = right;
        return root;
    }

Binary Search Tree
- Traversal
  - Basic tree traversal techniques
    - Pre-order
    - Post-order
    - In-order
  - Which one is special for a BST?
    - In-order traversal produces the values in sorted order.
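As an illustration of why in-order traversal is special for a BST, here is a minimal sketch. The function name inOrder and the use of std::vector to collect the output are assumptions for this example.

```cpp
#include <cstddef>
#include <vector>

// Node layout assumed to match the struct used in these slides.
struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// Hypothetical helper: appends the values of the subtree to out,
// visiting left subtree, then the node, then right subtree.
void inOrder(const Node* root, std::vector<int>& out) {
    if (root == NULL)
        return;
    inOrder(root->m_left, out);     // everything smaller comes first
    out.push_back(root->m_value);   // then the node itself
    inOrder(root->m_right, out);    // then everything larger
}
```

Moving the push_back before the two recursive calls gives pre-order, and moving it after them gives post-order.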

Binary Search Tree
- Deletion
  - Remove one node from the tree
  - The tree must remain a valid BST
    - Values still in order
    - Fix links, move a node if necessary
  - Do a search first to find the node to delete
  - Delete the node
    - 3 cases

Binary Search Tree
- Deletion
  - 3 cases
    - The node is a leaf
    - The node has one child
    - The node has two children

Binary Search Tree
- Deletion
  - The node is a leaf
    - Delete the node
    - Set the parent's child pointer to NULL

Binary Search Tree
- Deletion
  - The node has one child
    - Link the node's parent directly to the node's only child
      - No matter whether it is the left or the right child
    - This preserves the ordering


Binary Search Tree
- Deletion
  - The node has two children
    - Find a replacement: the node whose value is next to the current node's value, from the left or the right
      - The largest one in the left subtree, or
      - The smallest one in the right subtree
      - This node has no children or only one child.
    - Put the replacement's value in the current node
    - Delete the replacement node (use case 1 or 2)
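The three deletion cases could be combined into one recursive function along the lines below. This is a sketch under stated assumptions: the name removeValue is hypothetical, nodes are assumed to be allocated with new, and the replacement is taken from the left subtree (the right subtree would work equally well).

```cpp
#include <cstddef>

// Node layout assumed to match the struct used in these slides.
struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// Hypothetical helper: removes value from the subtree rooted at root
// and returns the new root of that subtree.
Node* removeValue(Node* root, int value) {
    if (root == NULL)
        return NULL;                                   // value not in the tree
    if (value < root->m_value) {
        root->m_left = removeValue(root->m_left, value);
    } else if (value > root->m_value) {
        root->m_right = removeValue(root->m_right, value);
    } else if (root->m_left != NULL && root->m_right != NULL) {
        // Two children: copy in the largest value of the left subtree,
        // then delete that replacement node (it has at most one child).
        Node* repl = root->m_left;
        while (repl->m_right != NULL)
            repl = repl->m_right;
        root->m_value = repl->m_value;
        root->m_left = removeValue(root->m_left, repl->m_value);
    } else {
        // Leaf or one child: hand the only child (or NULL) to the parent.
        Node* child = (root->m_left != NULL) ? root->m_left : root->m_right;
        delete root;
        return child;
    }
    return root;
}
```

Returning the new subtree root lets the caller "fix the link" with a plain assignment, so no separate parent pointer is needed.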

Binary Search Tree
- Height of a BST?
  - int getHeight(Node* pRoot) { }
  - The height of a tree is the deepest level among all leaf nodes.
  - Hint: use recursion

Binary Search Tree
- Height of a BST?

    int getHeight(Node* proot)
    {
        if (proot == NULL)
            return 0;
        int leftheight = getHeight(proot->m_left);
        int rightheight = getHeight(proot->m_right);
        if (leftheight > rightheight)
            return leftheight + 1;
        else
            return rightheight + 1;
    }
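To connect tree height back to the earlier question of how to build a BST from a sorted array, the sketch below compares two construction orders. buildBST and getHeight follow the versions in these slides; the recursive insert helper is a hypothetical addition for this comparison.

```cpp
#include <cstddef>

struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// As in the slides: the middle element becomes the root of each range.
Node* buildBST(int array[], int start, int end) {
    if (start > end)
        return NULL;
    int mid = (start + end) / 2;
    Node* root = new Node(array[mid]);
    root->m_left = buildBST(array, start, mid - 1);
    root->m_right = buildBST(array, mid + 1, end);
    return root;
}

// As in the slides: an empty tree has height 0.
int getHeight(Node* proot) {
    if (proot == NULL)
        return 0;
    int leftheight = getHeight(proot->m_left);
    int rightheight = getHeight(proot->m_right);
    return (leftheight > rightheight ? leftheight : rightheight) + 1;
}

// Hypothetical helper: plain BST insertion, duplicates ignored.
Node* insert(Node* root, int value) {
    if (root == NULL)
        return new Node(value);
    if (value < root->m_value)
        root->m_left = insert(root->m_left, value);
    else if (value > root->m_value)
        root->m_right = insert(root->m_right, value);
    return root;
}
```

For the sorted input 1..7, picking the middle element gives a tree of height 3, while inserting the values one by one in sorted order gives a chain of height 7 that behaves like a linked list.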

Binary Search Tree
- Find the max in a BST

    int GetMax(Node* pRoot)
    {
        if (pRoot == NULL)
            return -1;                       // empty tree
        while (pRoot->m_right != NULL)       // the max is the rightmost node
            pRoot = pRoot->m_right;
        return pRoot->m_value;
    }

Binary Search Tree
- Find the max in a binary tree
  - int GetMax(Node* proot) { }

Binary Search Tree

    // Assumes proot is not NULL. Without BST ordering, both subtrees must be checked.
    int GetMax(Node* proot)
    {
        int max = proot->m_value;
        if (proot->m_left != NULL)
        {
            int leftmax = GetMax(proot->m_left);
            if (leftmax > max)
                max = leftmax;
        }
        if (proot->m_right != NULL)
        {
            int rightmax = GetMax(proot->m_right);
            if (rightmax > max)
                max = rightmax;
        }
        return max;
    }

Balanced Binary Search Tree
- Balanced
  - For any node, the heights of its left and right subtrees differ by at most 1.
- Height (depth) affects search performance for a BST.
  - A balanced tree keeps the height close to the minimum possible, O(log n).
  - Rotation techniques.
  - AVL tree, etc.
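The balance condition can be checked directly from its definition, as in the sketch below. The name isBalanced is hypothetical, and this straightforward version recomputes heights at every node, so it can take O(n^2) time on a skewed tree; it is meant to illustrate the definition, not to be efficient.

```cpp
#include <cstddef>
#include <cstdlib>

// Node layout assumed to match the struct used in these slides.
struct Node {
    Node(int val) : m_value(val), m_left(NULL), m_right(NULL) {}
    int m_value;
    Node* m_left;
    Node* m_right;
};

// Height as defined in the slides: an empty tree has height 0.
int getHeight(const Node* root) {
    if (root == NULL)
        return 0;
    int left = getHeight(root->m_left);
    int right = getHeight(root->m_right);
    return (left > right ? left : right) + 1;
}

// Hypothetical helper: true if, at every node, the left and right
// subtree heights differ by at most 1.
bool isBalanced(const Node* root) {
    if (root == NULL)
        return true;
    int diff = std::abs(getHeight(root->m_left) - getHeight(root->m_right));
    return diff <= 1 && isBalanced(root->m_left) && isBalanced(root->m_right);
}
```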

Hash table
- Hash function
  - Takes a key and maps it to a number
    "Carey" -> H(x) -> 4531
  - Basic requirement:
    - For the same key, always produce the same value.
  - A better hash function:
    - Spreads out the values: two different keys are likely to result in different hash values.
    - Computes each value quickly.
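One simple way to hash a string key is sketched below. The function name hashKey, the multiplier 31, and the reduction modulo the bucket count are all assumptions chosen for illustration; this is not the function that produced the 4531 in the example above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: polynomial hash of the characters, reduced to a
// bucket index in [0, numBuckets). Assumes numBuckets > 0.
std::size_t hashKey(const std::string& key, std::size_t numBuckets) {
    std::size_t h = 0;
    for (std::size_t i = 0; i < key.size(); i++)
        h = h * 31 + static_cast<unsigned char>(key[i]);   // mix in each character
    return h % numBuckets;
}
```

The function is deterministic, so the same key always maps to the same value, and it runs in time proportional to the key length. How well it spreads values out depends on the keys and on the bucket count.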

Hash table
- If we have a perfect hash function:
  - H(x) maps each key into an integer range such as [0, 10000]
  - Different keys always result in different hash values.
- Then we could use the hash value as the storage location for the data, which supports fast retrieval.


Hash table
- Time complexity:
  - Insert
    - O(1)
  - Delete
    - O(1)
  - Search
    - Compute the hash value for the key
    - Go to the memory location
    - O(1)

Hash table
- But there is no perfect hash function
  - There exist cases where two different keys result in the same hash value.
  - This is called a "collision".
  - Typical hash function:
    - mod by a large prime number
  - Design a way to resolve hash collisions
    - Closed: linear probing
    - Open

Hash table
- Closed hash table
  - Linear probing
  - Solution: store the value in the next available slot, starting from the desired position.


Hash table
- Closed hash table (linear probing)
  - Search:
    - Compute the hash value
    - Scan linearly from that slot until the key or an empty slot is found
    - Nearly O(1), depending on the load factor
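Insertion and search with linear probing might be sketched as below. The class name ClosedHashTable, the restriction to non-negative int keys, the use of -1 as the empty marker, and key % capacity as the hash function are all assumptions for this example. Deletion is left out on purpose.

```cpp
#include <cstddef>
#include <vector>

// Marker for an unused slot (assumes keys are non-negative).
const int EMPTY_SLOT = -1;

// Hypothetical closed hash table of non-negative ints. Assumes capacity > 0.
class ClosedHashTable {
public:
    explicit ClosedHashTable(std::size_t capacity) : m_slots(capacity, EMPTY_SLOT) {}

    // Stores key in the first free slot at or after its hash position.
    // Returns false if the table is full.
    bool insert(int key) {
        std::size_t start = hash(key);
        for (std::size_t i = 0; i < m_slots.size(); i++) {
            std::size_t pos = (start + i) % m_slots.size();   // wrap around
            if (m_slots[pos] == EMPTY_SLOT || m_slots[pos] == key) {
                m_slots[pos] = key;
                return true;
            }
        }
        return false;
    }

    // Scans from the hash position until the key or an empty slot is found.
    bool contains(int key) const {
        std::size_t start = hash(key);
        for (std::size_t i = 0; i < m_slots.size(); i++) {
            std::size_t pos = (start + i) % m_slots.size();
            if (m_slots[pos] == EMPTY_SLOT)
                return false;
            if (m_slots[pos] == key)
                return true;
        }
        return false;
    }

private:
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % m_slots.size();
    }
    std::vector<int> m_slots;
};
```

With a capacity of 7, the keys 3, 10 and 17 all hash to slot 3, so they end up in slots 3, 4 and 5. The fuller the table, the longer these probe runs become, which is why performance depends on the load factor.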

Hash table
- Closed hash table (linear probing)
  - Problem:
    - Deletion
    - Hard to maintain data integrity for the hash table: emptying a slot can cut off a later search that needs to probe past it.

Hash table
- Open hash table
  - Use a linked list for each hash value (bucket)
  - Maintain the linked list to handle collisions
    - Insert: append a new node
    - Delete: delete a node from the linked list


Hash table
- Open hash table
  - Search:
    - Find the corresponding bucket
    - Traverse the linked list to find the item
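An open hash table with one linked list per bucket could be sketched as follows. The class name OpenHashTable, the int keys, the key % numBuckets hash, and the use of std::list for the buckets are assumptions for this example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical open hash table of non-negative ints. Assumes numBuckets > 0.
class OpenHashTable {
public:
    explicit OpenHashTable(std::size_t numBuckets) : m_buckets(numBuckets) {}

    // Insert: append a new node to the bucket's list (duplicates are skipped).
    void insert(int key) {
        if (!contains(key))
            m_buckets[bucketOf(key)].push_back(key);
    }

    // Search: find the bucket, then traverse its linked list.
    bool contains(int key) const {
        const std::list<int>& bucket = m_buckets[bucketOf(key)];
        for (std::list<int>::const_iterator it = bucket.begin(); it != bucket.end(); ++it)
            if (*it == key)
                return true;
        return false;
    }

    // Delete: remove the node from the bucket's linked list.
    void remove(int key) {
        m_buckets[bucketOf(key)].remove(key);
    }

private:
    std::size_t bucketOf(int key) const {
        return static_cast<std::size_t>(key) % m_buckets.size();
    }
    std::vector<std::list<int> > m_buckets;
};
```

Deleting from a bucket only touches that bucket's list, so it does not disturb other keys the way emptying a slot does under linear probing.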

Hash table
- Complexity analysis
  - Desired performance:
    - Insertion, deletion, search: O(1)
  - Collisions ruin the wish
    - Insertion, deletion and search take longer
    - But still approximately O(1)
  - Depends on the load factor and on how frequently the hash function produces collisions.
    - Generally, an open hash table performs better than a closed hash table using linear probing.

Hash table
- Compared with Binary Search Tree

                      Hash table                                      Binary Search Tree
    Speed             O(1)                                            O(log n)
    Max size          Closed: limited by array size; Open: unlimited  Unlimited
    Space efficiency  Wastes a lot of memory                          Only the memory needed
    Ordering          No ordering (random)                            Sorted

Hash table
- To keep good performance, keep the load factor low.
  - This wastes memory.
  - It is a tradeoff between space and speed.

Big-O for compound STL
- Review this on your own

Reminder
- Homework 4 is on the way
  - Warm-ups
- Final is on Saturday next week.
- Course evaluation due next Saturday.

Thank you! Questions? Slides available at http://www.cs.ucla.edu/~yliu/cs32/
