Course objectives: This laboratory course enable students to get practical experience in design, develop, implement, analyze and evaluation/testing of · Asymptotic performance of algorithms. · Linear data structures and their applications such as Stacks, Queues and Lists · Non-Linear Data Structures and their Applications such as Trees and Graphs · Sorting and Searching Algorithms

Exp. 1 Design, Develop and Implement a menu driven Program in C for the following Array operations a. Creating an Array of N Integer Elements b. Display of Array Elements with Suitable Headings c. Inserting an Element (ELEM) at a given valid Position (POS) d. Deleting an Element at a given valid Position(POS) e. Exit. Support the program with functions for each of the above operations. Theory: Array is a collection of elements of the same type. In this program we need to use functions for various operations Create(): Create an array for the size given by the user Display(): Display the elements of the array Insert(): Insert an element at the position given by the user Delete(): Delete an element from the position specified by the user Exit(): Terminate

Algorithm: Step 1: Start. Step 2: Read N value. Step 3: Read Array of N integer elements Step 4: Print array of N integer elements. Step 5: Insert an element at given valid position in an array. Step 6: Delete an element at given valid position from an array. Step 7: Stop.

Exp. 2 Design, Develop and Implement a Program in C for the following operations on Strings a. Read a main String (STR), a Pattern String (PAT) and a Replace String (REP) b. Perform Pattern Matching Operation: Find and Replace all occurrences of PAT in STR with REP if PAT exists in STR. Report suitable messages in case PAT does not exist inSTR Support the program with functions for each of the above operations. Don't use Built-in functions. Theory: Strings are actually one-dimensional array of characters terminated by a null character '\0'. Thus a null-terminated string contains the characters that comprise the string followed by a null C language supports a wide range of built-in functions that manipulate null-terminated strings as follows: strcpy(s1, s2): Copies string s2 into string s1. strcat(s1, s2): Concatenates string s2 onto the end of string s1. strlen(s1): Returns the length of string s1. strcmp(s1, s2): Returns 0 if s1 and s2 are the same; less than 0 if s1s2. strchr(s1, ch): Returns a pointer to the first occurrence of character ch in string s1. strstr(s1, s2): Returns a pointer to the first occurrence of string s2 in string s1 Algorithm: Step 1: Start. Step 2: Read main string STR, pattern string PAT and replace string REP. Step 3: Search / find the pattern string PAT in the main string STR. Step 4: if PAT is found then replace all occurrences of PAT in main string STR with REP string. Step 5: if PAT is not found give a suitable error message. Step 6: Stop.

Exp. 3 Design, Develop and Implement a menu driven Program in C for the following operations on STACK of Integers (Array Implementation of Stack with maximum size MAX) a. Push an Element on to Stack b. Pop an Element from Stack c. Demonstrate how Stack can be used to check Palindrome d. Demonstrate Overflow and Underflow situations on Stack e. Display the status of Stack f. Exit Support the program with appropriate functions for each of the above operations Theory: Stack is a collection of elements of the similar types. It is also called as last in, first out. The element inserted first is the last one to be deleted. It is used for various applications like infix to postfix expression, postfix evaluation and for maintaining stack frames for function calling • push() - pushing (storing) an element on the stack. • pop() - removing (accessing) an element from the stack. To use a stack efficiently we need to check status of stack as well. For the same purpose, the following functionality is added to stacks; • peek() − get the top data element of the stack, without removing it. • isFull() − check if stack is full. • isEmpty() − check if stack is empty. Algorithm: Step 1: Start. Step 2: Initialize stack size MAX and top of stack -1. Step 3: Push integer element on to stack and display the contents of the stack. if stack is full give a message as ‘Stack is Overflow’. Step 3: Pop element from stack along with display the stack contents. if stack is empty give a message as ‘Stack is Underflow’. Step 4: Check whether the stack contents are Palindrome or not. Step 5: Stop

Exp. 4 Design, Develop and Implement a Program in C for converting an Infix Expression to PostfixExpression. Program should support for both parenthesized and free parenthesized expressions with the operators: +, -, *, /, %(Remainder), ^(Power) and alphanumeric operands. Theory: Infix: Operators are written in-between their operands. Ex: X + Y Prefix: Operators are written before their operands. Ex: +X Y postfix: Operators are written after their operands. Ex: XY+ Algorithm: Step 1: Read the infix expression as a string. Step 2: Scan the expression character by character till the end. Repeat the following operations 1. If it is an operand add it to the postfix expression. 2. If it is a left parentheses push it onto the stack. 3. If it is a right parentheses pop out elements from the stack and assign it to the postfix string. Pop out the left parentheses but don’t assign to postfix. Step 3: If it is an operator compare its precedence with that of the element at the top of stack. 1.If it is greater push it onto the stack. 2.Else pop and assign elements in the stack to the postfix expression until you find one such element. Step 4: If you have reached the end of the expression, pop out any leftover elements in the stack till it becomes empty. Step 5: Append a null terminator at the end display the result

Exp. 5 Design, Develop and Implement a Program in C for the following Stack Applications a. Evaluation of Suffix expression with single digit operands and operators: +, -, *, /, %, b. Solving Tower of Hanoi problem with n disks a. EVALUATION OF POSTFIX EXPRESSION Algorithm Step 1: Read the infix expression as a string. Step 2: Scan the expression character by character till the end. Repeat the following operations a. If it is an operand push it onto the stack. b. If it is an operator 1.Pop out two operands from stack 2.Apply the operator onto the popped operands. 3.Store the result back on to the stack Step 3: On reaching the end of expression pop out the contents of the stack and display as the result. b. TOWER OF HANOI The Tower of Hanoi is a mathematical game or puzzle. It consists of three rods, and a number of disks of different sizes which can slide onto any rod. The puzzle starts with the disks in a neat stack in ascending order of size on one rod, the smallest at the top, thus making a conical shape. The objective of the puzzle is to move the entire stack to another rod, obeying the following simple rules: • Only one disk can be moved at a time. • Each move consists of taking the upper disk from one of the stacks and placing it on top of another stack i.e. a disk can only be moved if it is the uppermost disk on a stack. • No disk may be placed on top of a smaller disk. With three disks, the puzzle can be solved in seven moves. The minimum number of moves required to solve a Tower of Hanoi puzzle is 2n - 1, where n is the number of disks Algorithm: Step 1:Move a tower of height-1 to an intermediate pole, using the final pole. Step 2:Move the remaining disk to the final pole. Step 3:Move the tower of height-1 from the intermediate pole to the final pole using the original pole.

Exp. 6 Design, Develop and Implement a menu driven Program in C for the following operations on Circular QUEUE of Characters (Array Implementation of Queue with maximum size MAX) a. Insert an Element on to Circular QUEUE b. Delete an Element from Circular QUEUE c. Demonstrate Overflow and Underflow situations on Circular QUEUE d. Display the status of Circular QUEUE e. Exit Support the program with appropriate functions for each of the above operations Theory: Circular queue is a linear data structure. It follows FIFO principle. In circular queue the last node is connected back to the first node to make a circle.It is also called FIFO structure. Elements are added at the rear end and the elements are deleted at front end of the queue. The queue is considered as a circular queue when the positions 0 and MAX-1 are adjacent. ALGORITHM: Step 1: Start. Step 2: Initialize queue size to MAX. Step 3: Insert the elements into circular queue. If queue is full give a message as ‘queue is overflow” Step 4: Delete an element from the circular queue. If queue is empty give a message as ‘queue is underflow’. Step 5: Display the contents of the queue. Step 6: Stop.

Exp7. Design, Develop and Implement a menu driven Program in C for the following operations on Singly Linked List (SLL) of Student Data with the fields: USN, Name, Branch, Sem, PhNo a. Create a SLL of N Students Data by using front insertion. b. Display the status of SLL and count the number of nodes in it c. Perform Insertion and Deletion at End of SLL d. Perform Insertion and Deletion at Front of SLL e. Demonstrate how this SLL can be used as STACK and QUEUE f. Exit A linked-list is a sequence of data structures which are connected together via links. Linked List is a sequence of links which contains items. Each link contains a connection to another link. Linked list the second most used data structure after array. Following are important terms to understand the concepts of Linked List. Link − Each Link of a linked list can store a data called an element. Next − Each Link of a linked list contain a link to next link called Next. LinkedList − A LinkedList contains the connection link to the first Link called First.

Linked List Representation Linked list can be visualized as a chain of nodes, where every node points to the next node. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

As per above shown illustration, following are the important points to be considered. LinkedList contains an link element called first. Each Link carries a data field(s) and a Link Field called next. Each Link is linked with its next link using its next link. Last Link carries a Link as null to mark the end of the list.

Basic Operations Following are the basic operations supported by a list. Insertion − add an element at the beginning of the list. Deletion − delete an element at the beginning of the list. Display − displaying complete list.

Search − search an element using given key. Delete − delete an element using given key.

Insertion Operation Insertion is a three step process − Create a new Link with provided data. Point New Link to old First Link. Point First Link to this New Link. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

//insert link at the first location void insertFirst(int key, int data){ //create a link struct node *link = (struct node*) malloc(sizeof(struct node)); link->key = key; link->data = data;

//point it to old first node link->next = head; //point first to new first node head = link; }

Deletion Operation Deletion is a two step process − Get the Link pointed by First Link as Temp Link. Point First Link to Temp Link's Next Link. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

//delete first item struct node* deleteFirst(){ //save reference to first link struct node *tempLink = head; //mark next to first link as first head = head->next; //return the deleted link return tempLink; }

Navigation Operation Navigation is a recursive step process and is basis of many operations like search, delete etc. − Get the Link pointed by First Link as Current Link. Check if Current Link is not null and display it. Point Current Link to Next Link of Current Link and move to above step. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

//display the list void printList(){ struct node *ptr = head; printf("\n[ "); //start from the beginning while(ptr != NULL){ printf("(%d,%d) ",ptr->key,ptr->data); ptr = ptr->next; } printf(" ]"); }

Exp. 8

Design, Develop and Implement a menu driven Program in C for the following operations on Doubly Linked List (DLL) of Employee Data with the fields: SSN, Name, Dept, Designation, Sal, PhNo a. Create a DLL of N Employees Data by using end insertion. b. Display the status of DLL and count the number of nodes in it c. Perform Insertion and Deletion at End of DLL d. Perform Insertion and Deletion at Front of DLL e. Demonstrate how this DLL can be used as Double Ended Queue f. Exit Doubly Linked List is a variation of Linked list in which navigation is possible in both ways either forward and backward easily as compared to Single Linked List. Following are important terms to understand the concepts of doubly Linked List Link − Each Link of a linked list can store a data called an element. Next − Each Link of a linked list contain a link to next link called Next. Prev − Each Link of a linked list contain a link to previous link called Prev. LinkedList − A LinkedList contains the connection link to the first Link called First and to the last link called Last.

Doubly Linked List Representation The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

As per above shown illustration, following are the important points to be considered. Doubly LinkedList contains an link element called first and last. Each Link carries a data field(s) and a Link Field called next. Each Link is linked with its next link using its next link. Each Link is linked with its previous link using its prev link. Last Link carries a Link as null to mark the end of the list.

Insertion Operation Following code demonstrate insertion operation at beginning in a doubly linked list. //insert link at the first location void insertFirst(int key, int data) {

//create a link struct node *link = (struct node*) malloc(sizeof(struct node)); link->key = key; link->data = data; if(isEmpty()) { //make it the last link last = link; }else { //update first prev link head->prev = link; } //point it to old first link link->next = head; //point first to new first link head = link; }

Deletion Operation Following code demonstrate deletion operation at beginning in a doubly linked list. //delete first item struct node* deleteFirst() { //save reference to first link struct node *tempLink = head; //if only one link if(head->next == NULL) { last = NULL; }else { head->next->prev = NULL; } head = head->next; //return the deleted link return tempLink; }

Insertion at End Operation Following code demonstrate insertion operation at last position in a doubly linked list. //insert link at the last location void insertLast(int key, int data) { //create a link struct node *link = (struct node*) malloc(sizeof(struct node)); link->key = key; link->data = data;

if(isEmpty()) { //make it the last link last = link; }else { //make link a new last link last->next = link; //mark old last node as prev of new link link->prev = last; } //point last to new last node last = link; }

Exp. 9 Design, Develop and Implement a Program in C for the following operations on Singly Circular Linked List (SCLL) with header nodes a. Represent and Evaluate a Polynomial P(x,y,z) = 6x2y2z-4yz5+3x3yz+2xy5z-2xyz3 b. Find the sum of two polynomials POLY1(x,y,z) and POLY2(x,y,z) and store the result in POLYSUM(x,y,z)

Support the program with appropriate functions for each of the above operations Circular Linked List is a variation of Linked list in which first element points to last element and last element points to first element. Both Singly Linked List and Doubly Linked List can be made into as circular linked list.

Singly Linked List as Circular In singly linked list, the next pointer of the last node points to the first node. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

Doubly Linked List as Circular In doubly linked list, the next pointer of the last node points to the first node and the previous pointer of the first node points to the last node making the circular in both directions. The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

As per above shown illustrations, following are the important points to be considered. Last Link's next points to first link of the list in both cases of singly as well as doubly linked list. First Link's prev points to the last of the list in case of doubly linked list.

Basic Operations Following are the important operations supported by a circular list. insert − insert an element in the start of the list. delete − insert an element from the start of the list. display − display the list.

Insertion Operation Following code demonstrate insertion operation at in a circular linked list based on single linked list. //insert link at the first location void insertFirst(int key, int data) {

//create a link struct node *link = (struct node*) malloc(sizeof(struct node)); link->key = key; link->data= data; if (isEmpty()) { head = link; head->next = head; }else { //point it to old first node link->next = head; //point first to new first node head = link; } }

Deletion Operation Following code demonstrate deletion operation at in a circular linked list based on single linked list. //delete first item struct node * deleteFirst() { //save reference to first link struct node *tempLink = head; if(head->next == head){ head = NULL; return tempLink; } //mark next to first link as first head = head->next; //return the deleted link return tempLink; }

Adding two polynomials are represented using linked lists Representation of a Polynomial: A polynomial is an expression that contains more than two terms. A term is made up of coefficient and exponent. An example of polynomial is P(x) = 4x3+6x2+7x+9 A polynomial thus may be represented using arrays or linked lists. Array representation assumes that the exponents of the given expression are arranged from 0 to the highest value (degree),

which is represented by the subscript of the array beginning with 0. The coefficients of the respective exponent are placed at an appropriate index in the array. The array representation for the above polynomial expression is given below: The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

A polynomial may also be represented using a linked list. A structure may be defined such that it contains two parts- one is the coefficient and second is the corresponding exponent. The structure definition may be given as shown below: struct polynomial { int coefficient; int exponent; struct polynomial *next; }; Thus the above polynomial may be represented using linked list as shown below: The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

Addition of two Polynomials: For adding two polynomials using arrays is straightforward method, since both the arrays may be added up element wise beginning from 0 to n-1, resulting in addition of two polynomials. Addition of two polynomials using linked list requires comparing the exponents, and wherever the exponents are found to be same, the coefficients are added up. For terms with different exponents, the complete term is simply added to the result thereby making it a part of addition result. The complete program to add two polynomials is given in subsequent section. void my_add_poly(my_poly ** result, my_poly * poly1, my_poly * poly2) { my_poly * tmp_node; //Temporary storage for the linked list tmp_node = (my_poly *) malloc(sizeof(my_poly)); tmp_node->next = NULL; *result = tmp_node; //Copy the head address to the result linked list //Loop while both of the linked lists have value while(poly1 && poly2) { if (poly1->pow > poly2->pow) { tmp_node->pow = poly1->pow; tmp_node->coeff = poly1->coeff; poly1 = poly1->next; } else if (poly1->pow < poly2->pow) { tmp_node->pow = poly2->pow;

tmp_node->coeff = poly2->coeff; poly2 = poly2->next; } else { tmp_node->pow = poly1->pow; tmp_node->coeff = poly1->coeff + poly2->coeff; poly1 = poly1->next; poly2 = poly2->next; } //Grow the linked list on condition if(poly1 && poly2) { tmp_node->next = (my_poly *) malloc(sizeof(my_poly)); tmp_node = tmp_node->next; tmp_node->next = NULL; } } //Loop while either of the linked lists has value while(poly1 || poly2) { //We have to create the list at beginning //As the last while loop will not create any unnecessary node tmp_node->next = (my_poly *) malloc(sizeof(my_poly)); tmp_node = tmp_node->next; tmp_node->next = NULL; if(poly1) { tmp_node->pow = poly1->pow; tmp_node->coeff = poly1->coeff; poly1 = poly1->next; } if(poly2) { tmp_node->pow = poly2->pow; tmp_node->coeff = poly2->coeff; poly2 = poly2->next; } } printf("\nAddition Complete"); }

Exp. 10 Design, Develop and Implement a menu driven Program in C for the following operations on Binary Search Tree (BST) of Integers a. Create a BST of N Integers: 6, 9, 5, 2, 8, 15, 24, 14, 7, 8, 5, 2 b. Traverse the BST in Inorder, Preorder and Post Order c. Search the BST for a given element (KEY) and report the appropriate message d. Delete an element(ELEM) from BST e. Exit A binary search tree (BST) is a tree in which all nodes follows the below mentioned properties − The left sub-tree of a node has key less than or equal to its parent node's key. The right sub-tree of a node has key greater than or equal to its parent node's key. Thus, a binary search tree (BST) divides all its sub-trees into two segments; left sub-tree and right sub-tree and can be defined as − left_subtree (keys) ≤ node (key) ≤ right_subtree (keys)

Representation BST is a collection of nodes arranged in a way where they maintain BST properties. Each node has key and associated value. While searching, the desired key is compared to the keys in BST and if found, the associated value is retrieved. An example of BST − The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

We observe that the root node key (27) has all less-valued keys on the left sub-tree and higher valued keys on the right sub-tree.

Basic Operations Following are basic primary operations of a tree which are following. Search − search an element in a tree. Insert − insert an element in a tree. Preorder Traversal − traverse a tree in a preorder manner. Inorder Traversal − traverse a tree in an inorder manner. Postorder Traversal − traverse a tree in a postorder manner.

Node Define a node having some data, references to its left and right child nodes. struct node { int data; struct node *leftChild; struct node *rightChild; };

Search Operation Whenever an element is to be search. Start search from root node then if data is less than key value, search element in left subtree otherwise search element in right subtree. Follow the same algorithm for each node. struct node* search(int data){ struct node *current = root;

printf("Visiting elements: "); while(current->data != data){ if(current != NULL) { printf("%d ",current->data); //go to left tree if(current->data > data){ current = current->leftChild; }//else go to right tree else { current = current->rightChild; } //not found if(current == NULL){ return NULL; } } } return current; }

Insert Operation Whenever an element is to be inserted. First locate its proper location. Start search from root node then if data is less than key value, search empty location in left subtree and insert the data. Otherwise search empty location in right subtree and insert the data. void insert(int data){ struct node *tempNode = (struct node*) malloc(sizeof(struct node)); struct node *current; struct node *parent; tempNode->data = data; tempNode->leftChild = NULL; tempNode->rightChild = NULL; //if tree is empty if(root == NULL){ root = tempNode; }else { current = root; parent = NULL; while(1){ parent = current; //go to left of the tree if(data < parent->data){ current = current->leftChild; //insert to the left if(current == NULL){

parent->leftChild = tempNode; return; } }//go to right of the tree else{ current = current->rightChild; //insert to the right if(current == NULL){ parent->rightChild = tempNode; return; } } } } }

Exp. 11 Design, Develop and Implement a Program in C for the following operations on Graph(G) of Cities a. Create a Graph of N cities using Adjacency Matrix. b. Print all the nodes reachable from a given starting node in a digraph using BFS method c. Check whether a given graph is connected or not using DFS method.

DepthDepth-first search (DFS) There are various ways to traverse (visit all the nodes) of a graph systematically. A couple of these ways (depth-first and breadth-first) give us some information about graph structure (e.g. connectedness). In depth-first search the idea is to travel as deep as possible from neighbour to neighbour before backtracking. What determines how deep is possible is that you must follow edges, and you don't visit any vertex twice. To do this properly we need to keep track of which vertices have already been visited, plus how we got to (the path to...) where we currently are, so that we can backtrack. We could keep track of which nodes were visited in a boolean array, and a stack to push nodes onto that we mean to

visit (the course Readings have a recursive algorithm for DFS which takes a slightly different approach). Here's some pseudocode: DFS(G,v) ( v is the vertex where the search starts ) Stack S := {}; ( start with an empty stack ) for each vertex u, set visited[u] := false; push S, v; while (S is not empty) do u := pop S; if (not visited[u]) then visited[u] := true; for each unvisited neighbour w of u push S, w; end if end while END DFS()

It would probably be useful to keep track of the edges we used to visit the vertices, since these edges would span the vertices visited. One way to do this is with another array predecessor[u] which indicates which vertex was reached from. When we are processing the neighbours of, say, vertex , for each neighbour (say ) of that we push onto the stack, we set predecessor[v] to . Eventually we end up with a tree: an acyclic, connected graph of all the vertices that can be reached from our starting point. The image cannot be display ed . Your c…

The image cannot be display ed . Your c…

The image cannot be display e d. You…

The image cannot be display ed . Your c…

The image cannot be display ed . Your c…

The image cannot be display ed. Your computer may not hav e enough memory to o

What happens if our original graph isn't connected? Then DFS(G,v) won't visit any vertices that aren't connected to its starting point. You'll need an outer loop that iterates over unvisited vertices, and then calls DFS(G,V). The end result is a forest (a collection of trees) representing the connected components of

The image cannot be display ed. Your computer may not hav e enough memory to o

.

Exp. 12 Given a File of N employee records with a set K of Keys(4-digit) which uniquely determine the records in file F. Assume that file F is maintained in memory by a Hash Table(HT) of m memory locations with L as the set of memory addresses (2-digit) of locations in HT. Let the keys in K and addresses in L are Integers. Design and develop a Program in C that uses Hash function H: K →L as H(K)=K mod m (remainder method), and implement hashing technique to map a given key K to the address space L. Resolve the collision (if any) using linear probing.

Direct-address table If the keys are drawn from the reasoning small universe U = {0, 1, . . . , m-1} of keys, a solution is to use a Table T[0, . m-1], indexed by keys. To represent the dynamic set, we use an array, or direct-address table, denoted by T[0 . . m-1], in which each slot corresponds to a key in the universe. Following figure illustrates the approach.

The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

Each key in the universe U i.e., Collection, corresponds to an index in the table T[0 . . m-1]. Using this approach, all three basic operations (dictionary operations) take θ(1) in the worst case.

Hash Tables When the size of the universe is much larger the same approach (direct address table) could still work in principle, but the size of the table would make it impractical. A solution is to map the keys onto a small range, using a function called a hash function. The resulting data structure is called hash table. With direct addressing, an element with key k is stored in slot k. With hashing =, this same element is stored in slot h(k); that is we use a hash function h to compute the slot from the key. Hash function maps the universe U of keys into the slot of a hash table T[0 . . .m-1]. h: U → {0, 1, . . ., m-1} More formally, suppose we want to store a set of size n in a table of size m. The ratio α = n/m is called a load factor, that is, the average number of elements stored in a Table. Assume we have a hash function h that maps each key k U to an integer name h(k) [0 . . m-1]. The basic idea is to store key k in location T[h(k)]. The imag e cann ot

The imag e cann ot

Typical, hash functions generate "random looking" valves. For example, the following function usually works well h(k) = k mod m where m is a prime number. Is there any point of the hash function? Yes, the point of the hash function is to reduce the range of array indices that need to be handled.

Collision As keys are inserted in the table, it is possible that two keys may hash to the same table slot. If the hash function distributes the elements uniformly over the table, the number of conclusions cannot be too large on the average, but the birthday paradox makes it very likely that there will be at least one collision, even for a lightly loaded table The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

A hash function h map the keys k and j to the same slot, so they collide. There are two basic methods for handling collisions in a hash table: Chaining and Open addressing.

Collision Resolution by Chaining When there is a collision (keys hash to the same slot), the incoming keys is stored in an overflow area and the corresponding record is appeared at the end of the linked list.

The image cannot be display ed. Your computer may not hav e enough memory to open the image, or the image may hav e been corrupted. Restart y our computer, and then open the file again. If the red x still appears, y ou may hav e to delete the image and then insert it again.

Each slot T[j] contains a linked list of all the keys whose hash value is j. For example, h(k1) = h(kn) and h(k5) = h(k2) = h(k7). The worst case running time for insertion is O(1). Deletion of an element x can be accomplished in O(1) time if the lists are doubly linked. In the worst case behavior of chain-hashing, all n keys hash to the same slot, creating a list of length n. The worst-case time for search is thus θ(n) plus the time to compute the hash function. keys: 5, 28, 19, 15, 20, 33, 12, 17, 10 slots: 9 hash function = h(k) = k mod 9 h(5) = 5 mod 9 = 4 h(28) = 28 mod 9 = 1 h(19) = 19 mod 9 = 1 h(15) = 15 mod 9 = 6 h(20) = 20 mod 9 = 2 h(33) = 33 mod 9 = 6 h(12) = 12mod 9 = 3 h(17) = 17 mod 9 = 8 h(10) = 10 mod 9 = 1

A good hash function satisfies the assumption of simple uniform hashing, each element is equally likely to hash into any of the m slots, independently of where any other element has hash to. But usually it is not possible to check this condition because one rarely knows the probability distribution according to which the keys are drawn.

In practice, we use heuristic techniques to create a hash function that perform well. One good approach is to derive the hash value in a way that is expected to be independent of any patterns that might exist in the data (division method). Most hash function assume that the universe of keys is the set of natural numbers. Thus, its keys are not natural to interpret than as natural numbers.

Method for Creating Hash Function 1. The division method. 2. The multiplication method. 3. Universal hashing.

1. The Division Method Map a key k into one of m slots by taking the remainder of k divided by m. That is, the hash function is h(k) = k mod m. Example: If table size m = 12 key k = 100 than h(100) = 100 mod 12 =4

Poor choices of m m should not be a power of 2, since if m = 2p, then h(k) is just the p lowestorder bits of k. So, 2p may be a poor choice, because permuting the characters of k does not change value.

Good m choice of m A prime not too close to an exact of 2.

2. The Multiplication Method Two step process Step 1: Multiply the key k by a constant 0< A < 1 and extract the fraction part of kA. Step 2: Multiply kA by m and take the floor of the result.

The hash function using multiplication method is: h(k) = m(kA mod 1) where "kA mod 1" means the fractional part of kA, that is, kA - kA. Advantage of this method is that the value of m is not critical and can be implemented on most computers. A reasonable value of constant A is ≈ (sqrt5 - 1) /2 suggested by Knuth's Art of Programming.

3. Universal Hashing Open Addressing This is another way to deal with collisions. In this technique all elements are stored in the hash table itself. That is, each table entry contains either an element or NIL. When searching for element (or empty slot), we systematically examine slots until we found an element (or empty slot). There are no lists and no elements stored outside the table. That implies that table can completely "fill up"; the load factor α can never exceed 1.Advantage of this technique is that it avoids pointers (pointers need space too). Instead of chasing pointers, we compute the sequence of slots to be examined. To perform

insertion, we successively examine or probe, the hash table until we find an empty slot. The sequence of slots probed "depends upon the key being inserted." To determine which slots to probe, the hash function includes the probe number as a second input. Thus, the hash function becomes h:

The image cannot be display ed. Your compu ter

× {0, 1, . . . m -1 }--> {0, 1, . . . , m-1}

and the probe sequence < h(k, 0), h(k, 1), . . . , h(k, m-1) > in which every slot is eventually considered. Pseudocode for Insertion HASH-INSERT (T, k) i=0 reepeat j for each key k. Uniform hashing required are not met. Since none of these techniques capable of generating more than m2 probe sequences (instead of m!). Uniform Hashing Each key is equally likely to have any of the m! permutation of < 0, 1, . . . , m-1> as its probe sequence. Note that uniform hashing generalizes the notion of simple uniform hashing