Domatic number 1. Greedy Algorithms. Guy Kortsarz

Domatic number






Greedy Algorithms

Guy Kortsarz










Greedy algorithm
Say, for example, that we want to find a "good" subset.
A greedy algorithm goes over the elements and, at every moment, chooses the element that seems best right now.
It seems unlikely that such a "local" strategy would work, but it does for many problems.
Showing that a greedy algorithm is optimal is always done the same way:
Let A be the set of elements chosen so far. We need to prove that there exists some optimal solution OPT such that A ⊆ OPT.
A greedy algorithm never deletes a choice, so one wrong move and you are done. Like in "Riding in Cars with Boys".








Coin Exchange
Input: An integer n and the values of the legal coins.
Output: A collection of coins that adds up to n, using as few coins as possible.
We always assume that there is a coin of value 1, so that a solution always exists.
1. R ← n; L ← ∅.
2. While R > 0 do
   (a) Let c be the largest coin value such that c ≤ R.
   (b) Add c to the solution L.
   (c) R ← R − c.
3. Output L.
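The coin exchange loop can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name `greedy_change` and the list-based representation of the coin system are assumptions made for the example.

```python
def greedy_change(n, coins):
    """Return the coins picked by the greedy rule for the amount n.

    `coins` is a list of legal coin values; it must contain 1 so that
    every amount can be reached.
    """
    if 1 not in coins:
        raise ValueError("a coin of value 1 is required")
    remaining = n      # R in the pseudocode
    chosen = []        # L in the pseudocode
    # Going through the values from largest to smallest and taking each
    # one while it still fits is the same as always picking the largest
    # coin c with c <= R.
    for c in sorted(coins, reverse=True):
        while c <= remaining:
            chosen.append(c)
            remaining -= c
    return chosen


print(greedy_change(66, [1, 5, 10, 25]))  # [25, 25, 10, 5, 1]
```

With the coins 1, 5, 10, 25 and n = 66, the sketch picks 25 twice, then 10, 5 and 1.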










Example
n = 66.
We choose 25 twice. What remains is 16.
We choose 10 once. What remains is 6.
We choose 5 once, and then 1 once.








A proof that it works for American coins
Say the American coins are 1, 5, 10, 25. An optimal solution obeys some rules:
1. There are at most four 1s, as otherwise we replace five of them with a 5.
2. The 1s and 5s sum to at most 9, as otherwise there are two 5s, which we replace by a 10. Thus there is at most one 5.
3. The 1s, 5s and 10s sum to at most 24, as otherwise we can replace some of them by a 25 and use fewer coins.
4. This implies that the greedy algorithm is correct. For example, as long as x ≥ 25 we must take a 25, because the other coins sum to at most 24. The same holds for taking 10 and taking 5.








But does it work in Sweden?
For the sake of example, let the coins be 1, 5, 11, and say that we want to reach 15.
The greedy algorithm takes one 11 and four 1s: five coins.
The optimum takes 5 three times: three coins.
The greedy algorithm does not work for every set of coins.
It works, for example, if all coin values are powers of some constant.
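To check a coin system by hand, one can compare the greedy count with the true optimum. The sketch below computes the optimum with a small dynamic programming table, which is a technique brought in for the comparison and is not part of the slides. The names `greedy_count` and `optimal_count` are hypothetical.

```python
def greedy_count(n, coins):
    """Number of coins the greedy rule uses for the amount n."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += n // c   # take the coin c as often as it fits
        n %= c
    return count


def optimal_count(n, coins):
    """Fewest coins that add up to n, by dynamic programming."""
    INF = float("inf")
    best = [0] + [INF] * n   # best[a] = fewest coins adding up to a
    for amount in range(1, n + 1):
        for c in coins:
            if c <= amount and best[amount - c] + 1 < best[amount]:
                best[amount] = best[amount - c] + 1
    return best[n]


print(greedy_count(15, [1, 5, 11]), optimal_count(15, [1, 5, 11]))  # 5 3
```

For the coins 1, 5, 11 and the amount 15 the two counts differ (5 versus 3), while for 1, 5, 10, 25 and the amount 66 both are 5.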








Max independent set in a tree
In a rooted tree we say that a set of vertices is independent if no vertex and its parent both belong to the set.
Say that we need to choose whom to invite to a party, but v and its parent P(v) cannot be invited together (because they hate each other).
We may assume the following simple property: every leaf in the tree belongs to the independent set.
Otherwise, let ℓ be a leaf that is not in an optimal independent set. Why is it not in the set? P(ℓ) must be there. Remove P(ℓ) and add ℓ: the set is still independent and has the same size, so it is still optimal.
Thus this works: put all leaves in, remove all their parents, and recurse.
The running time is O(n).
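A possible linear-time sketch of the leaf rule is given below. It assumes the tree is given as a parent array with a single root marked by None (an input format chosen for the example), and it uses an equivalent bottom-up formulation: a vertex is taken exactly when none of its children was taken, so every leaf is taken and every parent of a taken vertex is skipped.

```python
def max_independent_set(parent):
    """parent[v] is the parent of vertex v, or None for the root."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for v, p in enumerate(parent):
        if p is None:
            root = v
        else:
            children[p].append(v)

    # Breadth-first order from the root; read backwards, every child
    # comes before its parent.
    order = [root]
    for v in order:
        order.extend(children[v])

    chosen = [False] * n
    for v in reversed(order):
        # A leaf has no children, so it is always taken.
        chosen[v] = not any(chosen[c] for c in children[v])
    return [v for v in range(n) if chosen[v]]


# A path 0 - 1 - 2 - 3 rooted at 0: the leaf 3 and the vertex 1 are taken.
print(max_independent_set([None, 0, 1, 2]))  # [1, 3]
```

Each vertex is looked at once as a child and once as a parent, so the sketch runs in O(n) time.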








An example of a run of the algorithm
[Figure: a rooted tree whose vertices are labeled A to Z.]










What remains
[Figure: the remaining tree on the vertices C, Y, I, W, Z, V.]










Activity selection
We have a factory and a machine. Clients ask for maintenance starting at a certain time, and we know how long each job takes.
This defines an interval [s_i, f_i] for job i: s_i is the time the customer asked to start, and f_i − s_i is the time the maintenance of job i takes from start to finish.
Say that we have an interval [2, 5], and that another client's interval is [4, 6]. Since there is only one machine, we cannot choose both of them.








Problem definition continued
Two jobs that can be selected together are called independent jobs.
An independent set is a collection of jobs such that every two are independent.
If [s_i, f_i] and [s_j, f_j] are independent then either f_i ≤ s_j or f_j ≤ s_i. Equality still means independence.
Every job pays a dollar.
The problem:
Input: A collection of intervals [s_i, f_i], i = 1, ..., k.
Objective: Find a maximum-size independent set.










An example of an input for the activity selection problem
[Figure: intervals labeled A to I drawn on a time line.]






A good rule of thumb?
Some rules do not work, such as: choose the shortest interval, remove the conflicting intervals, and iterate.
"Choose the interval that intersects the minimum number of intervals" does not work either. The counterexample is a bit big.
Also does not work: choose the interval that starts first, remove its neighbors, and iterate.








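Rules that do not work
[Figure: choosing the shortest interval does not work.]
For example, with the intervals [1, 5], [4, 6], [5, 10], the shortest interval [4, 6] conflicts with the other two, which are independent of each other.
[Figure: choosing the interval that starts first does not work.]
For example, with the intervals [0, 10], [1, 2], [3, 4], the interval [0, 10] starts first but conflicts with the other two.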










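The reason it works
The rule that works: choose the interval that ends first, remove its neighbors, and iterate.
[Figure: the interval Y that ends first, and two intervals X and Z of OPT, with X before Z.]
Let Y be the interval that ends first, and suppose that Y is not in OPT. Let X be the first interval of OPT that Y intersects, and let Z be a later interval of OPT.
Y cannot intersect Z, for otherwise X would end before Y.
This means that Y intersects exactly one interval in OPT.
Thus there is an OPT containing Y: remove X from the OPT above and add Y.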










New witness
Let OPT_1 be the witness that contains S, the set of intervals chosen so far.
Set OPT_2 ← OPT_1 + Y − X.
We removed one interval and added another one, hence OPT_2 is still optimum, and it contains Y.
By the same reasoning, at every stage we can add the interval that ends first among the remaining intervals L.
Hence the algorithm is optimal.
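The "ends first" rule can be sketched in a few lines of Python. The function name `select_activities` and the representation of an interval as a pair (s, f) are assumptions made for this illustration. Sorting by finishing time and skipping every interval that starts before the last chosen one ends has the same effect as repeatedly taking the interval that ends first and removing its neighbors.

```python
def select_activities(intervals):
    """Return a maximum-size set of pairwise independent intervals.

    Each interval is a pair (s, f). Two intervals are independent when
    one finishes no later than the other starts (equality is allowed).
    """
    chosen = []
    last_finish = float("-inf")
    # Scan the intervals by increasing finishing time.
    for s, f in sorted(intervals, key=lambda iv: iv[1]):
        if s >= last_finish:      # independent of everything chosen so far
            chosen.append((s, f))
            last_finish = f
    return chosen


print(select_activities([(2, 5), (4, 6)]))           # [(2, 5)]
print(select_activities([(1, 5), (4, 6), (5, 10)]))  # [(1, 5), (5, 10)]
```

The sort dominates the running time, so the sketch takes O(k log k) time for k intervals.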










Gas pumping
Say that k cars arrive at the same time at a gas station that has only one pump.
Say that every car i has a filling time s_i.
If the cars are treated in the order 1, 2, ..., k then car i has to wait $F_i = \sum_{j=1}^{i-1} s_j$.
Give a greedy algorithm that minimizes the sum of waiting times $\sum_{i=1}^{k} F_i$.
Any suggested strategy?
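One natural candidate, stated here as a suggestion rather than as the answer given in the lecture, is to serve the cars in order of increasing filling time. The sketch below computes the total waiting time of an order and, on a small input, compares the shortest-first order with a brute-force search over all orders. The function names are hypothetical.

```python
from itertools import permutations


def total_waiting(times):
    """Sum of the waiting times F_i when cars are served in the given order."""
    waited = 0
    elapsed = 0            # time at which the next car starts filling
    for s in times:
        waited += elapsed  # this car waited for all the cars before it
        elapsed += s
    return waited


def shortest_first(times):
    """Candidate greedy order: the shortest filling time goes first."""
    return sorted(times)


times = [3, 1, 2]
print(total_waiting(times))                                   # 7
print(total_waiting(shortest_first(times)))                   # 4
print(min(total_waiting(order) for order in permutations(times)))  # 4
```

The filling time of the car served first is counted in the waiting time of all the others, the second one in all but two, and so on, which is why short jobs are good candidates for the early positions.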










Huffman trees
We represent, say, the English alphabet: 26 letters. But also a versus A, etc., so 52.
The characters '0', '1', ..., '9' (not the numbers).
And many other symbols such as &, %, #, $ and so on.
There exists a standard code, the ASCII code: American Standard Code for Information Interchange.
Each character takes 8 bits, but only 7 of them carry the code, so we may represent up to 128 characters.
The eighth bit was first designed to be a check bit (this is why 128 and not 256). The code was later extended to 256 characters.








Some examples
00000000 is the null character.
00000001 is the start of a header.
00000011 is a character that marks the end of text.
The letters a to z are 01100001 to 01111010, in a contiguous way.
In many languages you can compute 'b' − 'a' = 1.
The capital letters start at A = 01000001 and are then contiguous.
An extension known as Unicode exists, with 2^16 > 65000 characters.
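These bit patterns are easy to check in Python, where `ord` gives the code of a character and `format(..., "08b")` writes it with 8 binary digits. The helper name `to_bits` is only for this illustration.

```python
def to_bits(ch):
    """Return the 8-bit binary string of the code of a single character."""
    return format(ord(ch), "08b")


print(to_bits("a"), to_bits("z"))  # 01100001 01111010
print(to_bits("A"))                # 01000001
print(ord("b") - ord("a"))         # 1
```

In Python characters are not numbers, so the difference is taken between the codes returned by `ord`.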








Compressing: Huffman
Compression (saving space) is important.
For example, in a DNA sequence, instead of writing AAAAAAAA, write A8.
But more frequent is the need to compress texts.
Letters do not have the same frequency. Clearly A, a appear more often than X, x.
Hence, to save space, we should give frequent letters shorter codes.
Every text may be different (the frequency of letters may vary), and so sometimes you compress only after computing the frequencies.
The optimal compression method according to frequencies is due to Huffman.






The difficulty with varying sizes
How would the computer know when a letter ends?
It will have a table with the codes of the letters.
But what happens if A = 001 and B = 0011? How can the computer tell them apart?
Rule: the code of a letter is never the beginning (prefix) of the code of another letter.
To do this we use hierarchical trees.










[Figure: a binary code tree with edges labeled 0 and 1 and leaves I, E, A, O, B, X, Z.]










Reading the codes
The path from the root to a leaf is the code of that leaf.
I = 00, E = 010, A = 011, O = 10, B = 110, X = 1110, Z = 1111
The idea: frequent letters should get short codes.
What is frequent depends on the specific text.
This is often not enough, as the application is typically "on-line" (in order).
Lempel-Ziv: a well-known on-line compression method.
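The prefix rule is what lets a reader split a bit string back into letters without separators. The sketch below uses the code table I = 00, E = 010, A = 011, O = 10, B = 110, X = 1110, Z = 1111. The function names are hypothetical, and a dictionary lookup stands in for walking down the tree.

```python
CODES = {"I": "00", "E": "010", "A": "011", "O": "10",
         "B": "110", "X": "1110", "Z": "1111"}


def is_prefix_free(codes):
    """True if no code word is the beginning of another code word."""
    words = sorted(codes.values())
    # In sorted order a word that is a prefix of any other word is also
    # a prefix of its immediate successor, so adjacent pairs suffice.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    by_code = {code: letter for letter, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:     # reached a leaf: one letter is complete
            out.append(by_code[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return "".join(out)


print(is_prefix_free(CODES))                        # True
print(is_prefix_free({"A": "001", "B": "0011"}))    # False
print(encode("BOA", CODES))                         # 11010011
print(decode("11010011", CODES))                    # BOA
```

The decoder emits a letter as soon as the bits read so far match a code word, which is only safe because no code word is a prefix of another.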








A systematic way to create the tree
The tree is built from the leaves up.
Each time, choose the two (super)vertices with the lowest frequency and make them siblings.
This is a greedy algorithm, as it goes for the two lowest.
Making two letters siblings means that their codes are equal up to their parent (they differ only in the last bit).
Which two letters should we "unite" first? Clearly, the two least frequent.
If A and B are united, create a virtual letter called AB with frequency equal to the sum of their frequencies.
Then continue in the same way.










A specific input
Note: the numbers sum to 1.
f(X) = 0.05, f(Y) = 0.05, f(A) = 0.1, f(Z) = 0.2, f(B) = 0.2, f(C) = 0.4
What is the optimal code?
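The bottom-up construction can be sketched with a priority queue. This is an illustration under a few assumptions: the function name `huffman_codes` is hypothetical, the frequencies are written in percent (5, 5, 10, 20, 20, 40) to keep the arithmetic exact, and each supervertex is stored as a table of partial codes instead of an explicit tree node.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return a dict letter -> bit string, merging the two lowest weights each time."""
    tie = count()  # tie-breaker, so that the heap never has to compare dicts
    heap = [(w, next(tie), {letter: ""}) for letter, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)    # lowest weight
        w1, _, right = heapq.heappop(heap)   # second lowest weight
        # Make the two supervertices siblings: one more bit in front of
        # every code below them.
        merged = {letter: "0" + code for letter, code in left.items()}
        merged.update({letter: "1" + code for letter, code in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]


freq = {"X": 5, "Y": 5, "A": 10, "Z": 20, "B": 20, "C": 40}
codes = huffman_codes(freq)
cost = sum(freq[letter] * len(codes[letter]) for letter in freq)
print(cost / 100)  # 2.3 bits per letter on average
```

Several weights are tied in this input, so the tree produced depends on how ties are broken. Different tie-breaking gives different codes, but the average code length is the same.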










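The resulting tree
[Figure: the resulting tree, with every edge labeled 0 or 1. The root XYAZBC (1) has children XYAZB (0.6) and C (0.4). XYAZB has children XYA (0.2) and ZB (0.4). XYA has children XY (0.1) and A (0.1). XY has children X (0.05) and Y (0.05). ZB has children Z (0.2) and B (0.2).]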


