Error correction codes I

Not in textbook :(

First, back to Lempel-Ziv(-Welch)

We went over a couple of examples of compressing with LZW

• I said decompressing is pretty much the same
• Just follow along and build the dictionary
• There’s one complication

Decoding 00000000011110000010, starting from the initial dictionary:

    a   00000
    b   00001
    ...
    y   11001
    z   11010

Reading the input 5 bits at a time gives the codes 00000, 00001, 11100, 00010:

    code    output    dictionary entries
    00000   a         a? pending at 11100
    00001   b         a? completed to ab; b? pending at 11101
    11100   ab        b? completed to ba; ab? pending at 11110
    00010   c         ab? completed to abc; c? pending at 11111

Decoded message: ababc

• Each new entry stays pending (a?, b?, ab?, ...) because its last character is the first character of whatever gets decoded next
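A minimal LZW decoder sketch in plain Python that reproduces this walkthrough. The function name is mine, and starting new entries at code 28 (11100) is an assumption chosen to mirror the table above; the else branch is where the complication lives, when a code refers to the entry that is still pending:

    def lzw_decode(bits, code_width=5):
        # Initial dictionary: 'a' = 0, 'b' = 1, and so on through 'z'
        table = {i: chr(ord('a') + i) for i in range(26)}
        next_code = 28  # first free code shown in the walkthrough (11100)

        codes = [int(bits[i:i + code_width], 2)
                 for i in range(0, len(bits), code_width)]

        prev = table[codes[0]]
        out = [prev]
        for code in codes[1:]:
            if code in table:
                entry = table[code]
            else:
                # The complication: the code names the entry we are still
                # building ("prev + ?"), so its unknown last character has
                # to be the first character of prev.
                entry = prev + prev[0]
            out.append(entry)
            table[next_code] = prev + entry[0]  # complete the pending entry
            next_code += 1
            prev = entry
        return ''.join(out)

    print(lzw_decode("00000000011110000010"))  # -> ababc

Run on the 20-bit input above, this prints ababc and fills in ab, ba, abc at codes 11100, 11101, 11110, matching the table.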

Voyager probes

• Launched in 1977 (and still operational)
• Voyager 2 is about 90 AU (12 light-hours) away; Voyager 1 is about 110 AU (15 light-hours) away
• Signal strengths are weak (bandwidth is small)
• Latency is enormous

Retransmission: Earth sends “I love you!” with checksum 7. Noise turns the message into “I hate you! 7”; the checksum no longer matches (“7 doesn’t make sense!”), so the receiver asks for a resend (“Wait what?”), and every resend costs a full round trip.

Pre-emptive retransmission: Earth sends “I love you! 7” five times in a row. Noise mangles the copies into “Backstreet’s back, alright! 7”, “I am your father! 7”, “I hate you! 7”, “I love you! 7”, and “Yo momma so fat! 7”. The receiver keeps the one copy whose checksum matches: “I love you! 7”.

Error correction

    Approach                     Positives                            Negatives
    Retransmission               Little redundancy if little noise    Latency
    Pre-emptive retransmission   Little latency                       Enormous redundancy
    Codes                        Little redundancy, little latency    More computation

Shannon’s theorem (the first) revisited

• Remember Shannon’s theorem? B is bandwidth, S/N is signal-to-noise ratio, C is capacity:

$$C = B \log_2\!\left(1 + \frac{S}{N}\right)$$

(the S/N ratio is often quoted in decibels, i.e., as 10 log₁₀(S/N))

• Shannon’s theorem says something about digital noise as well
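As a quick sanity check on the formula, here is a small sketch (shannon_capacity is my own name; the 3 kHz channel and the S/N values are made-up illustrative numbers):

    import math

    def shannon_capacity(bandwidth_hz, snr):
        # Shannon-Hartley: C = B * log2(1 + S/N), in bits per second
        return bandwidth_hz * math.log2(1 + snr)

    for snr in (1, 10, 100, 1000):
        print(f"S/N = {snr:4}: C = {shannon_capacity(3000, snr):7.0f} bits/s")

Note the diminishing returns: multiplying S/N by 10 only adds a constant amount of capacity.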

Shannon’s theorem part 2

• C is the channel capacity (ignoring noise)
• For any ε > 0 and rate R < C, there exists a (possibly large) N such that we can send information at rate R with error ε using codes of length N
• For R > C, it is hopeless to get ε close to 0 (but the theorem does let us get it lower than the noise level)

Shannon’s theorem implications

• Shannon’s theorem says nothing about the possibility of ε = 0, only ε being arbitrarily close to 0
• Intuitively we can see that we can never get ε = 0
• No matter how noisy a channel gets (even 99.99% noise), we can get ε as close to 0 as we like

Shannon’s theorem implications

• Shannon’s theorem has a non-constructive proof
• For a given noise level and a given ε, determining codes is undecidable in general :(
• Good news: we’ve already discovered some kick-ass error correction codes

Hamming codes

• Richard Hamming worked with Claude Shannon at Bell Labs after WWII
• The first error correcting code (“Hamming Code”) was published in 1950

Hamming(7, 4) codes

• Encode 4 bits of information into 7 bits
• I.e., 4 bits of entropy and 3 bits of redundancy
• Idea: interpreting the errors in our 3 parity bits will give us an index to the bit that got flipped

Bit layout: par 1, par 2, data 1, par 3, data 2, data 3, data 4

[Coverage diagram: parity 1 covers bits 1, 3, 5, 7; parity 2 covers bits 2, 3, 6, 7; parity 3 covers bits 4, 5, 6, 7]



Generating parity bits

• We pass the data bits through unmolested
• We construct the parity bits according to the table we just had
• Actually we can be more mathy and use a matrix instead...

$$G = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

(rows are par 1, par 2, data 1, par 3, data 2, data 3, data 4; columns are data 1 through data 4)

Encoding the data word d = (1, 0, 1, 0)ᵀ:

$$G \cdot d = G \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \equiv \begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \pmod{2}$$
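As a sketch, the same multiplication in plain Python (hamming_encode is my own name; G and the bit order are taken from the slide above):

    # Rows of G, in the slides' bit order: p1 p2 d1 p3 d2 d3 d4
    G = [
        [1, 1, 0, 1],  # par 1  = d1 + d2 + d4
        [1, 0, 1, 1],  # par 2  = d1 + d3 + d4
        [1, 0, 0, 0],  # data 1
        [0, 1, 1, 1],  # par 3  = d2 + d3 + d4
        [0, 1, 0, 0],  # data 2
        [0, 0, 1, 0],  # data 3
        [0, 0, 0, 1],  # data 4
    ]

    def hamming_encode(d):
        # Codeword = G . d, reduced mod 2
        return [sum(g * b for g, b in zip(row, d)) % 2 for row in G]

    print(hamming_encode([1, 0, 1, 0]))  # -> [1, 0, 1, 1, 0, 1, 0]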

Decoding

• Like in error detection, decoding involves recalculating the parity bits
• If the parity bits are all 0, we did not detect an error
• If any of the parity bits are 1, we detected an error

$$H = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}$$

(columns are par 1, par 2, data 1, par 3, data 2, data 3, data 4; rows are parity 1, parity 2, parity 3)



1 0 H= 0 1 0 0

1 0 1 0 0 1

1 0 0 1 1 1

 1 1  1



    x! =     

1 0 1 1 0 1 0

         



1 0  0 1 0 0

par 1 par 2 data 1 par 3 data 2 data 3 data 4 parity 1

parity 2

parity 2

parity 3

parity 3

1 0 H= 0 1 0 0

1 0 1 0 0 1

1 0 0 1 1 1

 1 1  1

1 0 0 1 1 1

    1  1 ·   1  

1 0 1 1 0 1 0



       2 0   =  2  ≡  0  (mod 2)   2 0  

par 1 par 2 data 1 par 3 data 2 data 3 data 4

parity 1



1 0 1 0 0 1





1 0 H= 0 1 0 0

par 1 par 2 data 1 par 3 data 2 data 3 data 4

1 0 1 0 0 1

1 0 0 1 1 1

 1 1  1



    x! =     

1 0 1 1 0 1 0

         

par 1 par 2 data 1 par 3 data 2 data 3 data 4

parity 1

parity 1

parity 2

parity 2

parity 3

parity 3



1 0 H= 0 1 0 0

1 0 1 0 0 1

1 0 0 1 1 1

 1 1  1



    x! =     

1 0 1 1 1 1 0

         



1 0  0 1 0 0

par 1 par 2 data 1 par 3 data 2 data 3 data 4 parity 1

parity 2

parity 2

parity 3

parity 3

• Sweet deal • The correction code told us that bit (101)2 = 5 got flipped

• We just flip it back and we’re good to go

• Sadly Hamming codes can only correct if there is only 1 error

• We can detect 2 bits of error but if we

1 0 0 1 1 1

    1  1 ·   1  

1 0 1 1 1 1 0



       3 1   =  2  ≡  0  (mod 2)   3 1  

par 1 par 2 data 1 par 3 data 2 data 3 data 4

parity 1

Hamming error correction

1 0 1 0 0 1
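A matching decoder sketch (hamming_correct is my own name; H and the bit order are the ones above). The slides’ trick is that the syndrome, read as a binary number with parity 1 as the low bit, is the 1-based position of the flipped bit:

    # Rows of H are the three parity checks from the slides
    H = [
        [1, 0, 1, 0, 1, 0, 1],  # parity 1: covers bits 1, 3, 5, 7
        [0, 1, 1, 0, 0, 1, 1],  # parity 2: covers bits 2, 3, 6, 7
        [0, 0, 0, 1, 1, 1, 1],  # parity 3: covers bits 4, 5, 6, 7
    ]

    def hamming_correct(x):
        # Syndrome s = H . x (mod 2)
        s = [sum(h * b for h, b in zip(row, x)) % 2 for row in H]
        pos = s[0] + 2 * s[1] + 4 * s[2]  # syndrome read as a bit position
        if pos:                           # 0 means no error detected
            x = list(x)
            x[pos - 1] ^= 1               # flip the offending bit back
        return x, pos

    print(hamming_correct([1, 0, 1, 1, 1, 1, 0]))
    # -> ([1, 0, 1, 1, 0, 1, 0], 5)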



A note on the matrices

• Why were the bits arranged the way they were?
• P, P, D, P, D, D, D
• Why not D, D, D, D, P, P, P or something?
• What we looked at were (7, 4) Hamming codes, but there are an infinite number of Hamming codes

General Hamming codes

• For any integer k greater than 2, there’s a code of length 2^k − 1. For every bit i:
  • If i is a power of 2, bit i is a parity bit
  • Otherwise, bit i is a data bit
• Construct the matrix as before, making sure that every non-zero binary number of length k is in there exactly once (see the sketch below)
• E.g., (15, 11) codes, (31, 26) codes, ...
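A sketch of that construction (hamming_H is a hypothetical helper): column i of the check matrix is just i written in binary, which is exactly why the syndrome of a single-bit error spells out the error’s position:

    def hamming_H(k):
        # Check matrix for the (2^k - 1, 2^k - 1 - k) Hamming code:
        # column i (1-based) holds the k bits of the number i
        n = 2 ** k - 1
        return [[(i >> r) & 1 for i in range(1, n + 1)] for r in range(k)]

    for row in hamming_H(3):  # k = 3 gives the Hamming(7, 4) matrix above
        print(row)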

Detection vs. Correction

• “It was the best of times, it was the blurst of times”
• “I shot his good”
• More redundancy generally leads to better error-correction abilities
• Redundancy by itself is useless...

Detection vs. Correction

• Detection requires less redundancy
• I.e., catching 1 bit of error is much easier than correcting 1 bit of error
• Detection requires a retransmission (or even outright failure!) on noise
• Sometimes retransmission is not possible/practical
• Ever got stuck at 99% on a Torrent?

Looking forward to Wednesday

• Hamming(7, 4) isn’t very efficient: only 4 of every 7 transmitted bits are data
• Modern-day ECCs are much more efficient
• Voyager 2 uses the Reed-Solomon(255, 223) codes
• Decoding is much more fun... :)