Error correction codes I
Not in textbook :(
First, back to Lempel-Ziv(-Welch)
00000000011110000010

• Initial dictionary:
    a  00000
    b  00001
    ...
    y  11001
    z  11010

• Just follow along and build the dictionary
We went over a couple of examples of compressing with LZW
• I said decompressing is pretty much the same
• There’s one complication
Decompressing 00000000011110000010

Split the stream into 5-bit codes:  00000  00001  11100  00010

code read    output    dictionary update
00000        a         start 11100 = a?  (second character unknown so far)
00001        b         complete 11100 = ab;  start 11101 = b?
11100        ab        complete 11101 = ba;  start 11110 = ab?
00010        c         complete 11110 = abc; start 11111 = c?

Decoded output: ababc

• The complication: the decoder runs one character behind the encoder — it can only finish a new dictionary entry (a?, b?, ab?, ...) once the next code tells it the following character. If the encoder ever emits a code for an entry the decoder hasn’t finished yet, the missing character must be the first character of that same entry, so the decoder can still fill it in.
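The decoding procedure above can be sketched in a few lines. A minimal sketch, assuming 5-bit fixed-width codes and an initial a–z dictionary; note the code numbering here simply continues right after 'z', which may not match the exact code values on the slide.

```python
def lzw_decode(bits, width=5):
    """Decode a bit string produced by LZW with fixed-width codes.
    The initial dictionary maps codes 0..25 to 'a'..'z'."""
    table = {i: chr(ord('a') + i) for i in range(26)}
    codes = [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]

    prev = table[codes[0]]
    out = [prev]
    next_code = 26
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The complication: the encoder used a code whose entry the
            # decoder hasn't finished yet; the missing character must be
            # the entry's own first character, i.e. prev[0].
            entry = prev + prev[0]
        out.append(entry)
        table[next_code] = prev + entry[0]   # finish the pending entry
        next_code += 1
        prev = entry
    return ''.join(out)

# codes: a, b, <ab>, c  ->  "ababc"
print(lzw_decode("00000" "00001" "11010" "00010"))
```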
Voyager probes
• Launched in 1977 (and still operational)
• Voyager 2 is about 90 AU (12 light-hours) away; Voyager 1 is about 110 AU (15 light-hours) away
• Signal strengths are weak (bandwidth is small)
• Latency is enormous
• Sent: “I love you!”   Received: “I hate you!”
• That doesn’t make sense!  Wait, what? — over a noisy channel, the receiver can’t even tell which messages got corrupted
• One fix is repetition — send the message five times:
    “I love you!  I love you!  I love you!  I love you!  I love you!”
  If most copies arrive intact, a majority vote recovers the message
• But with enough noise the copies themselves disagree:
    “Backstreet’s back, alright!  I am your father!  I hate you!  I love you!  Yo momma so fat!”
  and the majority vote fails
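The repeat-and-vote idea above can be sketched in a few lines. This is a minimal sketch with a per-position majority vote; `one_char_flipper` is a made-up deterministic “channel” that corrupts one character per transmission, standing in for real noise.

```python
from collections import Counter

def send_with_repetition(message, channel, copies=5):
    """Transmit several copies through a noisy channel and take a
    majority vote at each character position."""
    received = [channel(message) for _ in range(copies)]
    return ''.join(Counter(chars).most_common(1)[0][0]
                   for chars in zip(*received))

def one_char_flipper():
    """Hypothetical noise model: each transmission corrupts exactly
    one character (a different position each time)."""
    state = {'calls': 0}
    def channel(msg):
        i = state['calls'] % len(msg)
        state['calls'] += 1
        return msg[:i] + '#' + msg[i + 1:]
    return channel

print(one_char_flipper()("I love you!"))                         # a single copy arrives corrupted
print(send_with_repetition("I love you!", one_char_flipper()))   # majority vote recovers it
```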
Error correction

Approach                     Positives                           Negatives
Retransmission               Little redundancy if little noise   Latency
Pre-emptive retransmission   Little latency                      Enormous redundancy
Codes                        Little redundancy, little latency   More computation
Shannon’s theorem (the first) revisited
• Remember Shannon’s theorem? B is bandwidth, S/N is the signal-to-noise ratio, C is the capacity:

    C = B log2(1 + S/N)

• Shannon’s theorem says something about digital noise as well
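As a quick numeric check of the capacity formula C = B log2(1 + S/N) — the numbers below are made up for illustration:

```python
import math

def shannon_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley capacity in bits/second: C = B * log2(1 + S/N).
    Note S/N is the linear power ratio, not decibels."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical phone-line-ish channel: 3 kHz bandwidth, 30 dB SNR
snr = 10 ** (30 / 10)               # 30 dB -> linear ratio of 1000
print(shannon_capacity(3000, snr))  # roughly 29,900 bits/second
```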
Shannon’s theorem part 2
• C is the channel capacity (ignoring noise)
• For any ε > 0 and rate R < C, there exists a (possibly large) N such that we can send information at rate R with error ε using codes of length N
• For R > C, it is hopeless to get ε close to 0 (but the theorem does let us get it lower than the noise level)
Shannon’s theorem implications
• Shannon’s theorem says nothing about the possibility of ε = 0, only ε being arbitrarily close to 0
• Intuitively we can see that we can never get ε = 0
• No matter how noisy a channel gets (even 99.99% noise), we can get ε as close to 0 as we like
• Shannon’s theorem has a non-constructive proof
Hamming codes
• Richard Hamming worked with Claude Shannon at Bell Labs after WWII
• The first error-correcting code (the “Hamming code”) was published in 1950
• For a given noise level and a given ε, determining codes is undecidable in general :(
• Good news: we’ve already discovered some kick-ass error correction codes
Hamming(7, 4) codes
• Encode 4 bits of information into 7 bits
• I.e., 4 bits of entropy and 3 bits of redundancy
• Idea: interpreting the errors in our 3 parity bits will give us an index to the bit that got flipped

Bit layout:  par 1 | par 2 | data 1 | par 3 | data 2 | data 3 | data 4

• parity 1 covers bits 1, 3, 5, 7 (par 1, data 1, data 2, data 4)
• parity 2 covers bits 2, 3, 6, 7 (par 2, data 1, data 3, data 4)
• parity 3 covers bits 4, 5, 6, 7 (par 3, data 2, data 3, data 4)
Generating parity bits
• We pass the data bits through unmolested
• We construct the parity bits according to the table we just had
• Actually we can be more mathy and use a matrix instead...

The codeword is G · d (mod 2), where d is the 4-bit data vector and the rows of G produce, in order, par 1, par 2, data 1, par 3, data 2, data 3, data 4:

        1 1 0 1
        1 0 1 1
        1 0 0 0
    G = 0 1 1 1
        0 1 0 0
        0 0 1 0
        0 0 0 1

For d = (1, 0, 1, 0):

    G · d = (1, 2, 1, 1, 0, 1, 0) ≡ (1, 0, 1, 1, 0, 1, 0)  (mod 2)
Decoding
• Like in error detection, error correction decoding involves recalculating the parity bits
• If the recalculated parity bits are all 0, we did not detect an error
• If any of the parity bits are 1, we detected an error

The parity-check matrix H (column i is just i written in binary):

    H = 1 0 1 0 1 0 1
        0 1 1 0 0 1 1
        0 0 0 1 1 1 1

    columns: par 1, par 2, data 1, par 3, data 2, data 3, data 4
Check a received word with no errors, x = (1, 0, 1, 1, 0, 1, 0):

    H · x = (2, 2, 2) ≡ (0, 0, 0)  (mod 2)

• All parity checks come out 0: no error detected
Now flip one bit: x′ = (1, 0, 1, 1, 1, 1, 0)

    H · x′ = (3, 2, 3) ≡ (1, 0, 1)  (mod 2)

Hamming error correction
• Sweet deal: the correction code told us that bit (101)₂ = 5 got flipped
• We just flip it back and we’re good to go
• Sadly, Hamming codes can only correct if there is only 1 error
• We can detect 2 bits of error, but then we can’t correct for them properly

A note on the matrices
• Why were the bits arranged the way they were?
• P, P, D, P, D, D, D — why not D, D, D, D, P, P, P or something?
• What we looked at were (7, 4) Hamming codes, but there are an infinite number of Hamming codes
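The syndrome decoding walked through earlier can be sketched the same way: recompute the three parity checks with H and read the result as the index of the flipped bit. A minimal sketch, assuming the (7, 4) bit layout from these slides:

```python
# Syndrome decoding for Hamming(7,4): s = H . x (mod 2); the syndrome,
# read as a binary number, is the 1-based index of the flipped bit
# (0 means no single-bit error detected).
H = [
    [1, 0, 1, 0, 1, 0, 1],  # parity 1 checks bits 1, 3, 5, 7
    [0, 1, 1, 0, 0, 1, 1],  # parity 2 checks bits 2, 3, 6, 7
    [0, 0, 0, 1, 1, 1, 1],  # parity 3 checks bits 4, 5, 6, 7
]

def correct(x):
    """Recompute the parities and flip the indicated bit, if any."""
    s = [sum(h * b for h, b in zip(row, x)) % 2 for row in H]
    pos = s[0] * 1 + s[1] * 2 + s[2] * 4   # syndrome as a binary number
    if pos:
        x = list(x)
        x[pos - 1] ^= 1                    # flip the offending bit back
    return x, pos

print(correct([1, 0, 1, 1, 1, 1, 0]))  # bit 5 flipped -> ([1, 0, 1, 1, 0, 1, 0], 5)
```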
General Hamming codes
• For any integer k greater than 2, there’s a code of length 2^k − 1. For every bit i:
  • If i is a power of 2, bit i is a parity bit
  • Otherwise, bit i is a data bit
• Construct the matrix as before, making sure that every non-zero binary number of length k is in there exactly once
• E.g., (15, 11) codes, (31, 26) codes, ...

Detection vs. Correction
• “It was the best of times, it was the blurst of times”
• “I shot his good”
• More redundancy generally leads to better error-correction abilities
• Redundancy by itself is useless...
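The construction rule above — every non-zero k-bit number appears exactly once as a column — can be sketched directly. A minimal sketch; the function name is my own:

```python
def hamming_check_matrix(k):
    """Parity-check matrix for the (2^k - 1, 2^k - 1 - k) Hamming code.
    Column i (1-based) is i written in k-bit binary, so every non-zero
    k-bit pattern appears exactly once."""
    n = 2 ** k - 1
    return [[(col >> row) & 1 for col in range(1, n + 1)]
            for row in range(k)]

for row in hamming_check_matrix(3):  # k = 3 gives the (7, 4) code above
    print(row)
```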
Detection vs. Correction
• Detection requires less redundancy
• I.e., catching 1 bit of error is much easier than correcting 1 bit of error
• Detection requires a retransmission (or even outright failure!) on noise
• Sometimes retransmission is not possible/practical
• Ever got stuck at 99% on a Torrent?
Looking forward to Wednesday
• Hamming(7, 4) isn’t very efficient
• Every 4 bits of data need 3 bits of redundancy (only 4/7 of each codeword is data)
• Modern-day ECCs are much more efficient
• Voyager 2 uses Reed-Solomon(255, 223) codes
• Decoding is much more fun... :)