15-853: Algorithms in the Real World

Error Correcting Codes I – Overview – Hamming Codes – Linear Codes
Error Correcting Codes II (Reed-Solomon Codes)
Error Correcting Codes III (LDPC/Expander Codes)


General Model

message (m) → coder → codeword (c) → noisy channel → codeword' (c') → decoder → message or error

Errors introduced by the noisy channel:
• changed fields in the codeword (e.g., a flipped bit)
• missing fields in the codeword (e.g., a lost byte), called erasures

How the decoder deals with errors:
• error detection vs.
• error correction

Applications

• Storage: CDs, DVDs, cloud storage, NAND flash, …
• Wireless: cell phones, wireless links
• Satellite and space: TV, Mars rover, …
• Digital television: DVD, MPEG2 layover
• High-speed modems: ADSL, DSL, …

Reed-Solomon codes are by far the most used in practice, including in pretty much all the examples above. LDPC codes are used for 4G communication. The algorithms for decoding are quite sophisticated.

Block Codes

The message is divided into blocks of k symbols each:

  s1 s2 s3 s4 s5 … sk | sk+1 sk+2 … s2k | …
        block 1             block 2

Each block (message 1, message 2, …) goes through the pipeline above independently: coder → codeword (c) → noisy channel → codeword' (c') → decoder → message or error. (The other kind, convolutional codes, we won't cover.)

Each message and codeword is of fixed size:
  Σ = codeword alphabet
  k = |m|,  n = |c|,  q = |Σ|
  C ⊆ Σ^n  (the set of codewords)
  Δ(x,y) = number of positions i such that x_i ≠ y_i  (the Hamming distance)
  d = min{ Δ(x,y) : x,y ∈ C, x ≠ y }

A code is described as: (n, k, d)_q



Hierarchy of Codes

• linear: C forms a linear subspace of Σ^n of dimension k
• cyclic: C is linear, and c0 c1 c2 … cn-1 being a codeword implies that c1 c2 … cn-1 c0 is also a codeword
• BCH (Bose-Chaudhuri-Hocquenghem): a family of cyclic codes that includes the (binary) Hamming codes and the Reed-Solomon codes

These are all block codes.

Example of a (6,3,3)_2 systematic code

Definition: A systematic code is one in which the message appears in the codeword.

  message   codeword
  000       000000
  001       001011
  010       010101
  011       011110
  100       100110
  101       101101
  110       110011
  111       111000

Hypercube Interpretation

Consider codewords as vertices on a hypercube, e.g. the code {000, 011, 101, 110} on the 3-cube:
  d = 2 = minimum distance
  n = 3 = dimensionality
  2^n = 8 = number of nodes

The distance between nodes on the hypercube is the Hamming distance Δ. The minimum distance is d; for example, 001 is equidistant from the codewords 000, 011, and 101.

For s-bit error detection: d ≥ s + 1
For s-bit error correction: d ≥ 2s + 1
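As a sanity check on these definitions, here is a small Python sketch (our illustration, not part of the lecture) that computes d for the (6,3,3)_2 code above directly from the definition:

    from itertools import combinations

    # The (6,3,3)_2 systematic code from the table above.
    code = ["000000", "001011", "010101", "011110",
            "100110", "101101", "110011", "111000"]

    def hamming_distance(x, y):
        """Delta(x,y): the number of positions where x and y differ."""
        return sum(a != b for a, b in zip(x, y))

    d = min(hamming_distance(x, y) for x, y in combinations(code, 2))
    print(d)             # -> 3, so the code is indeed (6,3,3)_2
    print((d - 1) // 2)  # -> 1: with d >= 2s + 1 we can correct s = 1 error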



Error Detection with Parity Bit

A (k+1, k, 2)_2 systematic code.
Encoding: m1 m2 … mk → m1 m2 … mk p_{k+1}, where p_{k+1} = m1 ⊕ m2 ⊕ … ⊕ mk.

d = 2, since the parity is always even (it takes two bit changes to go from one codeword to another). Detects a one-bit error, since that gives odd parity. Cannot be used to correct a 1-bit error, since any odd-parity word is at equal distance Δ from k+1 valid codewords. (A short code sketch of this appears below.)

Error Correcting One Bit Messages

How many bits do we need to correct a one-bit error on a one-bit message?
• 2 bits: 0 → 00, 1 → 11 (n=2, k=1, d=2). Not enough: on the 2-cube, a received 01 is equidistant from 00 and 11.
• 3 bits: 0 → 000, 1 → 111 (n=3, k=1, d=3). On the 3-cube, every received word is within distance 1 of exactly one codeword.

In general we need d ≥ 3 to correct one error. Why?
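A minimal Python sketch (ours) of the parity code above: it detects a single flipped bit but cannot locate it.

    def parity_encode(bits):
        # Append p_{k+1} = m1 xor m2 xor ... xor mk.
        return bits + [sum(bits) % 2]

    def parity_check(word):
        # Even overall parity <=> no detectable (odd-size) error.
        return sum(word) % 2 == 0

    c = parity_encode([1, 0, 1, 1])
    assert parity_check(c)
    c[2] ^= 1                      # the channel flips one bit
    assert not parity_check(c)     # detected, but we cannot tell where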

Desiderata

We look for codes with the following properties:
1. Good rate: k/n should be high (low overhead)
2. Good distance: d should be large (good error correction)
3. Small block size k
4. Fast encoding and decoding
5. Others: want to handle bursty/random errors, local decodability, network effects, …

Error Correcting Multibit Messages

We will first discuss Hamming codes, named after Richard Hamming (1915-1998), a pioneer in error-correcting codes and computing in general.

Aside: Hamming has a lecture called "You and Your Research" that is an interesting read (or you can listen to it on YouTube).



Hamming Codes: Encoding

Hamming codes detect 2-bit errors, or correct 1-bit errors. They are of the form (2^r − 1, 2^r − 1 − r, 3) for any r > 1, e.g. (3,1,3), (7,4,3), (15,11,3), (31,26,3), …, which correspond to 2, 3, 4, 5, … "parity bits" (i.e. n − k).

The high-level idea is to "localize" the error. Any specific ideas?

For the (15,11) code, the parity bits sit at the power-of-two positions:

  m15 m14 m13 m12 m11 m10 m9 p8 m7 m6 m5 p4 m3 p2 p1 p0

Localizing the error to the top or bottom half (1xxx or 0xxx):
  p8 = m15 ⊕ m14 ⊕ m13 ⊕ m12 ⊕ m11 ⊕ m10 ⊕ m9

Localizing the error to x1xx or x0xx:
  p4 = m15 ⊕ m14 ⊕ m13 ⊕ m12 ⊕ m7 ⊕ m6 ⊕ m5

Localizing the error to xx1x or xx0x:
  p2 = m15 ⊕ m14 ⊕ m11 ⊕ m10 ⊕ m7 ⊕ m6 ⊕ m3

Localizing the error to xxx1 or xxx0:
  p1 = m15 ⊕ m13 ⊕ m11 ⊕ m9 ⊕ m7 ⊕ m5 ⊕ m3

Hamming Codes: Decoding

  m15 m14 m13 m12 m11 m10 m9 p8 m7 m6 m5 p4 m3 p2 p1 p0

We don't need p0, so we have a (15,11,?) code. After transmission, we generate:
  b8 = p8 ⊕ m15 ⊕ m14 ⊕ m13 ⊕ m12 ⊕ m11 ⊕ m10 ⊕ m9
  b4 = p4 ⊕ m15 ⊕ m14 ⊕ m13 ⊕ m12 ⊕ m7 ⊕ m6 ⊕ m5
  b2 = p2 ⊕ m15 ⊕ m14 ⊕ m11 ⊕ m10 ⊕ m7 ⊕ m6 ⊕ m3
  b1 = p1 ⊕ m15 ⊕ m13 ⊕ m11 ⊕ m9 ⊕ m7 ⊕ m5 ⊕ m3

With no errors, these will all be zero. With one error, b8 b4 b2 b1 gives us the error location: e.g., 0100 tells us that p4 is wrong, and 1100 tells us that m12 is wrong.

Hamming Codes

Can be generalized to any power of 2:
– n = 2^r − 1 (15 in the example)
– n − k = r (4 in the example)
– d ≥ 3 (since we can correct one error)
– Can correct one error, but can't tell the difference between one error and two!
– Gives a (2^r − 1, 2^r − 1 − r, 3) code

Extended Hamming code:
– Add back the parity bit p0 at the end
– Gives a (2^r, 2^r − 1 − r, 4) code
– Can still correct one error, but now can detect 2
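Here is a compact Python sketch of this scheme (our illustration, not the course's code). It numbers positions 1..15, keeps the parity bits at positions 1, 2, 4, 8, and drops p0, giving the (15,11,3) code; the syndrome is exactly b8 b4 b2 b1 read as a position.

    def hamming15_encode(msg_bits):
        """Encode 11 message bits into a 15-bit Hamming codeword."""
        word = [0] * 16                      # positions 1..15; word[0] unused
        data_positions = [p for p in range(1, 16) if p not in (1, 2, 4, 8)]
        for pos, bit in zip(data_positions, msg_bits):
            word[pos] = bit
        for p in (1, 2, 4, 8):               # parity over positions with bit p set
            word[p] = sum(word[i] for i in range(1, 16)
                          if i & p and i != p) % 2
        return word[1:]

    def hamming15_decode(code_bits):
        """Correct up to one flipped bit and return the repaired codeword."""
        word = [0] + list(code_bits)
        syndrome = sum(p for p in (1, 2, 4, 8)   # b8 b4 b2 b1 as a number
                       if sum(word[i] for i in range(1, 16) if i & p) % 2)
        if syndrome:                             # nonzero = error position
            word[syndrome] ^= 1
        return word[1:]

    c = hamming15_encode([1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1])
    original = list(c)
    c[11] ^= 1                                   # flip position 12 (an m bit)
    assert hamming15_decode(c) == original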



Lower bound on parity bits

How many nodes in the hypercube do we need so that d = 3? Each of the 2^k codewords eliminates its n neighbors plus itself, i.e. n + 1 nodes:

  2^n ≥ (n + 1) 2^k  ⇒  k ≤ n − log2(n + 1)  ⇒  n − k ≥ ⌈log2(n + 1)⌉

In the Hamming code above, 15 ≥ 11 + ⌈log2(15 + 1)⌉ = 15. Hamming codes are called perfect codes since they match this lower bound exactly.

What about fixing 2 errors (i.e. d = 5)? Each of the 2^k codewords eliminates itself, its neighbors, and its neighbors' neighbors, giving:

  (1 + n + n(n−1)/2) 2^k ≤ 2^n  ⇒  n − k ≥ log2(1 + n + n(n−1)/2) ≈ 2 log2 n − 1

Generally, to correct s errors:

  n − k ≥ log2(1 + C(n,1) + C(n,2) + … + C(n,s))

where C(n,i) denotes the binomial coefficient "n choose i".
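A quick numeric check of this bound (our sketch, not from the slides):

    from math import ceil, comb, log2

    def min_check_bits(n, s):
        """Least n - k allowed by the volume bound for correcting s errors."""
        return ceil(log2(sum(comb(n, i) for i in range(s + 1))))

    print(min_check_bits(15, 1))  # -> 4, matched exactly by the (15,11,3) code
    print(min_check_bits(15, 2))  # -> 7 parity bits needed at n = 15 for s = 2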

Lower Bounds: a side note

The lower bounds assume arbitrary placement of bit errors. In practice errors are likely to have patterns: maybe evenly spaced, or clustered. Can we do better if we assume regular errors? We will come back to this later when we talk about Reed-Solomon codes. This is a big reason why Reed-Solomon codes are used much more than Hamming codes.

Linear Codes

If Σ is a field, then Σ^n is a vector space.

Definition: C is a linear code if it is a linear subspace of Σ^n of dimension k. This means that there is a set of k independent vectors v_i ∈ Σ^n (1 ≤ i ≤ k) that span the subspace, i.e. every codeword can be written as

  c = a1 v1 + a2 v2 + … + ak vk,  where a_i ∈ Σ.

"Linear": the sum of two codewords is a codeword.



Linear Codes

Spanning vectors for the (7,4,3)_2 Hamming code (layout m7 m6 m5 p4 m3 p2 p1):

  v1 = 1 0 0 1 0 1 1
  v2 = 0 1 0 1 0 1 0
  v3 = 0 0 1 1 0 0 1
  v4 = 0 0 0 0 1 1 1

Another way to see that d = 3 for Hamming codes? What is the least Hamming weight among non-zero codewords?

Distance of code = least weight of a non-zero codeword (for linear codes).

Generator and Parity Check Matrices

Generator matrix: a k × n matrix G such that C = { xG | x ∈ Σ^k }, made by stacking the spanning vectors. Encoding is the vector-matrix multiply mesg · G = codeword.

Parity check matrix: an (n − k) × n matrix H such that C = { y ∈ Σ^n | Hy^T = 0 }. (Codewords are the null space of H.) Multiplying H by a received word gives an (n − k)-vector called the syndrome.

These always exist for linear codes.

Advantages of Linear Codes

• Encoding is efficient (a vector-matrix multiply)
• Error detection is efficient (a vector-matrix multiply)
• The syndrome (Hy^T) has error information
• How to decode? In general, keep a q^(n−k)-sized table for decoding (one entry per syndrome). Useful if n − k is small; otherwise we want other approaches.

If the syndrome is 0, the received word is a codeword; otherwise we have to use the syndrome to get back to a codeword ("decode").
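To make the first bullets concrete, here is a NumPy sketch (ours) for the (7,4,3)_2 Hamming code, using the standard-form G and H derived on the next slide; all arithmetic is mod 2:

    import numpy as np

    G = np.array([[1,0,0,0,1,1,1],      # standard form [I4 | A]
                  [0,1,0,0,1,1,0],
                  [0,0,1,0,1,0,1],
                  [0,0,0,1,0,1,1]])
    H = np.array([[1,1,1,0,1,0,0],      # [A^T | I3]
                  [1,1,0,1,0,1,0],
                  [1,0,1,1,0,0,1]])

    x = np.array([1, 0, 1, 1])          # message
    c = x @ G % 2                       # encoding: one vector-matrix multiply
    assert not (H @ c % 2).any()        # a valid codeword has zero syndrome

    c[5] ^= 1                           # the channel flips one bit
    print(H @ c % 2)                    # -> [0 1 0], the 6th column of H:
                                        #    the syndrome locates the error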



Example and "Standard Form"

For the Hamming (7,4,3) code:

  G = [ 1 0 0 1 0 1 1
        0 1 0 1 0 1 0
        0 0 1 1 0 0 1
        0 0 0 0 1 1 1 ]

By swapping columns 4 and 5, it is put in the form [I_k A]:

  G = [ 1 0 0 0 1 1 1
        0 1 0 0 1 1 0
        0 0 1 0 1 0 1
        0 0 0 1 0 1 1 ]

A code with a generator matrix in this form is systematic, and G is in "standard form".

Relationship of G and H

Theorem: For binary codes, if G is in standard form [I_k A], then H = [A^T I_{n−k}].

Example for the (7,4,3) Hamming code (standard-form G above):

  H = [ 1 1 1 0 1 0 0
        1 1 0 1 0 1 0
        1 0 1 1 0 0 1 ]
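The theorem is easy to exercise mechanically. A small sketch (ours) that builds H from a standard-form binary G and checks G·H^T = 0 over GF(2):

    import numpy as np

    def parity_check_from_standard_G(G):
        """Given binary G = [I_k | A], return H = [A^T | I_{n-k}]."""
        k, n = G.shape
        A = G[:, k:]
        return np.hstack([A.T, np.eye(n - k, dtype=int)])

    G = np.array([[1,0,0,0,1,1,1],
                  [0,1,0,0,1,1,0],
                  [0,0,1,0,1,0,1],
                  [0,0,0,1,0,1,1]])
    H = parity_check_from_standard_G(G)
    assert not (G @ H.T % 2).any()  # every row of G lies in the null space of H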

Proof: Suppose that x is a message. Then

  H(xG)^T = H(G^T x^T) = (HG^T) x^T = (A^T I_k + I_{n−k} A^T) x^T = (A^T + A^T) x^T = 0.

Conversely, suppose that Hy^T = 0. Then for each 1 ≤ i ≤ n − k,

  (row i of A^T) · y^T[1..k] + y^T(k+i) = 0,

where y^T[1..k] denotes the first k elements of y^T. Thus y[1..k] · (column i of A) = y(k+i), so y[k+1..n] = y[1..k] A. Consider x = y[1..k]. Then xG = [ y[1..k] | y[1..k] A ] = y.

Hence if Hy^T = 0, y is the codeword for x = y[1..k].

The above proof held only for GF(2). For codes over a general field, if G is of the standard form [I_k A], then the parity check matrix is H = [−A^T I_{n−k}]. In the binary case −A^T = A^T, and hence the principle is the same.



The d of linear codes

Theorem: A linear code has distance d if every set of (d − 1) columns of H is linearly independent, but there is some set of d columns that is linearly dependent.

High-level idea: for linear codes, the distance equals the least weight of a non-zero codeword, and each codeword picks out some collection of columns of H that must sum to zero.
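For the (7,4,3)_2 code this is easy to verify directly (our sketch): over GF(2), every pair of columns of H is independent exactly when all columns are distinct and non-zero, and three columns that XOR to zero witness d = 3.

    import numpy as np

    H = np.array([[1,1,1,0,1,0,0],
                  [1,1,0,1,0,1,0],
                  [1,0,1,1,0,0,1]])
    cols = [tuple(c) for c in H.T]
    assert all(any(c) for c in cols)           # no zero column   => d > 1
    assert len(set(cols)) == len(cols)         # no equal columns => d > 2
    assert not ((H[:,1] + H[:,4] + H[:,5]) % 2).any()  # 3 dependent => d = 3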

Dual Codes

For every code with G = [I_k A] and H = [A^T I_{n−k}], we have a dual code with G = [I_{n−k} A^T] and H = [A I_k].

The duals of the Hamming codes are the binary "simplex" or Hadamard codes (after Jacques Hadamard, 1865-1963): (2^r − 1, r, 2^(r−1)) codes. The duals of the extended Hamming codes are the first-order Reed-Muller codes (after Irving Reed and David Muller). Note that these codes are highly redundant, with very low rate. Where would they be useful?

The dual of the (2^r − 1, 2^r − r − 1, 3) Hamming code has a generator matrix in which every non-zero r-bit vector appears as a column.

Lemma: this is a (2^r − 1, r, 2^(r−1)) code.
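A short sketch (ours) checking the lemma for r = 4: build the generator with every non-zero r-bit vector as a column, and confirm that every non-zero codeword has weight 2^(r−1):

    import itertools
    import numpy as np

    r = 4
    # One column per non-zero r-bit vector c = 1 .. 2^r - 1.
    G = np.array([[(c >> i) & 1 for c in range(1, 2**r)] for i in range(r)])

    weights = {int((np.array(x) @ G % 2).sum())
               for x in itertools.product([0, 1], repeat=r) if any(x)}
    print(weights)   # -> {8}: every non-zero codeword has weight 2^(r-1)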

NASA Mariner

Deep space probes from 1969-1977 (Mariner 10 shown in the slides) used the (32,6,16) Reed-Muller code (r = 5):
– Rate = 6/32 = 0.1875 (only about 1 out of 5 bits is useful)
– Can fix up to 7 bit errors per 32-bit word

How to find the error locations

Hy^T is called the syndrome (no error if it is 0). In general we can find the error locations by creating a table that maps each syndrome to a set of error locations.

Theorem: assuming s ≤ (d−1)/2 errors, every syndrome value corresponds to a unique set of error locations. (Proof: HW exercise.)

So we can keep a table of all these syndrome values. It has q^(n−k) entries, each of size at most n (i.e., keep a bit vector of locations). As a generic algorithm this is not efficient! (There are better methods for special codes.)
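A generic sketch (ours) of this table-driven decoder for the (7,4,3)_2 code; filling in the lightest error pattern per syndrome builds exactly the q^(n−k)-entry table described above:

    import itertools
    import numpy as np

    H = np.array([[1,1,1,0,1,0,0],       # (7,4,3) Hamming parity checks
                  [1,1,0,1,0,1,0],
                  [1,0,1,1,0,0,1]])
    n = H.shape[1]

    table = {}
    for wt in range(n + 1):              # lightest error patterns first
        for locs in itertools.combinations(range(n), wt):
            e = np.zeros(n, dtype=int)
            e[list(locs)] = 1
            table.setdefault(tuple(H @ e % 2), e)   # keep lightest per syndrome

    def correct(y):
        return (y + table[tuple(H @ y % 2)]) % 2

    y = np.array([1, 0, 1, 1, 0, 0, 1])  # a codeword of the (7,4,3) code
    y[4] ^= 1                            # one error
    print(correct(y))                    # -> [1 0 1 1 0 0 1]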

Reed-Solomon Codes

Named after their inventors, Irving S. Reed and Gustave Solomon.



Reed-Solomon Codes in the Real World

• (204,188,17)_256 : ITU J.83(A)
• (128,122,7)_256 : ITU J.83(B)
• (255,223,33)_256 : common in practice
Note that they are all byte-based (i.e., symbols are from GF(2^8)).

Decoding rate on a 1.8GHz Pentium 4:
– (255,251) = 89 Mbps
– (255,223) = 18 Mbps
Dozens of companies sell hardware cores that operate 10x faster (or more):
– (204,188) = 320 Mbps (Altera decoder)

PDF-417, QR codes, Aztec codes, and DataMatrix codes are all 2-dimensional Reed-Solomon bar codes (images in the slides: Wikipedia).

Applications of Reed-Solomon Codes

• Storage: CDs, DVDs, "hard drives", …
• Wireless: cell phones, wireless links
• Satellite and space: TV, Mars rover, Voyager, …
• Digital television: DVD, MPEG2 layover
• High-speed modems: ADSL, DSL, …

Reed-Solomon codes are good at handling burst errors (see the sketch below). Other codes are better for random errors, e.g., Gallager (LDPC) codes and Turbo codes.
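As promised above, a tiny sketch (ours) of why byte-based symbols help with bursts: a burst of b consecutive bit errors corrupts at most ⌈b/8⌉ + 1 byte symbols, so a code over GF(2^8) that corrects a handful of symbol errors absorbs a long bit burst.

    def bytes_touched(first_bit, burst_len):
        """Number of byte symbols hit by a burst of consecutive bit errors."""
        last_bit = first_bit + burst_len - 1
        return last_bit // 8 - first_bit // 8 + 1

    # A 16-bit burst starting mid-byte corrupts only 3 byte symbols,
    # well within the 16-symbol correction power of the (255,223,33) code.
    print(bytes_touched(13, 16))   # -> 3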

