An error-correcting code

19 October 2011

1 Elementary information theory

A bit is a unit of measure corresponding to the elementary information encoded with one binary figure (either a 0 or a 1). According to information theory, as initiated by Shannon, a bit is the information corresponding to the disclosure of a uniformly distributed random variable in $\{0, 1\}$, like flipping a coin. The word bit is a contraction of binary digit.

The memory of a computer consists of a large number of simple devices, like capacitors within an integrated circuit. Each capacitor can be either charged or discharged. These two states correspond to the two possible values 0 and 1 for the corresponding stored bit of information. A gigabit of memory can keep $10^9$ bits of information.

When dealing with text written in various human languages, one has to represent every letter (or symbol) by a sequence of bits. To this end one uses a standard. For example, the capital A is encoded by the 8 bits 01000001 in the UTF-8 standard. The character é is represented by the 16 bits 1100001110101001. The character € is represented by the 24 bits 111000101000001010101100. Every character in the UTF-8 standard is encoded by at most 32 bits. So $10^9$ bits of memory can keep at least $10^9/32 \approx 3.12 \times 10^7$ characters.

A byte consists of several bits, usually 8. Bytes consisting of 8 bits are also called octets.
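The byte counts above are easy to check with any UTF-8 encoder. A minimal sketch in Python, using only the standard library:

```python
# Print the UTF-8 encoding of the three characters discussed above,
# one byte per 8-bit group, to confirm the bit patterns in the text.
for ch in ("A", "é", "€"):
    bits = "".join(f"{byte:08b}" for byte in ch.encode("utf-8"))
    print(ch, f"{len(bits)} bits:", bits)

# A 8 bits: 01000001
# é 16 bits: 1100001110101001
# € 24 bits: 111000101000001010101100
```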

2 Error-correcting codes

Let $m_0, m_1, \ldots, m_{k-1}$ be a sequence of $k$ bits. Assume we want to transmit this sequence (for example between a satellite and a TV set, or between a compact disk and a loudspeaker). These $k$ bits may be altered on their way. Possible causes are noise, transmission failure, etc. We would like to protect the message against such alterations.

A first idea is to repeat the message: if the message is sent twice, it is unlikely that the same error occurs twice at the same place. When the two received messages are different, we detect an error. But we don't know how to correct it (we don't know which copy has been corrupted). If the same message is sent thrice, and if only one among the three copies has been corrupted, we can correct thanks to the majority rule. For example, if one copy says 0 and the other two say 1, we keep the value 1.

As a concrete example, assume we have received
\[ n_1 = 100110010110100011011001, \quad n_2 = 100110010100100011001001, \quad n_3 = 110110010100100011011001; \]
we will keep the value $n = 100110010100100011011001$. This is not a very efficient method: the length of the messages is multiplied by three.
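Majority decoding of the three copies can be written in a few lines. A minimal sketch in Python, using the values from the example above:

```python
# For each bit position, keep the value that appears in at least two
# of the three received copies (majority rule).
def majority_decode(copies):
    return "".join(max("01", key=col.count) for col in zip(*copies))

n1 = "100110010110100011011001"
n2 = "100110010100100011001001"
n3 = "110110010100100011011001"
print(majority_decode([n1, n2, n3]))
# -> 100110010100100011011001, the value n kept in the text
```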

3 Linear codes

A linear error-correcting code associates to every $k$-tuple $m = (m_0, m_1, \ldots, m_{k-1})$ an $r$-tuple $h = (h_0, h_1, \ldots, h_{r-1})$ defined by a linear map
\[ r : \mathbf{F}_2^k \longrightarrow \mathbf{F}_2^r, \quad m \mapsto h. \]
The set of $(k+r)$-tuples $(m, r(m))$ is the code $C$. This code $C$ is a subspace of dimension $k$ of $\mathbf{F}_2^n$, where $n = k + r$. Conversely, given a subspace $C$ of $\mathbf{F}_2^n$ (a linear code), a linear map $r$ as above is called a systematic encoder for $C$. The integers $n$ and $k$ are called the length and dimension of $C$. Vectors in $C$ are called words.

The Hamming weight of a vector $x = (x_1, \ldots, x_n)$, denoted $\omega(x)$, is the number of nonzero coordinates of $x$. The Hamming distance between two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\mathbf{F}_2^n$, denoted $d(x, y)$, is the number of coordinates where $x$ and $y$ differ. So $d(x, y) = \#\{i \mid x_i \neq y_i\}$. The minimal distance of the code $C$, denoted $d$, is the smallest distance between two distinct words in $C$:
\[ d = \inf_{c, c' \in C,\, c \neq c'} d(c, c'). \]

Since $C$ is a vector space, this $d$ is also the smallest weight of a nonzero word in $C$:
\[ d = \inf_{c \in C,\, c \neq 0} \omega(c). \]
The three parameters $n$, $k$ and $d$ given above play an important role in coding theory. One says that $C$ is an $[n, k, d]$-code. An $[n, k, d]$-code can detect (resp. correct) any configuration of $d - 1$ (resp. $\lfloor (d-1)/2 \rfloor$) errors.

We give an example with $k = 3$, $r = 3$, $n = 3 + 3 = 6$. We identify $\mathbf{F}_2^k$ with the set of row vectors of length $k$, and $\mathbf{F}_2^r$ with the set of row vectors of length $r$. Let $R$ be the matrix
\[ R = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \]
Define $r$ as $r(m) = mR$. Let $G$ be the matrix
\[ G = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{pmatrix}. \]
The code $C$ is the linear subspace of $\mathbf{F}_2^n$ generated by the three rows of $G$. The matrix $G$ is called the generating matrix of $C$. In the present case, the minimal distance is $d = 3$. This code detects two errors and corrects one error.

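A quick way to convince oneself that $d = 3$ is to enumerate all eight codewords. A minimal sketch in Python (the function names are illustrative):

```python
from itertools import product

# The redundancy map r(m) = mR over F2, with R as above; since
# G = (I | R), the codeword associated with m is (m, mR).
R = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]

def encode(m):
    h = tuple(sum(m[i] * R[i][j] for i in range(3)) % 2 for j in range(3))
    return m + h

codewords = [encode(m) for m in product((0, 1), repeat=3)]
# Minimal distance = smallest weight of a nonzero codeword.
print(min(sum(c) for c in codewords if any(c)))  # -> 3
```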
4 Noise

We say that a word $c$ of the code $C \subset \{0,1\}^n$ is transmitted through a so-called additive channel if the received message is $c + e$, where $e = (e_1, \ldots, e_n)$ is a random vector in $\mathbf{F}_2^n$ called the error vector (or noise). The channel is defined by the law of the random variable $e$. In particular, the channel is said to be

1. uniform if the law of $e$ is uniform on $\{0,1\}^n$;
2. binary symmetric with transition probability $p$ if every bit $e_i$ of $e$ is a random variable with $P[e_i = 1] = p$, and the variables $e_i$ are independent.

The uniform noise is a convenient model when we have no information about the nature of errors that may occur in the transmission process. If we are facing isolated and uncorrelated errors, a binary symmetric channel is the natural model. In typical situations, the transition probability is small. Note however that the uniform channel is a binary symmetric channel with $p = 1/2$.

In practice, the receiver gets an $n$-bit vector consisting of $k$ information bits (the initial message) and $r$ bits of redundancy. The receiver computes the redundancy associated with the received message $m$ and checks whether this computed redundancy coincides with the received $r$ bits. One says that there is a detection failure when the error vector is itself a nonzero word of the code $C$. Assuming the noise is uniform, the probability that an error is not detected is
\[ P(ND) = \frac{1}{2^r} - \frac{1}{2^n}. \]

In particular, only the length of the redundancy matters in that case. In case we have a binary symmetric channel,
\[ P(ND) = \sum_{0 \neq c \in C} p^{w(c)} (1-p)^{n-w(c)} = \sum_{d \leq i \leq n} A_i \, p^i (1-p)^{n-i} \tag{1} \]

where $d$ is the minimal distance of the code $C$, $w(c)$ is the weight of $c$, i.e. the number of nonzero coordinates in it, and $A_i$ is the number of words in the code $C$ having weight $i$. When $p$ is small enough, we have $P(ND) \approx A_d\, p^d$. The error-detection power increases with the minimal distance.
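For the $[6,3,3]$ code of section 3, formula (1) and the approximation $A_d p^d$ can be compared numerically. A minimal sketch in Python, reusing the encoder sketched above ($p = 0.01$ is an arbitrary illustrative value):

```python
from itertools import product

# Weight distribution of the [6,3,3] code: A_0 = 1, A_3 = 4, A_4 = 3.
R = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
encode = lambda m: m + tuple(
    sum(m[i] * R[i][j] for i in range(3)) % 2 for j in range(3))
weights = [sum(encode(m)) for m in product((0, 1), repeat=3)]

n, p = 6, 0.01
exact = sum(p**w * (1 - p)**(n - w) for w in weights if w > 0)  # formula (1)
approx = 4 * p**3                                               # A_d p^d
print(exact, approx)  # ~3.91e-06 vs 4e-06
```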

5 Cyclic codes

A linear code $C$ of length $n$ is said to be cyclic if it is stable under cyclic shift of its entries:
\[ (c_0, \ldots, c_{n-1}) \in C \implies (c_{n-1}, c_0, c_1, \ldots, c_{n-2}) \in C. \]

One can describe cyclic codes using polynomials. We write $\mathbf{F}_2[X]$ for the set of polynomials in the variable $X$ with coefficients in $\mathbf{F}_2$, and $A_n = \mathbf{F}_2[X]/(X^n - 1)$ for the quotient ring by $X^n - 1$. We set $x = X \bmod X^n - 1$. So $x \in A_n$ and $x^n = 1$. Coming back to the cyclic code $C$, we have an injective map
\[ C \to A_n, \quad (c_0, \ldots, c_{n-1}) \mapsto c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}. \]
So one can consider $C$ as a sub-vector space of $A_n$. Its elements $c$ will be seen as polynomials $c(x)$ in $x$ of degree $\leq n - 1$ (since $x^n = 1$). The cyclicity condition boils down to:
\[ c(x) \in C \implies x\, c(x) \in C. \]
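The correspondence between multiplication by $x$ and a cyclic shift is immediate to check on coefficient vectors. A minimal sketch in Python:

```python
# Multiplying c(x) by x in A_n (where x^n = 1) cyclically shifts the
# coefficient vector: (c_0, ..., c_{n-1}) -> (c_{n-1}, c_0, ..., c_{n-2}).
def mul_by_x(c):
    return (c[-1],) + c[:-1]

c = (1, 0, 1, 1, 0, 0, 0)    # c(x) = 1 + x^2 + x^3 in A_7
print(mul_by_x(c))            # -> (0, 1, 0, 1, 1, 0, 0), i.e. x + x^3 + x^4
```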

5.1 Generating polynomial of a cyclic code

We consider a cyclic code $C \subset A_n$ of length $n$. There is a unique polynomial of minimal degree in $C \setminus \{0\}$. This polynomial is called the generating polynomial of $C$. The reason for this terminology is the following proposition:

Proposition 1. Let $C$ be a cyclic code with generating polynomial $g(x)$. Then $C$ is the set of multiples of $g(x)$ in $A_n$. In particular, a basis for $C$ is $g(x), x g(x), \ldots, x^{n-r-1} g(x)$, where $r$ is the degree of $g(x)$.

We thus have:

Proposition 2. If $C$ is a cyclic code of length $n$ with generating polynomial $g(x)$ of degree $r$, then the dimension of $C$ is $k = n - r$.


5.2 Encoding

To every message $m = (m_0, \ldots, m_{k-1})$ of length $k$, one can associate a code word by computing the Euclidean division of $(m_0 + m_1 X + \cdots + m_{k-1} X^{k-1}) X^{n-k}$ by $g(X)$. We then have, recalling that $r = n - k$:
\[ X^r m(X) = q(X) g(X) + h(X), \]
where $q$ is the quotient and $h$ the remainder in the above Euclidean division. Since $-1 = +1$ in $\mathbf{F}_2$,
\[ X^r m(X) + h(X) = q(X) g(X) \in C. \]
In practice this means that we send $(h_0, h_1, \ldots, h_{r-1}, m_0, m_1, \ldots, m_{k-1})$, or $(m_{k-1}, \ldots, m_0, h_{r-1}, \ldots, h_0)$. The code word is obtained by concatenating the initial message $(m_{k-1}, \ldots, m_0)$ and the redundancy part $(h_{r-1}, \ldots, h_0)$, called the CRC (cyclic redundancy check).
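The whole encoding step is a Euclidean division over $\mathbf{F}_2$. A minimal sketch in Python (the generating polynomial $g(X) = 1 + X + X^3$ below is an illustrative choice; coefficient lists are indexed so that position $i$ holds the coefficient of $X^i$):

```python
def poly_mod(a, g):
    """Remainder of a(X) divided by g(X), coefficients over F2."""
    a = a[:]
    for i in range(len(a) - 1, len(g) - 2, -1):
        if a[i]:                      # cancel the degree-i term with X^(i-r) g(X)
            for j in range(len(g)):
                a[i - len(g) + 1 + j] ^= g[j]
    return a[:len(g) - 1]

def crc_encode(m, g):
    r = len(g) - 1
    h = poly_mod([0] * r + m, g)      # h(X) = X^r m(X) mod g(X)
    return h + m                      # send (h_0, ..., h_{r-1}, m_0, ..., m_{k-1})

g = [1, 1, 0, 1]                      # g(X) = 1 + X + X^3, degree r = 3
m = [1, 0, 1, 1]                      # m(X) = 1 + X^2 + X^3
c = crc_encode(m, g)                  # -> [1, 0, 0, 1, 0, 1, 1]
assert not any(poly_mod(c, g))        # X^r m(X) + h(X) is a multiple of g(X)
```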

5.3 The BCH bound

Theorem 1 (BCH bound). Let $C$ be a cyclic code of length $n = 2^m - 1$ with generating polynomial $g(X)$. We assume that there exist integers $b$ and $\delta \geq 2$ such that
\[ g(\alpha^b) = g(\alpha^{b+1}) = \cdots = g(\alpha^{b+\delta-2}) = 0, \]
where $\alpha$ is a primitive element of $\mathbf{F}_{2^m}^*$. Then the minimal distance $d$ of $C$ satisfies $d \geq \delta$.

Here BCH stands for Bose, Ray-Chaudhuri, Hocquenghem.

5.4 Hamming codes

Let $m$ be an integer and let $\alpha$ be a primitive element of $\mathbf{F}_{2^m}^*$. The Hamming code $H_m$ is the cyclic code of length $n = 2^m - 1$ with generating polynomial $g(X)$, the minimal polynomial of $\alpha$. Since the Frobenius map
\[ \mathbf{F}_{2^m} \to \mathbf{F}_{2^m}, \quad x \mapsto x^2 \]
commutes with $g$ (the coefficients of $g$ lie in $\mathbf{F}_2$), we have $g(\alpha^2) = (g(\alpha))^2 = 0^2 = 0$, so the BCH bound applies with $b = 1$ and $\delta = 3$. So the Hamming code has minimal distance $d \geq 3$.
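For $m = 3$ this gives the familiar $[7, 4]$ code. A minimal sketch in Python, taking as an illustrative representation $\alpha$ a root of $X^3 + X + 1$, so that $g(X) = 1 + X + X^3$; it confirms $d = 3$:

```python
from itertools import product

g = (1, 1, 0, 1, 0, 0, 0)            # g(X) = 1 + X + X^3 as a vector in A_7

def times_x(c):                       # multiplication by x = cyclic shift
    return (c[-1],) + c[:-1]

basis = [g]
for _ in range(3):                    # basis g, xg, x^2 g, x^3 g (k = 4)
    basis.append(times_x(basis[-1]))

codewords = [tuple(sum(m[i] * basis[i][j] for i in range(4)) % 2
                   for j in range(7))
             for m in product((0, 1), repeat=4)]
print(min(sum(c) for c in codewords if any(c)))   # -> 3
```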

5.5 2-error correcting BCH codes

Let again $\alpha$ be a primitive element of $\mathbf{F}_{2^m}^*$. Consider the minimal polynomial $g_1(X) \in \mathbf{F}_2[X]$ of $\alpha$, and the minimal polynomial $g_3(X)$ of $\alpha^3$. They both have degree $m$. We set $g(x) = g_1(x) g_3(x)$. Then:

Proposition 3. The cyclic code of length $n = 2^m - 1$ with generating polynomial $g(x)$ has dimension $k = n - 2m$ and minimal distance $d \geq 5$.

The assertion about the minimal distance results from the BCH bound with $b = 1$ and $\delta = 5$: since $\alpha$ and $\alpha^3$ are roots of $g$, we deduce that $\alpha^2$ and $(\alpha^2)^2 = \alpha^4$ are also roots of $g$. Thus $\alpha$, $\alpha^2$, $\alpha^3$, $\alpha^4$ are roots of $g$.
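For $m = 4$ ($n = 15$), taking $\alpha$ a root of $X^4 + X + 1$, the minimal polynomial of $\alpha^3$ is $X^4 + X^3 + X^2 + X + 1$ (a standard fact, assumed here rather than recomputed). A minimal sketch in Python multiplying the two over $\mathbf{F}_2$:

```python
def poly_mul(a, b):
    """Product of two F2 polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

g1 = [1, 1, 0, 0, 1]                  # X^4 + X + 1
g3 = [1, 1, 1, 1, 1]                  # X^4 + X^3 + X^2 + X + 1
print(poly_mul(g1, g3))
# -> [1, 0, 0, 0, 1, 0, 1, 1, 1], i.e. g(X) = 1 + X^4 + X^6 + X^7 + X^8:
# the generator of the 2-error correcting [15, 7] BCH code (k = 15 - 8 = 7).
```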

6 Questions

1. Explain why an $[n, k, d]$-code detects (resp. corrects) any configuration of $d - 1$ (resp. $\lfloor (d-1)/2 \rfloor$) errors.
2. Give an example of encoding and decoding with the $[6, 3, 3]$-code in section 3.
3. Justify the probability calculations in section 4.
4. Prove the two propositions in section 5.
5. Prove the BCH bound. You may introduce a Vandermonde determinant.
6. Build a 3-error correcting BCH code.