Best Basis Search in Lapped Dictionaries

Yan Huang, Ilya Pollak*, Charles A. Bouman, Minh N. Do

Abstract— We propose, analyze, and illustrate several best basis search algorithms for dictionaries consisting of lapped orthogonal bases. We improve upon the best local cosine basis selection based on a dyadic tree [10], [11] by considering larger dictionaries of bases. We show that this can result in sparser representations and approximate shift-invariance. We also provide an algorithm which is strictly shift-invariant. Our experiments suggest that the new dictionaries can be advantageous for time-frequency analysis, compression, and noise removal. We provide accelerated versions of the basic algorithm which explore various trade-offs between computational efficiency and adaptability. We show that our algorithms are in fact applicable to any finite dictionary comprised of lapped orthogonal bases. We propose one such novel dictionary which constructs the best local cosine representation in the frequency domain, and show that the new dictionary is better suited for representing certain types of signals.

I. INTRODUCTION.

The contributions of our paper are in the area of best basis search algorithms, where the aim is to adaptively select, from a dictionary of orthonormal bases, the basis which minimizes a cost for a given signal [3], [10], [11]. Such methods have been demonstrated to be effective for compression [27], [28], [46], [52], estimation [13]–[15], [25], [26], [30], [38], [43], [53], and time-frequency (or space-frequency) analysis [12], [18], [19], [50], [51], [54]. The original work on best basis search [10], [11] exploited the fact that a dictionary consisting of local cosine bases [9], [32], [33], [47] on dyadic intervals can be represented as a single dyadic tree. This made it possible to find the best basis, for an additive cost function, via an efficient tree pruning algorithm. On the other hand, it has been noticed that, for an additive cost function, the optimal segmentation of a 1-D signal can be efficiently found using dynamic programming. This has been exploited in many contexts, such as piecewise polynomial approximation [2], [41], [44], best basis search in time-varying wavelet packet [54] and MDCT [40] dictionaries, estimation of abrupt changes in a linear predictive model [45], and optimal selection of cosine-modulated filter banks [39].

[Footnote: This work was supported in part by a National Science Foundation (NSF) CAREER award CCR-0093105, an NSF grant IIS-0329156, a Purdue Research Foundation grant, and an NSF CAREER award CCR-0237633. All experiments were generated with the help of Wavelab 802 [16]. Preliminary results were reported in [20], [21]. Y. Huang, I. Pollak*, and C.A. Bouman are with the School of Electrical and Computer Engineering, Purdue University, 1285 EE Building, West Lafayette, IN 47907, phone 765-494-3465, -5916, and -0340, fax 765-494-3358, e-mail yanh,ipollak,[email protected]. M.N. Do is with the Department of Electrical and Computer Engineering and Beckman Institute, University of Illinois at Urbana-Champaign, 1406 West Green St., Urbana, IL 61801, phone 217-244-4782, fax 217-244-1642, e-mail [email protected]. Corresponding author's e-mail: [email protected].]

In this paper, we exploit a similar idea to remove the restriction of [10], [11] that the supports of local cosine basis functions


be dyadic, and use a dynamic programming algorithm to find the best basis in a much larger collection of local cosine bases. Through examples, we illustrate the advantages of our approach in three application areas: time-frequency analysis, compression, and noise removal. Specifically, these examples show that our algorithms result in:
• sparser and more accurate time-frequency descriptions;
• lower entropy, even when the side information is taken into account;
• improved noise removal performance, as measured by the SNR.

In addition, we extend our basic algorithm in several ways. We argue that our algorithm is approximately shift-invariant, and moreover show that it can be made strictly shift-invariant by using a procedure similar to the one developed in [12]. We furthermore propose two accelerated versions of the algorithm which explore various trade-offs between computational efficiency and adaptability, and which are based on the idea of two-stage processing of the data: first, small pieces of a signal are processed using dynamic programming within each piece, and then the results are combined using another dynamic programming sweep.

The use of our algorithms is not restricted to local cosine dictionaries. For example, lapped bases in the frequency domain were used in [24], [30], [37]. We propose a novel construction which represents the discrete cosine transform (DCT) of a signal in a local cosine dictionary, and therefore corresponds to representing the signal in a dictionary whose elements are the inverse DCTs of the local cosine functions. We give examples where noise removal using this new dictionary yields a higher SNR than the best local cosine representation. While we develop and illustrate our algorithms using two dictionaries (the local cosines in the time domain and in the DCT domain), we show in Section IV that our algorithms are applicable to any finite dictionary comprised of lapped orthogonal bases.

II. LOCAL COSINE DECOMPOSITIONS.

A. Best Basis Search Problem.

The general best basis search problem is formulated, for example, in [10], [11], [30]. We consider a dictionary D that is a set of orthonormal bases for R^N, D = {B^λ}_{λ∈Λ}, where each basis B^λ consists of N vectors, B^λ = {g_m^λ}_{1≤m≤N}. The cost of representing a signal f in B^λ is typically defined as follows [10], [11], [30]:

    C(f, B^λ) = Σ_{m=1}^{N} Φ( |⟨f, g_m^λ⟩|² / ‖f‖² )  or
    C(f, B^λ) = Σ_{m=1}^{N} Φ( |⟨f, g_m^λ⟩|² ),                (1)

where Φ is application dependent. Any basis which achieves the minimum of the cost C(f, B^λ) over all the bases in the
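As an illustrative sketch (not the paper's implementation), the normalized cost of Eq. (1) can be evaluated with the common entropy choice Φ(x) = −x log x. The function name `entropy_cost` and the toy bases below are assumptions for the example; a basis is stored as a matrix whose rows are the basis vectors g_m.

```python
import numpy as np

def entropy_cost(f, basis):
    """Additive cost of Eq. (1) for a basis given as a matrix of row vectors,
    with the entropy functional Phi(x) = -x * log(x), a common choice."""
    coeffs = basis @ f                  # inner products <f, g_m>
    p = coeffs**2 / np.sum(f**2)        # normalized energies |<f, g_m>|^2 / ||f||^2
    p = p[p > 0]                        # Phi(0) = 0 by convention
    return -np.sum(p * np.log(p))

# Toy comparison: a sampled cosine is perfectly sparse in an orthonormal
# DCT-II basis (one nonzero coefficient), so its entropy cost is near zero,
# while in the standard (identity) basis the energy is spread out.
N = 64
n = np.arange(N)
f = np.cos(2 * np.pi * 4 * (n + 0.5) / N)   # matches DCT-II row k = 8

identity = np.eye(N)
dct_basis = np.array([np.cos(np.pi * k * (n + 0.5) / N) for k in range(N)])
dct_basis[0] *= np.sqrt(1.0 / N)
dct_basis[1:] *= np.sqrt(2.0 / N)           # orthonormal DCT-II rows

assert entropy_cost(f, dct_basis) < entropy_cost(f, identity)
```

A cost of this form is additive across basis vectors, which is exactly the property the dynamic programming search below relies on.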


O0,N ← best basis(f ) { for u = N − M, N − 2M, . . . , 2M, M, 0 { Ou,N ← Bu,N ; //Initialize Ou,N ∗ ∗ Cu,N ← C(f, Bu,N ); //Initialize Cu,N for d = u + M, u + 2M, . . . , N − M { ∗ ∗ if C(f, Bu,d ) + Cd,N < Cu,N { Ou,N ← Bu,d ∪ Od,N ; ∗ ∗ Cu,N ← C(f, Bu,d ) + Cd,N ; } } ∗ save Ou,N and Cu,N in an internal data structure; } return O0,N ; }

[Fig. 1(a): plot of the window β_{u,v}(t), equal to 0 outside the interval and 1 on its plateau, with ramps over [u − 1/2 − η, u − 1/2 + η] and [v − 1/2 − η, v − 1/2 + η] on the t axis.]

(a) A window function β_{u,v}.
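The window of Fig. 1(a) can be sketched numerically as follows. This is an illustrative sketch: the sine profile r(t) below is one standard choice satisfying the constraints on r, not necessarily the profile used in the paper, and the function names are ours.

```python
import numpy as np

def profile(t):
    """One standard profile r(t): 0 for t < -1, 1 for t > 1, and
    r(t) = sin(pi * (1 + t) / 4) on [-1, 1], which satisfies
    r(t)^2 + r(-t)^2 = 1 for all t."""
    t = np.clip(np.asarray(t, dtype=float), -1.0, 1.0)
    return np.sin(np.pi * (1.0 + t) / 4.0)

def window(t, u, v, eta):
    """beta_{u,v}(t): ramps 0 -> 1 around u - 1/2 and 1 -> 0 around v - 1/2.
    Valid as written when v - u > 2 * eta, so the ramps do not interact."""
    t = np.asarray(t, dtype=float)
    return profile((t - (u - 0.5)) / eta) * profile(((v - 0.5) - t) / eta)

# Power complementarity on the overlap of adjacent windows,
# beta_{u,v}^2 + beta_{v,w}^2 = 1, underlies the orthonormality of the bases.
t = np.linspace(0.0, 16.0, 1001)
overlap = window(t, 0, 8, 2) ** 2 + window(t, 8, 16, 2) ** 2
mask = (t >= 1.5) & (t <= 13.5)         # away from the outermost ramps
```

On `mask`, `overlap` equals 1 up to floating-point error, which is the discrete analogue of the condition r²(t) + r²(−t) = 1 stated in the text.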

Fig. 2. Pseudocode specification of a fast dynamic programming algorithm for the best local cosine basis search. The cost of the best basis O_{u,N} is denoted by C*_{u,N}.
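The dictionary searched by the algorithm of Fig. 2 is exponentially large even though the search itself is polynomial in N/M. A quick count of the MOD-M bases, using illustrative values N = 32 and M = 4 (our choice for the example):

```python
from itertools import combinations

# Each basis B^lambda in the MOD-M dictionary corresponds to a subset of the
# K - 1 interior multiples of M chosen as partition points (K = N/M cells),
# so the dictionary contains 2^(K-1) bases.
N, M = 32, 4
K = N // M
interior = list(range(M, N, M))        # candidate interior partition points

partitions = [[0, *cuts, N]
              for r in range(len(interior) + 1)
              for cuts in combinations(interior, r)]
print(len(partitions))                 # 2**(K-1) = 128
```

The dynamic program of Fig. 2 finds the minimum over all these bases while evaluating only O((N/M)²) window costs.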

(b) A local cosine basis function.

Fig. 1. A window function β_{u,v} and an element of a local cosine dictionary.

dictionary, is called the best basis.

B. A Local Cosine Dictionary.

We identify each vector in R^N with a signal f(n) defined for n = 0, 1, ..., N − 1. A local cosine basis [9], [30], [32], [33], [47] for R^N is defined using cosine functions multiplied by overlapping smooth windows. For each discrete interval [u, v − 1] ⊂ [1, N − 2], we define a window function β_{u,v} (see Fig. 1(a)) which gradually ramps up from zero to one around u − 1/2 and goes down from one to zero around v − 1/2:

    β_{u,v}(t) = r( (t − (u − 1/2)) / η )    if u − 1/2 − η ≤ t < u − 1/2 + η,
                 1                           if u − 1/2 + η ≤ t < v − 1/2 − η,
                 r( ((v − 1/2) − t) / η )    if v − 1/2 − η ≤ t ≤ v − 1/2 + η,
                 0                           otherwise,

where the parameter η ∈ R controls how fast the window tapers off, and r is a profile function which monotonically increases from r(t) = 0 for t < −1 to r(t) = 1 for t > 1 and satisfies r²(t) + r²(−t) = 1 for all t ∈ R. Following [30], we define the discrete local cosine family B_{u,v} as follows:

    B_{u,v} = { β_{u,v}(n) √(2/(v−u)) cos( π(κ + 1/2)(n − (u − 1/2)) / (v − u) ) }_{κ=0}^{v−u−1},

where n ∈ Z is a discrete time parameter and κ ∈ Z is a discrete frequency parameter. One signal from such a family is depicted in Fig. 1(b). It can be shown [30] that this set of signals is orthonormal if v − u ≥ 2η. For a signal f of length N, we search for the best basis in the local cosine dictionary

    D = ⋃_{λ∈Λ} B^λ,                (2)

which consists of the following local cosine bases:

    B^λ = ⋃_{k=0}^{K_λ − 1} B_{n_k, n_{k+1}},                (3)

where λ is a set of partition points {n_k}_{0≤k≤K_λ} of the domain of f. If the partition points are such that only adjacent windows overlap (i.e., if n_{k+1} − n_k ≥ 2η for all k), then B^λ is an orthonormal basis for R^N [30]. In order to achieve this, we impose that the finest cell size be some fixed integer M ≥ 2η, i.e., we require the partition points to be integer multiples of M:

    n_0 = 0 < n_1 < ... < n_{K_λ − 1} < n_{K_λ} = N,                (4)
    n_k is divisible by M, where M ≥ 2η is a fixed integer.         (5)

We will refer to the resulting D as a MOD-M dictionary. We note that a MOD-M dictionary is larger than the local cosine tree dictionary of [10], [11]. In fact, if we choose M such that N/M = 2^J, where J is the maximum depth of the local cosine tree of [10], [11], it can easily be shown that the local cosine tree dictionary of [10], [11] is a subset of the MOD-M dictionary.

C. A Best Basis Algorithm.

We now describe an efficient best basis search algorithm for our MOD-M dictionary. It is a dynamic programming algorithm whose variants have been widely used in the literature since [2] to find the best segmentation of a 1-D signal. Our exposition closely follows [54], where it was used to find the best block wavelet packet basis. Let 0 ≤ u < v ≤ N, and let the best basis associated with the window β_{u,v} be denoted by O_{u,v}. For v − u > M,

    O_{u,v} = { B_{u,d*} ∪ O_{d*,v}    if C(f, B_{u,d*}) + C(f, O_{d*,v})
