A Simple and Efficient Approach to Barcode Localization

Aliasgar Kutiyanawala and Xiaojun Qi

Jiandong Tian

Computer Science Department Utah State University Logan, UT 84322-4205 [email protected] [email protected]

State Key Laboratory of Robotics Shenyang Institute of Automation Chinese Academy of Sciences Shenyang, P.R. China, 110016 [email protected]

Abstract— In this paper, we propose a simple and efficient approach to localizing the barcode regions in an image. We first apply the multichannel Gabor filtering technique to extract eight directional texture features. We then apply a randomized hierarchical search strategy to quickly find a sufficient number of pairs of line segments, which have high frequency and high similarity measures. We finally employ the histogram analysis technique on the start and end points of each qualified pair of line segments to localize the barcode regions. Our extensive experimental results show that the proposed scheme outperforms the two peer systems and can successfully localize the barcode regions in an image with a precision of 96% and a recall of 86%. In addition, the proposed system can be easily ported to a cell phone to improve the ShopTalk system to aid the blind to successfully retrieve common grocery products.

Index Terms— Barcode localization, multichannel Gabor filters, longest common subsequence, randomized hierarchical search strategy

I. INTRODUCTION

Barcodes have been widely used in many fields for applications of great commercial value. Specifically, many large- and medium-sized grocery stores use barcodes on shelves to assist the store personnel in managing the product inventory. That is, these inventory systems place barcodes on the shelves immediately beneath every product area. Nicholson and Kulyukin [1] presented ShopTalk, a wearable small-scale system, shown in Figure 1, that enables a visually impaired shopper to successfully retrieve common grocery products using verbal route directions and barcode scans. In ShopTalk, the visually impaired shopper uses an off-the-shelf laser-based barcode reader to read the barcodes on the shelves and transmit the data to a computing device that estimates the shopper's location within the store. This requires the visually impaired person to carry at least two separate pieces of hardware, along with the inconvenience of maintaining them.

Recently, the availability of cellular phones with a built-in camera has provided visually impaired people with a mobile platform for localizing and decoding barcodes. This advance makes it possible to miniaturize the ShopTalk system onto a cell phone, where the camera functions as the barcode reader and the processor serves as the computing device to localize the barcode, recognize the barcode, and estimate the location of the shopper within the store. Specifically, the locations of all shelf barcodes in the store can be used to guide the visually impaired shopper to the most probable location of the target product on the shelf, because products are normally located directly above their corresponding shelf barcodes. The barcodes on the individual products can be used to inform the visually impaired shopper about product information such as the brand and the price. The images captured by the cell phone may contain text and graphics together with barcodes, or may contain no barcode at all. As a result, identifying the existence and the location of a barcode in an image is the first step to reading the barcode. In this paper, we propose a simple and efficient technique to recognize the position and orientation of the barcode in an image.

Figure 1. ShopTalk: a wearable shopping system for the blind.

978-1-4244-4657-5/09/$25.00 ©2009 IEEE

II. RELATED WORK

The barcode localization methods can be roughly classified into two categories: spatial domain-based methods and frequency domain-based methods. We briefly review some representative techniques in each category.

Spatial domain-based methods: Some algorithms [2, 3] search for groups of lines with mono-oriented gradients to locate the barcode region. However, they cannot achieve adequate localization results in cluttered images. Ando [4-6] gives promising results for local barcode searches in cluttered 3-D scenes by extracting connected regions with mono-oriented texture. However, this extraction becomes invalid when the barcode region is large, since the region is then regarded as background. A Hough transformation-based global method [7] extracts barcode lines and is robust against orientation and size. However, it does not work well on poorly illuminated images. Moreover, extra steps need to be performed to get local information if a barcode is a small part of the image. Other algorithms use morphological filters [8] or self-study networks [9] to locate barcodes. However, they are capable of locating barcode regions only under certain conditions and are typically very time-consuming.

ICICS 2009

Frequency domain-based methods: The following frequency-based transformations, i.e., Fourier transform, Gabor filtering, and wavelet transform, have been effectively used for barcode localization. Jain and Chen [10] propose a multichannel Gabor filtering technique together with either supervised learning using a one-layer feed-forward neural network or unsupervised clustering to localize barcodes. Their techniques are robust to variations in orientation, scale, and shift of the barcode. Bai, Zhang, Wu, and Chen [11] propose a Fourier transformed multichannel Gabor filtering technique together with a back propagation network to ensure rotation invariance in barcode localization. Oktem [12] proposes a wavelet transform based technique together with binary morphological operations to localize barcodes.

All these methods combine frequency domain-based features with complicated learning methods or complicated morphological operations to achieve effective barcode localization results. However, the performance of the learning methods depends heavily on the chosen training images. Furthermore, the learned topological structure needs to be saved and used for localizing the barcode regions in an image. This additional storage may not fit into the limited memory and storage resources available on a cell phone. The morphological operations, on the other hand, may require a series of dilation and erosion operations, which may drain the limited computational power available on a cell phone. As a result, these frequency domain-based methods may not be successfully ported to a cell phone.

In this paper, we propose a simple and efficient approach to localizing the barcode regions in an image. We first apply the multichannel Gabor filtering technique to extract eight directional texture features. We then apply a randomized hierarchical search strategy to quickly find a sufficient number of pairs of line segments, which have high frequency and high similarity measures. We finally employ the histogram analysis technique on the start and end points of each qualified pair of line segments to localize the barcode regions. Currently, the entire algorithm is implemented on a personal computer, but it can be ported to a cell phone without much difficulty because of its low memory and CPU requirements.

The remainder of the paper is organized as follows: Section III presents the proposed barcode localization technique. Section IV shows the effectiveness of the proposed technique via extensive experiments that compare it with two peer systems. Section V draws the conclusions and presents the future work.

III. THE PROPOSED APPROACH

Figure 2 illustrates the block diagram of our proposed barcode localization system. In the following subsections, we explain each of the three blocks in detail.

Figure 2. The block diagram of the proposed system. Blocks, in order: Extract Gabor Filter Based Features; Compute Frequency and Similarity Measures; Estimate the Histogram-Based Bounding Box.

A. Extract Gabor Filter Based Features

Gabor filters can serve as excellent band-pass filters for uni-dimensional signals. They are linear filters that can be defined as the product of a Gaussian signal with a complex sinusoid. The real-valued form is:

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right) \tag{1}$$

where $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$. Here, λ represents the wavelength of the cosine factor, θ represents the orientation of the normal to the parallel stripes of a Gabor function, ψ is the phase offset, σ is the sigma of the Gaussian envelope, and γ is the spatial aspect ratio, which specifies the ellipticity of the support of the Gabor function. In our proposed system, we use σ=5, γ=1, ψ=0, and λ=42.25 as our constants. We use eight orientations, i.e., θ = {0˚, 22.5˚, 45˚, 67.5˚, 90˚, 112.5˚, 135˚, 157.5˚}, to generate a total of eight channels in the filter bank to find the orientation of the barcode with respect to the image.

For each input image A, we first apply the Canny edge detector to get its edge image B. We then apply each of the eight Gabor filters to the edge image B and rotate each filtered result by the angle θ used in the corresponding Gabor filter. That is, the Gabor filtered image is rotated clockwise by 45˚ for the Gabor filter of θ = 45˚. This rotation ensures that all the strong responses align along the y-axis (i.e., have the vertical orientation). We finally convert each of the eight rotated Gabor filtered images to a binary image by simple thresholding. In our system, we set the threshold to 0.9. That is, any Gabor filtered response whose value is greater than 0.9 is set to 1, and all other responses are set to 0.

Figure 3 illustrates the binarized Gabor filtering results for an image that contains a vertical barcode at the right corner. It clearly shows that the Gabor filtered image with θ=0˚ achieves the highest responses. That is, the responses are more prominent for barcode lines that are oriented at 90˚ to the x-axis (i.e., θ=0˚ is the orientation of their normal) and less prominent for other elements.
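As a rough illustration of Equation (1) and the eight-channel filter bank, the sketch below samples the Gabor function on a small grid using only the Python standard library. The kernel half-size of 15 pixels and the names `gabor_kernel`, `FILTER_BANK`, and `binarize` are assumptions made for this example; the paper does not state a kernel size, and the convolution and rotation steps are omitted.

```python
import math

def gabor_kernel(theta_deg, sigma=5.0, gamma=1.0, psi=0.0, lam=42.25, half_size=15):
    """Sample Eq. (1) on a (2*half_size+1) x (2*half_size+1) grid centred on the origin.

    sigma, gamma, psi and lam default to the constants quoted in the text;
    half_size is an assumed value chosen only for illustration.
    """
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)    # x'
            yr = -x * math.sin(theta) + y * math.cos(theta)   # y'
            envelope = math.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lam + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Eight orientations: 0, 22.5, ..., 157.5 degrees.
ORIENTATIONS = [22.5 * k for k in range(8)]
FILTER_BANK = {theta: gabor_kernel(theta) for theta in ORIENTATIONS}

def binarize(response, threshold=0.9):
    """Set responses greater than the threshold to 1 and all others to 0."""
    return [[1 if v > threshold else 0 for v in row] for row in response]
```

With ψ=0 the kernel peaks at exactly 1.0 at its centre, because both the envelope and the cosine equal 1 at the origin. How the filter responses are normalized before the 0.9 threshold is applied is not specified in the text, so `binarize` simply thresholds whatever values it is given.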

Figure 3. Eight Gabor filter results: (a) original image; (b) θ=0˚; (c) θ=22.5˚; (d) θ=45˚; (e) θ=67.5˚; (f) θ=90˚; (g) θ=112.5˚; (h) θ=135˚; (i) θ=157.5˚.

Figure 4 illustrates the binarized Gabor filtering results around the expected angles for another image, which contains a tilted (around 135˚) barcode. It clearly shows that the Gabor filtered image with θ=45˚ achieves the highest response. The clockwise rotation of 45˚ gives the tilted barcode a vertical orientation.

Figure 4. Gabor filter results around the expected angles: (a) original image; (b) θ=67.5˚; (c) θ=45˚; (d) θ=22.5˚.

B. Compute Frequency and Similarity Measures

For each rotated and binarized Gabor filtered image, we use a randomized hierarchical search strategy to quickly find 2,000 horizontal segments of the same length at each of the eight scales. This search strategy guarantees that most horizontal segments will be quickly searched, starting from the smallest possible length (i.e., the width of the image divided by 10) at scale 1 and ending at the largest possible length (i.e., the smallest possible length multiplied by 8) at scale 8. This scaling mechanism is able to handle barcodes of various sizes.

For each horizontal segment, we then compute its frequency measure. This frequency measure counts the number of changes from 0's to 1's and from 1's to 0's. For example, for the segment 1010001010, the frequency measure is 7. If the frequency measure of a line segment is higher than a predefined threshold, we consider this line segment a high frequency segment. We experimentally set the threshold to 25.

For each high frequency line segment, we pick out a line segment of the same length located 20 pixels below it. We then apply the LCS (Longest Common Subsequence) algorithm [13] to find a common subsequence of the two line segments that has the maximum possible length. The similarity between the two line segments is then computed as the ratio of the length of the LCS to the length of the original line segment. For example, the LCS of the two line segments 1010001010 and 0101000100 is 101000100. The similarity between these two line segments is the ratio of the length of 101000100 (i.e., 9) to the length of 1010001010 (i.e., 10). That is, the similarity is 90%.

In our system, we consider any pair of high frequency line segments with similarity higher than 90% as a candidate barcode region. This choice is mainly based on the four observations illustrated in Figure 5:

• Case I (Red line segment pair): Both upper and lower lines have a high frequency measure. The similarity between these two lines is high.

• Case II (Yellow line segment pair): Both upper and lower lines have a high frequency measure. The similarity between these two lines is high.

• Case III (Blue line segment pair): The upper blue line has a low frequency measure and the lower blue line has a high frequency measure. The similarity measure between these two lines is low.

• Case IV (Green line segment pair): Both upper and lower lines have a low frequency measure. The similarity between these two lines is high.
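The frequency and similarity measures described above can be sketched in a few lines. This is a minimal illustration, not the authors' Java implementation: `lcs_length` is the textbook two-row dynamic program rather than any specific algorithm from [13], and all function names are chosen for this example.

```python
def frequency_measure(bits):
    """Count the 0->1 and 1->0 transitions along a binary segment."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Textbook dynamic program, O(len(a) * len(b)) time, keeping only
    the previous and current rows of the table.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def similarity(upper, lower):
    """Ratio of the LCS length to the length of the upper segment."""
    return lcs_length(upper, lower) / len(upper)

upper = "1010001010"
lower = "0101000100"
print(frequency_measure(upper))   # 7 transitions
print(similarity(upper, lower))   # 0.9 (LCS length 9 out of 10)
```

The quadratic cost of the dynamic program matches the O(Len²) complexity the paper reports for this step.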

To quickly find the potential barcode region, we terminate the randomized hierarchical search once 50 pairs of line segments with high frequency and high similarity measures (i.e., 50 pairs of line segments satisfying the first two conditions) are found. In that case, we claim that the corresponding rotated and binarized Gabor filtered image likely contains barcodes, and all 50 pairs are passed to the last step for barcode localization. Conversely, if fewer than 50 such pairs are found after searching the horizontal segments at all eight scales, we claim that the rotated and binarized Gabor filtered image does not contain barcodes at the corresponding angle.

For the two images shown in Figure 3(a) and Figure 4(a), 50 pairs of line segments with high frequency and high similarity measures are found in the binarized Gabor filtered images shown in Figure 3(b) and Figure 4(c), respectively. The other binarized and rotated Gabor filtered images yield fewer than 50 pairs. Therefore, the start and end points of all 50 pairs resulting from the correct Gabor filtered images are passed to the last step.
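One possible reading of the search loop is sketched below. The text does not spell out how segments are sampled, so uniform random sampling of the segment position, the requirement that both segments of a pair be high frequency, and the function name `find_candidate_pairs` are all assumptions of this example. The constants (2,000 samples per scale, eight scales, frequency threshold 25, 20-pixel offset, 90% similarity, 50 pairs) come from the text.

```python
import random

FREQ_THRESHOLD = 25       # minimum number of transitions (exclusive)
SIM_THRESHOLD = 0.90      # minimum LCS similarity (exclusive)
ROW_OFFSET = 20           # the lower segment lies 20 pixels below the upper one
SAMPLES_PER_SCALE = 2000
NUM_SCALES = 8
PAIRS_NEEDED = 50

def transitions(bits):
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def lcs_length(a, b):
    prev = [0] * (len(b) + 1)
    for va in a:
        curr = [0]
        for j, vb in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if va == vb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def find_candidate_pairs(image, rng=None):
    """image: list of rows, each a list of 0/1 values (one binarized Gabor result).

    Returns up to PAIRS_NEEDED tuples (x_start, y_upper, x_end, y_lower).
    Sampling positions uniformly at random is an assumption of this sketch.
    """
    rng = rng or random.Random(0)
    height, width = len(image), len(image[0])
    base_len = width // 10
    pairs = []
    if height <= ROW_OFFSET or base_len == 0:
        return pairs
    for scale in range(1, NUM_SCALES + 1):
        seg_len = base_len * scale
        for _ in range(SAMPLES_PER_SCALE):
            y = rng.randrange(height - ROW_OFFSET)
            x = rng.randrange(width - seg_len + 1)
            upper = image[y][x:x + seg_len]
            lower = image[y + ROW_OFFSET][x:x + seg_len]
            # Cheap frequency test first; the quadratic LCS runs only on survivors.
            if transitions(upper) <= FREQ_THRESHOLD or transitions(lower) <= FREQ_THRESHOLD:
                continue
            if lcs_length(upper, lower) / seg_len > SIM_THRESHOLD:
                pairs.append((x, y, x + seg_len - 1, y + ROW_OFFSET))
                if len(pairs) == PAIRS_NEEDED:
                    return pairs      # early termination
    return pairs
```

Running the frequency test before the LCS keeps the expensive step off the large majority of sampled segments, which is one plausible way to obtain the speed the early-termination rule is meant to provide.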

Figure 5. Illustration of four cases of line segment pairs.

C. Estimate the Histogram-Based Bounding Box

For the 50 pairs of line segments obtained in a binarized and rotated Gabor filtered image, we compute histograms along the x-axis and the y-axis. Specifically, we use the x-coordinates of the start and end points of all pairs as the input and count the number of occurrences of each value. Similarly, we use the y-coordinates of the start and end points of all pairs as the input and count the number of occurrences of each value. If an image contains a barcode, most of the lines will coincide with the barcode, and hence we expect the histograms along the x-axis and y-axis to have a single peak whose amplitude is greater than a threshold. If an image does not contain a barcode, we expect the histograms along both axes to be flat or to have peaks whose amplitudes are smaller than the threshold. In our proposed system, we consider that a histogram contains a peak if the number of occurrences of some value is higher than 30% of the total number of pairs (i.e., 50). Depending on the type of histogram observed, we conclude whether or not an image contains a barcode.

The position of the barcode is obtained by analyzing the start and end points of the peak of both histograms. Specifically, if the width of the histogram is w, the barcode is assumed to lie between 0.02w and 0.98w. The orientation of the barcode is the angle by which the image is rotated (i.e., the corresponding θ value used in the Gabor filtering operation). Figure 6 shows the histograms along both the x-axis and the y-axis of all 50 pairs of line segments found for the images shown in Figure 3(a) and Figure 4(a). It also shows the localized barcode region and the threshold lines on top of the histograms.
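The peak test on a coordinate histogram can be sketched as follows. This is only an illustration of the 30% rule quoted above; the function name `histogram_peak` and the choice to count exact coordinate values without binning are assumptions, and the 0.02w to 0.98w trimming step is not reproduced because the text does not define it precisely enough.

```python
from collections import Counter

def histogram_peak(coords, total_pairs=50, fraction=0.30):
    """Return (has_peak, value, count) for the most frequent coordinate.

    coords: the x- (or y-) coordinates of the start and end points of all pairs.
    A peak exists when some value occurs more than fraction * total_pairs times.
    """
    if not coords:
        return False, None, 0
    value, count = Counter(coords).most_common(1)[0]
    return count > fraction * total_pairs, value, count

# Hypothetical data: 20 segment end points share x = 100, the rest are scattered.
xs = [100] * 20 + list(range(30))
print(histogram_peak(xs))   # (True, 100, 20)
```

An image would be reported as containing a barcode only when both the x-axis and the y-axis histograms pass this test.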

Figure 6. Illustration of histogram-based bounding box results of the images shown in Figure 3(a) and Figure 4(a).

IV. EXPERIMENTAL RESULTS

We implemented the current system in Java on a Pentium IV Dual CPU 3.0GHz PC running the Windows XP operating system. On average, it takes 4.05 seconds to localize the barcode in images of size 300×300, 5.29 seconds in images of size 400×400, and 6.97 seconds in images of size 500×500. The complexity of extracting Gabor filter based features is O(n log n), where n is the dimension (width or length) of an image. The complexity of computing the frequency and similarity measures is O(Len²), where Len is the length of the line segment. The complexity of estimating the histogram-based bounding box is O(Len).

To evaluate the performance of the proposed barcode localization approach, we conducted experiments on 35 images taken during a real shopping trip using a cell phone camera. We also compared the proposed approach with the two approaches proposed by Jain and Chen [10] (i.e., the supervised approach based on a one-layer feed-forward neural network and the unsupervised approach based on clustering) using the same 35 images. We chose these two approaches for comparison mainly because they also use Gabor filters to obtain the features. Table I compares the performance of the three systems in terms of TP (True Positive), TN (True Negative), FP (False Positive), FN (False Negative), precision (i.e., TP/(TP+FP)), and recall (i.e., TP/(FN+TP)). It shows that our proposed system achieves the highest TP, the lowest FN, and the best precision and recall, and that it ties the supervised approach for the highest TN and the lowest FP. That is, our proposed system achieves the best overall performance in localizing the barcode regions in an image.

TABLE I. COMPARISON OF THREE ALGORITHMS

Methods        TP   TN   FP   FN   Precision   Recall
Ours           24    6    1    4   0.96        0.86
Unsupervised   19    4    3    9   0.86        0.68
Supervised     17    6    1   11   0.94        0.61
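The precision and recall columns of Table I follow directly from the four counts. The short check below recomputes them from the definitions given in the text; the dictionary layout and names are chosen for this example only.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# (TP, TN, FP, FN) per method, copied from Table I.
TABLE_I = {
    "Ours": (24, 6, 1, 4),
    "Unsupervised": (19, 4, 3, 9),
    "Supervised": (17, 6, 1, 11),
}

for name, (tp, tn, fp, fn) in TABLE_I.items():
    p, r = precision_recall(tp, fp, fn)
    print(f"{name}: precision={p:.2f} recall={r:.2f} images={tp + tn + fp + fn}")
```

Each row sums to the 35 test images, and the rounded values agree with the table (for example, 24/25 = 0.96 and 24/28 ≈ 0.86 for the proposed system).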

Figure 7 demonstrates our barcode localization results on several images. Figure 7(a) shows four representative images that contain barcodes of different sizes. Our system successfully identifies and localizes all the barcodes inside these four images. Figure 7(b) shows a representative image that does not contain a barcode. Our system correctly reports that no barcode exists in this image. Figure 7(c) shows a representative image containing a blurred barcode and its corresponding binarized and rotated Gabor filtered result. Since it is impossible to read this blurry barcode, we consider that no qualified barcode exists in the image. The Gabor filtering result clearly shows that the vertical lines in the barcode are successfully detected. However, the barcode lines are broken in many places due to the blur. As a result, the system identifies the barcode but fails to localize it, because there are too few pairs of line segments with high frequency and high similarity measures in the barcode portion. Figure 7(d) shows a representative image containing a barcode and its corresponding binarized and rotated Gabor filtered result. It clearly shows that the Gabor filter detects both the barcode lines and other lines that are perpendicular to the barcode and have high frequency and similarity measures. As a result, our system identifies the barcode but fails to precisely localize it.

Figure 7. Illustration of representative images for each case: (a) true positive cases; (b) true negative case; (c) false positive case; (d) false negative case.

V. CONCLUSIONS

In this paper, we propose a simple and efficient barcode localization approach to recognizing the position and orientation of the barcode in an image. Our approach can be easily ported to a cell phone to help the blind shop for common grocery products. The major contributions are:

• Apply an eight-channel Gabor filter to find the possible orientation of the barcode with respect to the image.

• Apply the randomized hierarchical search strategy to quickly find a sufficient number of pairs of line segments which have high frequency and high similarity measures.

• Apply the histogram analysis on the start and end points of each qualified pair of line segments to quickly estimate the location of the barcode.

We will develop a gradient-based method in the spatial domain to replace the Gabor filters to speed up the localization process. We will also experiment on a large set of varied images to improve the robustness of our approach in localizing the barcodes under different conditions. To further reduce the false negative rate, we will adaptively decide the threshold for computing the similarity measure used in the second step and increase the number of pairs selected for the third step.

REFERENCES

[1] J. Nicholson and V. Kulyukin, “ShopTalk: Independent Blind Shopping = Verbal Route Directions + Barcode Scans,” Proc. of the Rehabilitation Engineering and Assistive Technology Society of North America (RESNA) Conference, Phoenix, AZ, June 2007. Platform-presentation and paper.

[2] N. Normand and C. Viard-Gaudin, “A Two-Dimensional Bar Code Reader,” Pattern Recognition, Vol. 3, pp. 201-203, 1994.

[3] C. Viard-Gaudin, N. Normand, and D. Barba, “Algorithm Using a Two-Dimensional Approach,” Proc. of the Second Int. Conf. on Document Analysis and Recognition, No. 20-22, pp. 45-48, October 1993.

[4] S. Ando and H. Hontanj, “Automatic Visual Searching and Reading of Barcodes in 3-D Scene,” Proc. of the IEEE Int. Conf. on Vehicle Electronics, No. 25-28, pp. 49-54, September 2001.

[5] S. Ando, “Image Field Categorization and Edge/Corner Detection from Gradient Covariance,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 2, pp. 179-190, February 2000.

[6] S. Ando, “Consistent Gradient Operators,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 3, pp. 252-265, March 2000.

[7] R. Muniz, L. Junco, and A. Otero, “A Robust Software Barcode Reader using the Hough Transform,” Proc. of Int. Conf. on Information Intelligence and Systems, No. 31, pp. 313-319, November 1999.

[8] S. Arnould, G.J. Awcock, and R. Thomas, “Remote Bar-code Localization Using Mathematical Morphology,” Image Processing and its Applications, Vol. 2, No. 465, pp. 642-646, 1999.

[9] S. J. Liu, H. Y. Liao, L. H. Chen, H. R. Tyan, and J. W. Hsieh, “Camera-Based Barcode Recognition System Using Neural Net,” Proc. of Int. Joint Conf. on Neural Networks, Vol. 2, pp. 1301-1305, 1993.

[10] A. Jain and Y. Chen, “Bar Code Localization Using Texture Analysis,” Proc. of the 2nd Int. Conf. on Document Analysis and Recognition, pp. 41-44, 1993.

[11] Z. Bai, Z. Yang, J. Wu, and Y. Chen, “Region Localization Based on Rotational Invariant Feature and Improved Self Organized Map,” Proc. of the 3rd Int. Conf. on Intelligent System and Knowledge Engineering, pp. 703-706, 2008.

[12] R. Oktem, “Bar Code Localization in Wavelet Domain by Using Binary Morphology,” Proc. of the 13th IEEE Conf. on Signal Processing and Communications Applications, pp. 499-501, 2004.

[13] D. Hirschberg, “Algorithms for the Longest Common Subsequence Problem,” Journal of the Association for Computing Machinery, Vol. 24, No. 4, pp. 664-675, 1977.
