OPTIMAL EMBEDDING OF QR CODES INTO COLOR, GRAY SCALE AND BINARY IMAGES

by Gonzalo J. Garateguy

A PhD dissertation submitted to the Faculty of the University of Delaware in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering

Winter 2014

© 2014 Gonzalo J. Garateguy. All Rights Reserved.


OPTIMAL EMBEDDING OF QR CODES INTO COLOR, GRAY SCALE AND BINARY IMAGES

by Gonzalo J. Garateguy

Approved: Kenneth E. Barner, Ph.D. Chair of the Department of Electrical and Computer Engineering

Approved: Babatunde Ogunnaike, Ph.D. Dean of the College of Engineering

Approved: James G. Richards, Ph.D. Vice Provost for Graduate and Professional Education

I certify that I have read this PhD dissertation and that in my opinion it meets the academic and professional standard required by the University as a PhD dissertation for the degree of Doctor of Philosophy.

Signed: Gonzalo R. Arce, Ph.D. Professor in charge of PhD dissertation

I certify that I have read this PhD dissertation and that in my opinion it meets the academic and professional standard required by the University as a PhD dissertation for the degree of Doctor of Philosophy.

Signed: Kenneth E. Barner, Ph.D. Member of PhD dissertation committee

I certify that I have read this PhD dissertation and that in my opinion it meets the academic and professional standard required by the University as a PhD dissertation for the degree of Doctor of Philosophy.

Signed: Javier Garcia Frias, Ph.D. Member of PhD dissertation committee

I certify that I have read this PhD dissertation and that in my opinion it meets the academic and professional standard required by the University as a PhD dissertation for the degree of Doctor of Philosophy.

Signed: Daniel L. Lau, Ph.D. Member of PhD dissertation committee

ACKNOWLEDGEMENTS

First of all I would like to thank my advisor, Dr. Gonzalo Arce, for giving me the opportunity to pursue this long and rewarding journey, and for his help and guidance. I would also like to thank Diego Pienovi, without whom I wouldn't be here, for his friendship and support in the first years of my PhD and for all the unforgettable days and nights we spent working and having fun together. To all my friends in Newark, who made me feel at home, and to all my friends in Uruguay, who never let me forget where I came from. To my girlfriend Angela, for her understanding, for loving me and for being there for me all these years. Most of all I would like to thank my mother, Lidia Fleitas, for her unconditional support in every way and for her trust and love. To my father, Eugenio Garateguy, in whom I undoubtedly see myself every day and who from heaven helped me to achieve what I once saw so far away. This was a very special period of my life, in which I had great successes and catastrophic failures, in which I learned about myself and others, and in which I was reminded that what matters is always the journey and not the destination. None of this would have been possible without all of you, and for that I just want to say thank you. Let's close this chapter today and start a new one in this story, without forgetting what I learned, what I am, and what I want to become.


TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT

Chapter

1 INTRODUCTION

2 QR CODE STRUCTURE
2.1 Function Pattern Region
2.1.1 Finder Patterns
2.1.2 Alignment Patterns
2.1.3 Timing Patterns
2.2 Encoding Region
2.2.1 Data Capacity and Error correction

3 QR DECODING
3.1 Image acquisition assumptions
3.2 Threshold Calculation for Binarization
3.2.1 Global Thresholding Methods
3.2.2 Local Thresholding
3.2.3 Mean Block Binarization method
3.3 Sampling process

4 QR CODE EMBEDDINGS IN COLOR AND GRAY SCALE IMAGES
4.1 Halftoning techniques
4.2 Pixel selection
4.3 Luminance modification using one parameter
4.4 Luminance modification using two parameters
4.5 Luminance modification considering central pixels
4.6 Color Optimization

5 BLUE NOISE MASKS DESIGN ALGORITHM
5.1 Blue noise mask design constraints
5.2 Centroidal Voronoi Tessellations
5.2.1 Blue noise pattern design based on Centroidal Voronoi Tessellations
5.3 Mask design

6 QR CODE EMBEDDINGS IN STIPPLINGS AND HALFTONES
6.1 Luminance modification considering central pixels

7 PROBABILITY OF ERROR MODELS
7.1 Probability of binarization error
7.1.1 Probability of binarization error for continuous tone image embeddings using one luminance parameter
7.1.2 Probability of binarization error for continuous tone image embeddings using two luminance parameters
7.2 Probability of detection error
7.2.1 Probability of detection error for binary image embeddings
7.2.2 Probability of detection error for continuous tone image embeddings
7.3 Global probability of error

8 QUALITY METRICS
8.1 Human Visual system
8.2 Visual quality metrics
8.2.1 Quality metrics for color image embeddings
8.2.2 Quality metrics for gray scale and binary image embeddings

9 OPTIMAL EMBEDDINGS
9.1 Optimization problems
9.1.1 Optimization of continuous tone images
9.1.2 Optimization of binary images
9.2 Visual quality prioritization
9.3 Optimization Methods
9.4 Optimization results
9.4.1 Color and gray scale image embeddings
9.4.2 Binary image embeddings

10 COMPARISON WITH EXISTING METHODS AND DECODING ROBUSTNESS
10.1 Decoding robustness
10.2 Resiliency to printer halftoning effects
10.3 Visual quality of the embeddings

11 DISCUSSION AND CONCLUDING REMARKS

BIBLIOGRAPHY

Appendix
A PROBABILITY OF BINARIZATION ERROR FOR ONE LUMINANCE PARAMETER
B PROBABILITY OF BINARIZATION ERROR USING TWO LUMINANCE PARAMETERS

List of Tables

2.1 Sizes and data capacity for different code versions

10.1 The MSE and MSSIM are calculated over the luminance of the embedding and the original image. The MSE presented here is normalized by the number of pixels in the image, MSE = (1/√N) ‖Y_out − Y‖_F

List of Figures

1.1 Integration of QR codes into publicity materials

1.2 QR code embedding methods by QR module manipulation

1.3 Integration of QR codes into publicity materials

1.4 QR code embeddings in binary images

2.1 QR code regions with the location of each functional pattern highlighted in color

2.2 Different QR code versions and their respective arrangement of function pattern modules and encoding region modules

3.1 Simplified image acquisition model for typical printing and camera resolutions: (a) rendering of the code in the image plane considering a pinhole model, (b) original printed image at 300 dpi, (c) image captured by the cell phone camera assuming a pixel size of 1.5 μm and a scanning distance of 4.5 inches

3.2 Binarization results for different thresholding methods: a) original image, b) binary image using Otsu's method, c) Niblack's method using a window of size 8 × 8, d) ZXing library method using a window of size 8 × 8

3.3 Block subdivision used in the threshold calculation algorithm of the ZXing library. The threshold values for each block are calculated as the average of luminance values in a 5 × 5 window of blocks

3.4 a) Acquired image of the QR modules showing the length of the module and the center region in pixels. b) Diagram of the module with the possible locations of sampling points

4.1 Examples of embedded QR codes for increasing concentrations of modified pixels pc and a fixed luminance level α = 0.1

5.1 Stacking constraints for gray levels i0 < i1 < i2. White pixels represent a 1, black pixels represent a 0, and gray pixels represent an undetermined value in I_{i1}. These values are the ones that must be set to either 1 or 0 using the binary pattern design algorithm.

5.2 Voronoi tessellations of patterns from (a) VaC, (c) DBS, (e) CVT, and RAPSD of patterns from (b) VaC, (d) DBS, (f) CVT

5.3 Comparison of binary patterns generated using the (a) VaC, (b) DBS and (c) CVT algorithms. The number shown over each binary pattern is its ink coverage percentage, and the column to the right of each binary pattern corresponds to the absolute value of its DFT without the DC component.

6.1 Example of a binary stippling and the embedding generated by the proposed method

7.1 a) Probability of binarization error (PBE) using the model in Eqn. (7.5) for different values of pc and α, and no noise. b) Empirical and modeled PBE for a fixed concentration of modified pixels pc = 0.5 and no noise at the detector. c) Same as in b) for the case of Gaussian noise with σn = 0.01 at the detector

7.2 a) Probability of binarization error (PBE) as a function of α for fixed β and pc. b) PBE as a function of β. c) PBE as a function of pc. The variance of the noise used in all simulations was ση = 0.1.

7.3 Agreement between the empirical probability of error and the predictions from the model in (7.13). (a) Probability of detection error (PDE) as a function of βc with a noise level of ση = 0.03 for α0, β0, α1, β1, αc fixed. (b) PDE as a function of αc with a noise level of ση = 0.03. (c) PDE as a function of β1 with ση = 0.03. The empirical probability of error is obtained by setting the luminance parameters, adding noise, and binarizing the image according to (7.14). All plots are calculated for a randomly selected window in the given binary image and QR code.

9.1 Different stages of the QR embedding procedure. The inputs are the original image, the QR code, the halftone mask, the masks used for image prioritization, and the global values of Pmax and da. The images are divided into local windows and then optimized independently and in parallel. Finally, the results are interpolated to combine them into the final result.

9.2 Adaptation of luminance levels as a function of the noise power and the luminance of the image

9.3 The columns from left to right correspond to values of Pmax = 0.2, 0.15, 0.1 respectively. (Top row) Optimization results for a center region size of da = 2. (Middle row) Optimization results for da = 3. (Bottom row) Optimization results for da = 4. The noise level assumed in all optimizations was ση = 0.2 and the size of each QR module was Wa = 8

9.4 The columns from left to right correspond to values of Pmax = 0.2, 0.15, 0.1 respectively. (Top row) Optimization results for a center region size of da = 2. (Middle row) Optimization results for da = 3. (Bottom row) Optimization results for da = 4. The noise level assumed in all optimizations was ση = 0.2 and the size of each QR module was Wa = 8

9.5 Comparison of image embeddings using different halftoning masks: (top row) blue noise mask, (middle row) green noise mask, (bottom row) clustered dot mask aligned with the QR module centers. The parameters used to generate these optimizations were Pmax = 0.15 and ση = 0.1

9.6 Comparison of image embeddings using different halftoning masks: (top row) blue noise mask, (middle row) green noise mask, (bottom row) clustered dot mask aligned with the QR module centers. The parameters used to generate these optimizations were Pmax = 0.15 and ση = 0.1

9.7 Embedding of different stipplings into a QR code. (Top row) Optimization results for Pmax = 0.2. (Bottom row) Optimization results for Pmax = 0.05. All embeddings were generated assuming a noise level of ση = 0.2 and a center size of da = 2

10.1 (Top row) Image embeddings for a) da = 2 pixels (J = 0.4), b) da = 3 pixels (J = 0.4053), c) da = 4 pixels (J = 0.4194). The QR modules in this example had a size of Wa = 8 pixels, and the global probability of error used was Pmax = 0.2. A noise level of ση = 0.1 was assumed. (Bottom row) Images acquired using a Google Nexus 7 tablet at 12 in from the target. The images were printed at a resolution of 150 dpi

10.2 Images captured with a smartphone for different sizes of the embedding. (Top row) Color codes printed at 300 dpi with a LaserJet color printer. The corresponding sizes of the codes are (a) 150 QR pixels per inch (error rate = 0), (b) 200 QR pixels per inch (error rate = 0), (c) 300 QR pixels per inch (error rate = 0). (Bottom row) Black and white codes printed on a LaserJet printer at 300 dpi using a generic PostScript driver: (d) 150 QR pixels per inch (error rate = 0), (e) 200 QR pixels per inch (error rate = 0), (f) 300 QR pixels per inch (error rate = 70%). The embeddings used in the experiment were designed using Pmax = 0.15, ση = 0.1, da = 4 and Wa = 10

10.3 Comparison of the embedding of different images and logos into a QR code. a) Embedding of an image for a center region size of da = 2 pixels and a global probability of error of Pmax = 0.2 using a blue noise mask. b) Embedding of a logo for da = 1 and Pmax = 0.15 using a clustered dot mask aligned with the centers of the QR code modules. All embeddings assume a noise level of ση = 0.1. c) Embedding of a logo for da = 3 and Pmax = 0.15 using a blue noise mask. The embeddings in d) and e) were generated using the method in [1], while the embedding in f) corresponds to [4]

10.4 Comparison of binary image embeddings for different stipplings. (Top row) Embeddings obtained with the proposed method, generated using Pmax = 0.1 and assuming a noise level of ση = 0.1. (Bottom row) Embeddings generated from the same input binary images using the method in [13]

ABSTRACT

The growth of the smartphone and mobile device market has created a new set of opportunities for companies to develop new publicity strategies. In particular, the association of printed materials with online content is increasingly used. One of the most widespread ways of engaging mobile users from printed materials is the use of QR codes, which have been adopted for many different applications such as accessing web sites or downloading premium content. QR codes are a very reliable and convenient way to introduce textual information into mobile devices without the hassle of typing complicated chains of characters. These applications are outside the original functional purpose for which these codes were designed in the auto parts industry, and considerations besides robustness and speed of decoding have become increasingly important. The integration of QR codes into billboards and printed materials presents an important and interesting design challenge, since it pursues two conflicting objectives: the improvement of visual appearance and the maximization of decoding robustness. This thesis focuses on the development of algorithmic techniques for embedding QR codes into logos or images in order to make them visually appealing to the user while maintaining acceptable decoding robustness. In contrast to previous approaches, the methods presented here automatically embed QR codes into color, gray scale or binary images with bounded probability of detection error and minimal intervention from the user. These embeddings are designed to be compatible with standard decoding applications and can be applied to any color or gray scale image with full area coverage. The embedding problem is solved by integrating different halftoning and visual quality assessment techniques with numerical optimizations specially tailored for each problem. A model of the probability of error at the QR detector is developed and then used in the optimization to yield the best possible combination of transformation parameters for each particular image. Finally, we show through experimental results that the probability of detection error of these embeddings is similar to that of their monochromatic counterparts, while yielding considerable visual quality improvements with respect to existing methods.

Chapter 1 INTRODUCTION

QR codes were originally developed by the Japanese company Denso Wave Corporation [25] with the goal of improving upon the speed and data capacity of traditional one dimensional bar codes. They were released in 1994, and one of the early adopters of this technology was the auto parts industry, for tracking and identification of parts in the production line. Following this, QR codes came into common use in the food production industry as an effective way to implement food traceability, as well as in other industrial settings. One of the main reasons for the rapid spread of their use was the decision of Denso to make the specifications of the QR code publicly available. This, together with the appearance in the Japanese market of camera equipped phones capable of scanning QR codes without the need for specialized scanners, further increased their use in Japan and eventually brought these codes to the general user. QR codes were standardized first by AIM (Automatic Identification Manufacturer Association) International in 1997, then by JEIDA (Japanese Electronic Industry Development Association) in 1998 [29], by the JIS (Japanese Industrial Standards) in 1999 [30], and finally by ISO as an international standard [27] in 2000. Today QR codes are used in a myriad of applications, such as inventory tracking and identification in the transport and manufacturing industries [64], as a means of payment in retail stores, for access control and security, and in publicity brochures and artistic works. Their popularity is due in part to the proliferation of smart phones capable of decoding them and accessing online resources, as well as to their high storage capacity and speed of decoding. QR codes are a reliable and convenient way to introduce textual information into mobile devices without the hassle of typing complicated chains of characters. They are used to access websites, download personal card information, post information to social networks, initiate phone calls, reproduce videos, or open text documents.

Figure 1.1: Integration of QR codes into publicity materials: (a) low visual impact, (b) high visual impact

In the publicity industry, the proliferation of mobile devices has opened a wide range of possibilities in the design of interactive publicity material. Companies use different combinations of media to engage new potential customers, such as printed brochures, public billboards and online advertisements. The combination of these publicity channels plays an important role in modern publicity campaigns, which take advantage of new technologies to track, measure and target their audiences with unprecedented accuracy and efficiency. In this context QR codes have found great applicability, since they provide an effective way to measure the impact and reach of publicity campaigns. Each code scan can be used to trace back the location, date and time at which a particular user expressed interest in the products, and this allows an accurate mapping and evaluation of the campaign impact. These use scenarios are clearly outside the original functional purpose for which QR codes were designed, and considerations such as visual appeal and ease of integration into publicity designs play an important role in addition to robustness and speed of decoding. This problem presents an interesting design challenge, since the integration of codes into printed materials pursues two conflicting objectives: enhancing visual appearance and maximizing decoding robustness. Sometimes, depending on the nature of the design, the square shapes of QR codes may become a feature, as in the example of Figure 1.1a, but sometimes they present a problem from an aesthetic perspective, as in Figure 1.1b. In any case, the impact on the aesthetics of publicity designs is their main problem, since in general the integration of QR codes must be performed by trial and error, which is very time consuming and expensive. This challenge has generated great interest in algorithms capable of embedding QR codes into color or gray scale images without losing decoding robustness.

There have been several efforts, both from the academic and commercial sectors, to improve the appearance of QR code embeddings. These methods can be classified into two main categories depending on the approach used to embed the image into the QR code. The first group corresponds to direct manipulation of QR modules; some of these methods allow high resolution images to be embedded in the code, while others only support binary images with the same pixel size as the QR modules. The methods presented in [50, 60] base their strategy on finding the best group of QR modules to substitute by the image or logo. In [50], genetic optimization algorithms are used to select the angle, size and position of the logo to be inserted by maximizing the probability of correct decoding with multiple QR readers. The resulting embeddings in general have an area coverage proportional to the error correction capability of the code, which at most can reach 30%. These embeddings have a high visual impact, as depicted in Figure 1.2a, and may be unsuitable for particular design requirements. An important conclusion from these studies is that some areas in the code are less susceptible to substitution than others. For example, for large code sizes, if the number of information bits is relatively small there might be a substantial number of padding codewords which do not carry any information and can be replaced with minimal impact on decoding robustness [60]. Examples of methods that take advantage of unused modules in the padding regions are [72, 51, 24]. These methods identify the information carrying codewords and determine a region in which the image can be safely inserted with minimal impact on the correction capacity (see Figure 1.2b). In general these approaches do not take advantage of the codeword generation process, which imposes severe area restrictions on the location of modified modules.

Figure 1.2: QR code embedding methods by QR module manipulation: (a) Genetic [50], (b) QRjam [24], (c) QR Maker [20], (d) QRArt [15]

This problem was addressed by recently developed techniques [20, 41, 15], which manipulate the Reed-Solomon encoding procedure to maximize the area coverage. In [20], for instance, non-systematic encoding is used to manipulate the information and padding modules such that the resulting sequence of Reed-Solomon codewords has minimum Hamming distance with respect to the binarized image (see Figure 1.2c). Figure 1.2d depicts the approach in [15], which is similar to [20] and takes advantage of the linear properties of Reed-Solomon codes to manipulate the information modules such that the resulting code resembles the original image.

The second category of QR embedding algorithms is based on the modification of the luminance of individual pixels without altering the location of QR modules. The approach in [1] is based on a method where the luminance of the image is changed according to the code structure. The visual quality is improved by altering the luminance at the center of each QR module according to its value, while the remaining pixels in the module are altered less. This approach provides a better trade off between robustness and distortion for a general image, but the coarse structure of QR modules creates undesirable artifacts (see Figure 1.3b). Another method implementing a similar idea, developed by Denso Wave, is LogoQ [26], which also modifies the luminance of the image pixels according to the values of the QR code. This method achieves acceptable results for moderately simple images such as logos with few colors (see Figure 1.3a), but due to the percentage of pixels that are modified it is not suitable for embedding images with many details.

Figure 1.3: Integration of QR codes into publicity materials: (a) LogoQ [26], (b) Visualead [1], (c) QRimages

The method in [4] develops an approach similar to the one presented in this thesis, changing the luminance values of the code to approximate the desired image. Statistical methods to calculate the probability of error are developed, but they do not consider the distinction between central and non-central pixels in the QR module, or the fact that the binarization thresholds are a function of the embedding luminance values. Evidence of the growing interest in algorithms capable of embedding images into QR codes is the number of patent applications filed in recent years [21, 78, 31, 46] by important technology companies, proof of the commercial value of these techniques. Finally, Figure 1.3c depicts the result of embedding a QR code using the method proposed in this thesis, where the integration of halftoning masks and numerical optimization techniques improves both the area coverage and the image quality while minimizing the probability of error. Note that the size and luminance of each cluster changes depending on the neighbouring clusters as well as on the color of the image. These types of algorithms are preferable for the embedding of color images, since by changing only the pixel luminance it is possible to retain the color information while still being able to minimize the probability of detection error.

Another important problem which has remained very challenging is how to embed QR codes into high resolution binary images such as halftones or stipplings. This is a relevant problem since printed materials such as newspapers usually render gray scale images using halftones or stipplings, and in some cases it might be desirable to embed QR codes into those images for either practical or artistic reasons. The main constraint for these embeddings is the limited number of degrees of freedom available to manipulate the code when compared with the case of grayscale or color images. The method in [24], for example, takes advantage of padding modules in the code to embed the binary image without altering its correction capacity. Another example is the method proposed in [15], which only allows URLs to be encoded (see Figure 1.2d), generating an error free QR symbol. The method proposed in [13] achieves improved resolution by dividing each QR module into 9 pixels, an increase in resolution by a factor of 3 (see Figure 1.4d). This algorithm keeps the pixel at the center of the module mainly unchanged while optimizing the values of the remaining binary pixels to maximize reliability and visual quality.

Figure 1.4: QR code embeddings in binary images: (a) original halftone, (b) proposed, (c) Visualead [1], (d) HalftoneQR [13]

Finally, Figure 1.4b depicts the results obtained by our embedding method, which consists of modifying the luminance of each pixel according to its value in the binary image and in the code. By allowing each pixel to take an intermediate luminance value, the binary constraints are relaxed and it is possible to achieve both high fidelity with respect to the original image and high decoding robustness.

For any embedding technique, the main challenge is how to overcome the limitations imposed by the QR code standard. The predefined redundancy levels and simple binarization techniques set a hard limit on how much a QR code can be altered while still remaining decodable. Luminance modification algorithms introduce distortions with respect to a monochromatic code, altering the binarization thresholds and thus increasing the probability of detection error. Another challenge concerns the problem of using the entire area of the code in which the image or logo is to be embedded. This cannot be achieved by simply replacing information modules with the desired image, since the number of modules that can be replaced is at most proportional to the correction capacity of the code [60]. A good embedding method should minimize the number of corrupted modules and use the greatest possible area while keeping visual fidelity to the original image. To achieve the maximum possible quality, the embedding method must take into account the binarization process, and in particular the threshold calculation process, which has been largely overlooked by previous embedding methods.

This thesis focuses on the development of algorithmic techniques to embed QR codes into color, grayscale or binary images while maintaining visual fidelity and good decoding robustness. In contrast to previous approaches, the method presented here automatically embeds the codes with bounded probability of detection error and minimal intervention from the user. This is achieved by using prior knowledge of the thresholding techniques used in the decoder, thus generating embedded images that are decodable with standard reading applications. Any color, gray scale or binary image can be embedded covering the full area of the code, regardless of the error correction level used. The proposed algorithm falls in the category of luminance modification algorithms and is based on the transformation of the image luminance according to the values of a desired QR code at carefully selected locations. The selection of the luminance levels, as well as of the concentration of modified pixels, is solved by the integration of different halftoning and visual perception techniques together with a probabilistic model for the detection error at the decoder. A fundamental contribution of this thesis is the development of simple yet useful models for the probability of detection error adapted to the embedding methods in color, grayscale and binary images. These models are used in conjunction with a visual perception regularization to optimize the parameters of the embedding algorithm. In addition, a novel blue noise mask design algorithm based on centroidal Voronoi tessellations is presented, which improves upon previous methods in the spectral and spatial domains. This algorithm contributes to the blending between the QR code and the image and minimizes the visual perception of modified pixels.

The embedding algorithms proposed are rather simple, consisting of the modification of the luminance of a group of pixels in each QR code module. These pixels are selected by thresholding a halftoning mask, for the case of continuous tone images, or based on the image contrast, for the case of binary images. There are few tunable parameters in the transformation, and their values are obtained as the result of an optimization problem.

Chapter 2 QR CODE STRUCTURE

The patterns and structures inside a QR code have well defined functions, which include symbol alignment, sampling grid determination, and error correction. The information is encoded in square black and white modules several pixels wide. Finder patterns play a central role in the speed and success of decoding and are located in three corners of the symbol, as depicted in Figure 2.1. QR readers use binary images resulting from thresholding the captured gray scale image with local or global thresholds. This particular feature simplifies the computations and reduces the processing requirements for QR decoding. Figure 2.1 shows the main regions in the QR symbol and their functions. The modules in a QR code can be classified into two main categories: the function pattern region and the encoding region. The function pattern region includes the finder and alignment patterns as well as the timing patterns. The encoding region contains the information codewords, the error correction codewords, and the modules used for the determination of the version and the type of encoded data.

Figure 2.1: QR code regions with the location of each functional pattern highlighted in color

2.1 Function Pattern Region

This region contains all the information necessary to successfully detect and

sample the information bits of the code. Finder and alignment patterns are the most essential modules in the region and are key to locating, rotating and aligning the QR code, as well as to correcting for deformations in the printing surface. Precision in the detection and identification of finder and alignment patterns is central to successful decoding. It has been shown experimentally that if the finder patterns cannot be recognized, the decoding fails completely [60]. For small code sizes these patterns also play a central role in the determination of the sampling grid from which the codewords are extracted. In addition to finder and alignment patterns, timing patterns also aid in the determination of the sampling grid, especially for large code sizes.

2.1.1 Finder Patterns

Finder patterns are easily identifiable as three concentric square structures in the

corners of the code. They are designed to have the same ratio of black and white pixels when intersected by a line at any angle, making it possible to determine their centers even if the code is scanned at an arbitrary angle. The ratio of black and white pixels along such a line must follow the ratio 1:1:3:1:1. This means that when a finder pattern is intersected by a line, the resulting waveform should be composed of one pulse of amplitude zero, followed by a pulse of the same width and amplitude one, then three consecutive pulses of amplitude zero, followed by one pulse of amplitude one and one pulse of amplitude zero. All these pulses should have approximately the same length for the line to be considered part of a finder pattern. The centers of the patterns are found by taking all contiguous lines in which the expected waveform was detected and calculating the midpoint between the starting and ending points of the pattern along the vertical and horizontal directions. Finder patterns are surrounded by two guard zones one QR module wide, called the separators. These zones aid in the separation of the finder patterns from the encoding region and in the identification of the proper sequence of black and white pulses, further improving the location accuracy.
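To make the ratio test concrete, the following sketch checks whether five consecutive run lengths along a scanline are consistent with the 1:1:3:1:1 finder pattern. It is an illustrative simplification with an assumed tolerance parameter, not the detection code of any particular reader.

```python
def matches_finder_ratio(runs, tolerance=0.5):
    """Check five run lengths (dark, light, dark, light, dark) against 1:1:3:1:1.
    `runs` are pixel counts of consecutive same-color segments on a scanline."""
    if len(runs) != 5:
        return False
    unit = sum(runs) / 7.0  # the full pattern is 7 modules wide
    expected = (1, 1, 3, 1, 1)
    return all(abs(r - e * unit) <= e * unit * tolerance
               for r, e in zip(runs, expected))

print(matches_finder_ratio([4, 4, 12, 4, 4]))  # True: finder pattern at ~4 px/module
print(matches_finder_ratio([4, 4, 4, 4, 4]))   # False: alignment-like 1:1:1:1:1
```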

2.1.2 Alignment Patterns

Alignment patterns, on the other hand, are used to determine the sampling grids

from which codewords are extracted and to correct for possible deformations of the printing surface. They are easily identifiable as concentric square structures and are evenly distributed over the code area. The sequence of black and white pulses when intersected by a line is 1:1:1:1:1. This means that the waveform must be composed of a train of five alternating pulses of the same width, starting and ending with a pulse of amplitude zero.

2.1.3 Timing Patterns

The standard also defines two zones, consisting of one row and one column of

alternating black and white QR modules, denoted as the timing zones and located between finder patterns. These patterns aid in the determination of the sampling grid and, in conjunction with the alignment patterns, in the correction of perspective transformations.

2.2 Encoding Region

The code area delimited by the finder patterns is denoted as the encoding region,

where the data, the parity modules and the decoding information are stored. This area is divided into codewords consisting of blocks of 8 QR modules. The two dimensional shapes of these codewords depend on the version of the code and are designed to optimize area coverage (see for example Figure 2.2). This region also contains version and format modules that carry information about the data type stored in the code as well as its expected size. The version information allows the decoder to disambiguate and correct an improper estimation of the module size, since it determines the total number of modules in the code.

Figure 2.2: Different QR code versions and their respective arrangement of function pattern modules and encoding region modules

2.2.1 Data Capacity and Error correction

Different types of QR codes defined in the standard [27] are identified by their

version and error correction level. The version of the QR code determines its size, which goes from 21 × 21 modules for version 1 up to 177 × 177 for version 40 (see Table 2.1). There are 4 error correction levels, L (low), M (medium), Q (quartile) and H (high), which allow up to 7%, 15%, 20% and 30% of codewords in error to be corrected, respectively. Reed-Solomon codes are used, with correction and detection capacity given by e + 2t ≤ k − p, where k is the number of error correcting codewords, p the number of misdecode protection codewords, e the number of erasures, and t the number of errors [27]. Only versions 1 to 3 use detection codewords, which allow the decoder to identify a number of errors greater than the correction capacity and declare a decoding failure. The maximum data capacity is given by the size and correction level of the code. For example, a code of version 1 and 7% correction capacity has a total of n = 26 codewords, of which k = 7 are correction codewords. This code is capable of storing n − k = 26 − 7 = 19 data codewords and has an error correction capacity of 2 codewords. The maximum number of encoded data symbols depends on the format used; the highest capacity is obtained for version 40-L, which is capable of storing 7089 numeric symbols {0, ..., 9}. The standard also provides the possibility of storing different data types inside the symbol, such as alphanumeric symbols {0, .., 9, A, .., Z, space, $, %, ∗, +, −, ., /, :}, 8-bit JIS symbols (Latin and Kana), or Kanji characters. For a complete list of version sizes and data encoding capacities refer to Table 2.1.
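The capacity arithmetic of this example can be written out directly. The sketch below is ours; the value p = 3 is an assumption for the version 1-L case that reproduces the stated correction capacity of 2 codewords, consistent with versions 1 to 3 reserving misdecode protection codewords.

```python
# Sketch of the capacity arithmetic for version 1-L described above.
def data_codewords(n_total, k_correction):
    """Data codewords left after reserving error correction codewords."""
    return n_total - k_correction

def rs_decodable(e, t, k, p=0):
    """Reed-Solomon bound from the standard: e erasures plus t errors are
    correctable when e + 2t <= k - p."""
    return e + 2 * t <= k - p

print(data_codewords(26, 7))             # 19 data codewords
print(rs_decodable(e=0, t=2, k=7, p=3))  # True: 2 codeword errors correctable
print(rs_decodable(e=0, t=3, k=7, p=3))  # False: beyond the correction capacity
```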

Version  Modules per side  Function pattern modules  Format and version modules  Data modules  Codeword capacity  Remainder modules
1        21                202                       31                          208           26                 0
2        25                235                       31                          359           44                 7
3        29                243                       31                          567           70                 7
4        33                251                       31                          807           100                7
5        37                259                       31                          1079          134                7
6        41                267                       31                          1383          172                7
7        45                390                       67                          1568          196                0
8        49                398                       67                          1936          242                0
9        53                406                       67                          2336          292                0
10       57                414                       67                          2768          346                0
11       61                422                       67                          3232          404                0
12       65                430                       67                          3728          466                0
13       69                438                       67                          4256          532                0
14       73                611                       67                          4651          581                3
15       77                619                       67                          5243          655                3
16       81                627                       67                          5867          733                3
17       85                635                       67                          6523          815                3
18       89                643                       67                          7211          901                3
19       93                651                       67                          7931          991                3
20       97                659                       67                          8683          1085               3
21       101               882                       67                          9252          1156               4
22       105               890                       67                          10068         1258               4
23       109               898                       67                          10916         1364               4
24       113               906                       67                          11796         1474               4
25       117               914                       67                          12708         1588               4
26       121               922                       67                          13652         1706               4
27       125               930                       67                          14628         1828               4
28       129               1203                      67                          15371         1921               3
29       133               1211                      67                          16411         2051               3
30       137               1219                      67                          17483         2185               3
31       141               1227                      67                          18587         2323               3
32       145               1235                      67                          19723         2465               3
33       149               1243                      67                          20891         2611               3
34       153               1251                      67                          22091         2761               3
35       157               1574                      67                          23008         2876               0
36       161               1582                      67                          24272         3034               0
37       165               1590                      67                          25568         3196               0
38       169               1598                      67                          26896         3362               0
39       173               1606                      67                          28256         3532               0
40       177               1614                      67                          29648         3706               0

Table 2.1: Sizes and data capacity for different code versions

Chapter 3 QR DECODING

After acquiring the image and calculating its luminance from the RGB components, the decoding process continues with three basic stages: binarization, detection, and decoding of the bit stream. In the binarization stage, the gray scale image captured by the camera is segmented into black and white pixels. This binary image is used to determine the QR module centers and the sampling grid from which the codewords are extracted. After this process, detected codewords are corrected using the Reed-Solomon algorithm and then decoded according to their format and version.

3.1 Image acquisition assumptions

The QR decoding procedure specified in the standard is defined for gray scale

images, and therefore the conversion from color is the first step in the decoding process. This process includes several steps which are in general highly dependent on the camera model and the color pipeline implemented [58]. In the following sections we simplify this process by assuming that the luminance values used for QR decoding are obtained as Y = 0.2989R + 0.5870G + 0.1140B, where R, G, B are the values stored in the captured image file. This implies that the observed colors are close to the intended RGB colors of the embedded code and that there is insignificant color distortion introduced by the printing device and the camera. The image acquisition model is then given by

Y = Y_p ∗ h_C + n    (3.1)

where Y is the captured image, Y_p is the luminance of the printed embedded code, h_C represents the point spread function of the camera, and n is an array of white Gaussian random variables modeling the noise at the detector.

Figure 3.1: Simplified image acquisition model for typical printing and camera resolutions: (a) rendering of the code in the image plane considering a pinhole model, (b) original printed image at 300 dpi, (c) image captured by the cell phone camera assuming a pixel size of 1.5 μm and a scanning distance of 4.5 inches

For typical camera and printing resolutions it is possible to approximate h_C by the ideal point spread function h_C = δ(x, y), since each pixel of the printed QR code typically spans several pixels in the captured image. Approximating the optical system by a pinhole camera, it is easy to see (Figure 3.1a) that the minimum dot size in the printed code required to resolve individual pixels in the camera is given by

(3.2)

where Dp is the pixel size in the camera sensor, f is the focal distance and d is the scanning distance from the code. In line with the pinhole camera assumption, we consider as it is customary in QR decoding applications that the captured code is always in focus at the sensor plane and there is no geometric blur. Consider for instance the typical values for an iPhone camera which has pixel size of approximately 1.5 μm and focal length of f = 3.0 mm. For a scanning distance of 4.5 inches the corresponding maximum image printing resolution is approximately 440 dpi, which is close to the standard resolution of most inkjet and laser printers. If for example the code is printed at 300 dpi then each pixels in the code will span approximately 1.5 pixels in the image. The number of acquired pixels per QR pixel can be increased or decreased by changing the scanning distance around these typical values. In the

16

remaining of this thesis we assume this simplified model and consider the point spread function hC = δ(x, y) as ideal. If this hypothesis is not true for a particular camera we can always improve the number of acquired pixels per QR pixel by reducing the scanning distance to the code. As long as we deal with codes of small sizes this should not present an issue. However for higher code sizes the point spread function effects should be consider and included in the probability of error model. For details about appropriate lens distortion models refer to [12]. Finally the luminance captured by the camera depends on the reflectance of the printed code and the intensity of the illumination source. We will assume that this illumination can be locally model as a constant factor neglecting its relevance at the binarization process since all the pixels are multiplied by the same factor. 3.2

Threshold Calculation for Binarization A salient feature of QR codes which plays a central role in their decoding speed

is the use of binarization as the first step in the decoding process. Binary images are obtained by thresholding the captured gray scale image as ⎧ ⎨ 1 if Y > t i,j i,j IB [i, j] = ⎩ 0 if Y ≤ t i,j

(3.3)

i,j

where Y is the image captured by the camera, ti,j is the threshold assigned to pixel [i, j] and IB [i, j] is the binary output. Determining the values of the thresholds ti,j for an optimal binarization is a challenging task due to the variability in lighting conditions. The QR code standard does not define specific binarization methods and these are usually selected making a trade off between speed and quality. Thresholds can be calculated globally using all the pixels in the image or locally considering only reduced windows. Global segmentation methods as [52, 8] have been tried and proved effective for uniform illumination conditions only. Local thresholding strategies on the other hand are better suited to illumination variation and several strategies such as [49] and [61] have been employed with relatively successful results [79]. Adaptive methods such as the one presented in[76] show better binarization accuracy but at the cost of

17

increased computational complexity which is not always possible to satisfy in embedded devices. 3.2.1

Global Thresholding Methods Global thresholds are calculated based on all the pixels in the gray scale image

Y . A well known algorithm for automatic global thresholding is Otsu’s method [52] which assumes that the image histogram has two defined modes and selects a threshold by minimizing the intra-variance between these modes. There are similar methods as the one presented in [8] which maximizes the intra-class entropy to find the threshold. In general global methods, have successful results for QR code binarization when the illumination is uniform while having a medium to low computational complexity. Simpler thresholds like t = mean(Y ), t = median(Y ) or t = max(Y )+min(Y ) perform 2

slightly worse than Otsu’s method but the results are acceptable as long as the image has sufficient contrast. In general as is the case for any image segmentation task, global strategies present strong limitations when the illumination is poor or changes drastically in different parts of the image. Figure 3.2b depicts the result of binarizing an uneven illuminated QR code, using Otsu’s method. In this case the decoding fails at the binarization stage because the third finder pattern of the code is lost in the process. 3.2.2

Local Thresholding Local thresholding strategies are more robust to illumination variation since

smaller patches in the image are more likely to be bimodal. Two local thresholding strategies are presented in Niblack’s [49] and Sauvola’s [61] methods. Niblack’s method define the thresholds as ti,j = Y¯Bi,j + k



18

V ar(YBi,j )

(3.4)

Figure 3.2: Binarization results for different thresholding methods. a) Original image, b) Binary image using the Otsu’s method, c) Niblack’s method, using a window of size 8 × 8 d) Zxing library method using a windows of size 8 × 8 where Bi,j is a patch centered around pixel [i, j] and Y¯Bi,j = mean(YBi,j ). The threshold in Sauvola’s method have similar form given by

  V ar(Y Bi,j ) −1 . ti,j = Y¯Bi,j 1 + k R

(3.5)

In these methods, the value of the local thresholds depend on the patch size, and a few tunable parameters that can be adjusted to optimize the segmentation. Other functions such as, local mid gray ti,j = (max(YBi,j ) + min(YBi,j ))/2, local median ti,j = median(YBi,j ) or local mean ti,j = mean(YBi,j ) are commonly used with slightly inferior performances to Niblack’s or Sauvola’s methods [79]. 3.2.3

Mean Block Binarization method One of the most popular libraries for QR code generation and reading is the

open source Zxing library [54]. The thresholds used in the binarization functions of this library, are calculated through a local method that use the average luminance in a set of overlapping square windows. Figure 3.3. shows a diagram of this subdivision.

19

Figure 3.3: Block subdivision used in the threshold calculation algorithm of the Zxing library. The threshold values for each block are calculated as the average of luminance values in a 5 × 5 window of blocks The captured image is divided into non-overlapping blocks Bm,n of 8 × 8 pixels and then the average luminance in a window of 5 × 5 blocks is calculated according to Tm,n

p=m+2 q=n+2 1 = 25 × 64 p=m−2 q=n−2



Y [k, l].

(3.6)

(k,l)∈Bp,q

The averages calculated for each block B_{m,n} are assigned to the pixels in the block as t_{i,j} = T_{m,n} for [i, j] ∈ B_{m,n}. The segmentation performance is inferior to the methods presented in [49] and [61], but this method has the advantage of being computationally less expensive. This is the technique assumed in the following sections to develop the probability of error model and the QR embedding algorithm.
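A sketch of this block-mean thresholding is given below. It mirrors the idea of Eqn. (3.6) and of ZXing's hybrid binarizer, but with boundary blocks clamped rather than handled exactly as in the library.

```python
import numpy as np

# Sketch of mean-block thresholding: 8x8 blocks, each block's threshold is the
# mean luminance over its 5x5 block neighborhood, as in Eqn. (3.6).
def block_thresholds(Y, block=8, win=2):
    h, w = Y.shape[0] // block, Y.shape[1] // block
    means = Y[:h * block, :w * block].reshape(h, block, w, block).mean(axis=(1, 3))
    T = np.empty_like(means)
    for m in range(h):
        for n in range(w):
            p0, p1 = max(m - win, 0), min(m + win + 1, h)
            q0, q1 = max(n - win, 0), min(n + win + 1, w)
            T[m, n] = means[p0:p1, q0:q1].mean()
    # expand block thresholds back to pixel resolution: t_{i,j} = T_{m,n}
    return np.kron(T, np.ones((block, block)))

Y = np.random.rand(128, 128)
binary = (Y > block_thresholds(Y)).astype(np.uint8)
```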

3.3 Sampling process

Once the binary image I_B is obtained, codewords are extracted by sampling

on a grid estimated using finder and alignment patterns. The points in this grid are generated by drawing parallel lines between the estimated centers of finder and alignment patterns and the spacing between lines is set to the estimated width of a QR module Wa . For larger code sizes, multiple sampling grids are used to compensate for local geometric distortions.

Figure 3.4: a) Acquired image of the QR modules showing the length of the module and the center region in pixels. b) Diagram of the module with the possible locations of sampling points

In order to safely detect the binary value, the luminance around the center of the module should be clearly defined. If we define a region of size da × da pixels centered in the QR module, the probability of sampling outside this region can be obtained by assuming a Gaussian distribution of the sampling point around the center and integrating outside the region. In the following section we model the sampling distribution as a Gaussian with σ = Wa/4. The probability of sampling error, denoted by ps, can be precomputed for different sizes of Wa and da to be used in the algorithm.
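Under this Gaussian model, ps can be computed in closed form with the error function. A small sketch (ours, treating the x and y offsets as independent):

```python
from math import erf, sqrt

# Probability of sampling outside the central da x da region when the sampling
# point is Gaussian around the module center with sigma = Wa / 4.
def sampling_error(Wa, da):
    sigma = Wa / 4.0
    half = da / 2.0
    p_inside_1d = erf(half / (sigma * sqrt(2.0)))  # P(|x| <= da/2)
    return 1.0 - p_inside_1d ** 2                  # mass outside the square

for da in (2, 3, 4):
    print(f"Wa=8, da={da}: ps = {sampling_error(8, da):.3f}")
```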

Chapter 4 QR CODE EMBEDDINGS IN COLOR AND GRAY SCALE IMAGES

This Chapter presents the QR embedding methods for color and grayscale images which encode the information bits into the luminance values of the image in such a way that the average luminance is increased for light regions in the code and decreased for dark regions. In fact any embedding algorithm tailored for a standard decoder must be a variation of this type since the binarization thresholds are usually calculated as local averages of luminance pixels. Without modifying the detector or adding conditioning layers before the decoder, there are only two degrees of freedom in the transformation. The first is the selection of the number and location of modified pixels and the second is the luminance level to which this pixels are to be transformed. The number and distribution of modified pixels can be organized in different ways. The algorithm in [1] for example chooses central pixels of each module since these are the ones sampled after binarization; however, the number and luminance change of modified pixels creates undesired square structures that distort the original image (see Fig. 1.3b). This problem can be attenuated by choosing a random distribution of pixels but this might decrease the probability of correct sampling. The technique presented in this thesis consists of two components. The first is the use of halftoning techniques to select the set of modified pixels and break or attenuate the coarse square structures of QR modules and the second is the optimization of the luminance and concentration of modified pixels in local neighborhoods. Different luminance modification and pixel selection strategies are proposed depending on the type of image to be embedded and the level of robustness desired. The simplest possible strategy consist on changing the luminance of modified pixels to a luminance level of either α or 1−α where α ∈ [0, 1] depending on the value of

22

Figure 4.1: Examples of a embedded QR codes for increasing concentrations of modified pixels pc and a fixed luminance level α = 0.1 using the luminance transformation with a single parameter and a blue noise halftoning the QR code at that location. This method does not distinguish between the pixels in the center of the module and tries to distribute the modified pixels such that in average the probability of sampling a pixel with the desired luminance is maximized. In our experiments, for most images the optimal concentration is in the order of 30% − 40%. This yield acceptable results for color images since the perceptual difference can be attenuated by changing the luminance to the desired level while keeping the color unchanged. In the case of gray scale images there is a greater visual impact in order to maintain decodability. The second method proposed tries to minimize this problem using two independent luminance levels α and β with α, β ∈ [0, 1]. This increases the degrees of freedom in the optimization and allows to improve the visual quality. In both approaches, the code and image are divided in a set of non-overlapping blocks and the luminance parameters are optimized independently for each block considering the luminance values in an extended window centered at the block of interest. The concentration of modified pixels denoted by pc is also optimized independently for each block and uniquely defines the number and location of modified pixels. Since the distribution of modified pixels is obtained by thresholding different halftoning masks, the type of pixel distribution i.e : blue noise patterns, clustered dot or green noise pattern, can be controlled by simply changing the masks without any modifications to the algorithm. For low concentrations of modified pixels pc ≈ 0, the embedding resembles the original image


while for high concentrations, p_c ≈ 1, it resembles the QR code (see Fig. 4.1 for an example using blue noise patterns). The third embedding approach assumes that the sampling accuracy of the QR module centers is high and that it is possible to resolve individual pixels in the code. In addition to the levels α and β, two additional levels α_c and β_c are defined for the center pixels of each QR module, and their values are optimized independently of α and β. This has a profound impact on the visual quality of the embedding and dramatically reduces the number of modified pixels, at the cost of decreased tolerance to sampling error.

4.1 Halftoning techniques

The method proposed to select the modified pixels is based on halftoning techniques, in order to minimize the appearance of blocks while preserving high frequency details. If modified pixels are randomly but uniformly distributed in space, the visual impact of the embedding is minimized, since such patterns concentrate most of their energy at higher frequencies, where the human visual system is less sensitive. This effect is commonly exploited in digital halftoning [38], where different algorithms have been proposed to generate even distributions of points with particular spectral properties. Examples of such algorithms are error diffusion [19, 66], blue noise masks [39, 34, 22], green noise masks [39], and direct binary search [3]. Error diffusion has a long history in the printing industry, where it has been used for the last 30 years; however, being a recursive algorithm, its processing time increases considerably for very large images. Blue and green noise masking techniques, in contrast, generate high frequency binary patterns by thresholding a carefully designed multilevel array. These patterns can be computed in parallel, which greatly increases the speed of generation. Many algorithms to design halftoning masks have been presented in the literature, such as void and cluster [71], direct binary search [3], green noise masks [37, 39], blue noise multitone dithering [28], and techniques based on centroidal Voronoi tessellations [22]. Each of these techniques has a different computational complexity, but since the mask design process is performed offline, the speed of pattern generation does not change. Chapter 5


introduces a new blue noise mask design method which can be used to generate the distribution of modified pixels.

4.2 Pixel selection

Ideally, only the pixels at the center of each QR module are relevant for correct decoding; however, due to errors in the determination of the sampling grid, adjacent pixels also play an important role in the decoding process. To account for this, we make a distinction between the pixels within a QR module. A square of size d_a × d_a at the center is always selected for modification, and the remaining pixels in the module are selected using a halftone mask. The size of the square of center pixels is selected by the user; it regulates the robustness of the embedding but also affects its visual quality. The distribution of modified non-central pixels is generated by thresholding a blue or green noise mask to create a binary pattern with a dot concentration of p_c. To simplify the notation, we denote by I_{p_c} the binary pattern generated by the halftoning mask and by M a mask that is 1 on the set of central pixels and 0 otherwise.
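As an illustrative sketch (not code from the dissertation), this selection step can be written as follows in Python with NumPy; the function and argument names are hypothetical, and the halftone mask is assumed to hold thresholds normalized to [0, 1):

import numpy as np

def select_modified_pixels(halftone_mask, pc, module_size, da, code_shape):
    """Builds I_pc (non-central candidates from a halftone mask) and M
    (the always-modified d_a x d_a center of every module)."""
    rows, cols = code_shape
    # Tile the halftone mask over the code area and threshold it: for
    # thresholds uniform in [0, 1), a fraction pc of pixels falls below pc.
    reps = (rows // halftone_mask.shape[0] + 1,
            cols // halftone_mask.shape[1] + 1)
    tiled = np.tile(halftone_mask, reps)[:rows, :cols]
    I_pc = (tiled < pc).astype(np.uint8)

    # M is 1 on the central d_a x d_a square of each module, 0 elsewhere.
    M = np.zeros(code_shape, dtype=np.uint8)
    off = (module_size - da) // 2
    for r in range(0, rows, module_size):
        for c in range(0, cols, module_size):
            M[r + off:r + off + da, c + off:c + off + da] = 1
    return I_pc, M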

4.3 Luminance modification using one parameter

After selecting the pixels to be modified, the algorithm needs to change their

luminance to match one of the corresponding levels, α or 1 − α. Different values of α are used for each block in the image, and the optimal values are determined by minimizing the visual image distortion while keeping the probability of error below a predefined threshold. The luminance of the embedded image Y^{out}_{i,j} at (i, j) is selected as a function of the QR code value q_{i,j} and the luminance of the original image Y_{i,j} as

Y^{out}_{i,j} =
\begin{cases}
1 - \alpha & \text{if } I_{p_c,(i,j)} = 1,\ q_{i,j} = 1\\
\alpha & \text{if } I_{p_c,(i,j)} = 1,\ q_{i,j} = 0\\
Y_{i,j} & \text{otherwise.}
\end{cases}
\qquad (4.1)


This transformation changes the luminance of the pixels selected according to the halftoning mask and keeps the remaining pixels of the image unchanged.
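A direct vectorized sketch of the transformation in (4.1), assuming Y is a luminance array normalized to [0, 1] and q and I_pc are arrays of the same size (names illustrative, as before):

import numpy as np

def embed_one_parameter(Y, q, I_pc, alpha):
    """Eq. (4.1): pixels selected by I_pc are pushed to 1 - alpha on
    light modules (q = 1) and to alpha on dark modules (q = 0); the
    remaining pixels keep their original luminance."""
    Y_out = Y.astype(float)          # astype returns a copy
    selected = I_pc == 1
    Y_out[selected & (q == 1)] = 1.0 - alpha
    Y_out[selected & (q == 0)] = alpha
    return Y_out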

4.4 Luminance modification using two parameters

The second variant proposed for the luminance transformation uses two independent luminance parameters, α and β, to improve the visual quality of the embedding. This transformation is defined analogously to the one-parameter transformation, changing the luminance of modified pixels while keeping the values of unmodified pixels equal to the original image. The output luminance of the transformation is given by

Y^{out}_{i,j} =
\begin{cases}
\beta & \text{if } I_{p_c,(i,j)} = 1,\ q_{i,j} = 1\\
\alpha & \text{if } I_{p_c,(i,j)} = 1,\ q_{i,j} = 0\\
Y_{i,j} & \text{otherwise.}
\end{cases}
\qquad (4.2)

4.5 Luminance modification considering central pixels

The previous transformations do not distinguish the central pixels from the rest of the pixels in the QR module. However, when the sampling accuracy is high, it is advantageous to consider this case, since it allows the number of modified pixels to be greatly reduced and the visual quality to be improved. The luminance transformation in this case is


given by

Y^{out}_{i,j} =
\begin{cases}
\beta & \text{if } M_{i,j} = 0,\ q_{i,j} = 1,\ I_{p_c,(i,j)} = 1\\
\alpha & \text{if } M_{i,j} = 0,\ q_{i,j} = 0,\ I_{p_c,(i,j)} = 1\\
\beta_c & \text{if } M_{i,j} = 1,\ q_{i,j} = 1\\
\alpha_c & \text{if } M_{i,j} = 1,\ q_{i,j} = 0\\
Y_{i,j} & \text{otherwise.}
\end{cases}
\qquad (4.3)

This transformation changes the luminance of the pixels selected according to the halftone distribution and keeps the remaining pixels of the image unchanged. The pixels at the center of each QR module are assigned different luminance levels, since they play a central role in the detection of the binary values.
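Extending the previous sketch to (4.3), with the same illustrative conventions (NumPy arrays, hypothetical names), the central pixels marked by M receive their own levels α_c and β_c; (4.1) and (4.2) are recovered when M is all zero:

def embed_with_center_pixels(Y, q, I_pc, M, alpha, beta, alpha_c, beta_c):
    """Eq. (4.3): non-central selected pixels receive beta/alpha, while
    every central pixel (M == 1) receives beta_c/alpha_c."""
    Y_out = Y.astype(float)
    noncentral = (M == 0) & (I_pc == 1)
    Y_out[noncentral & (q == 1)] = beta
    Y_out[noncentral & (q == 0)] = alpha
    Y_out[(M == 1) & (q == 1)] = beta_c
    Y_out[(M == 1) & (q == 0)] = alpha_c
    return Y_out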

4.6 Color Optimization

To determine the best possible color that fulfills the luminance constraint, it is necessary to measure color differences in a perceptually uniform color space. One possible choice is the Lab color space since, in this space, Euclidean distances are approximately proportional to perceptual changes [68]. Another alternative is the HSL color space, commonly used in computer graphics, which involves simpler computations and is still capable of representing perceptual differences using Euclidean metrics. This is the color space selected for all the simulations presented here, since the objective function in the optimization requires the repeated calculation of these transformations, and computational complexity is of high concern. To obtain the optimal RGB values given a luminance target l_t, the original RGB vector is transformed into the HSL color space, and then the L component is optimized, keeping S and H fixed, until the desired luminance Y = l_t is reached. By keeping S and


H unchanged, the perceptual color difference is minimized while any desired luminance target can still be reached. The relationship between the luminance, defined as Y = 0.2989R + 0.5870G + 0.1140B, and the component L = (min(R, G, B) + max(R, G, B))/2 of the HSL color space is a piecewise linear and monotone function Y = f(L). If the luminance weight vector is defined as w = [0.2989, 0.5870, 0.1140]^T, then f(L) is given by

Y = f(L) = w^T T^{-1}(H, S, L), \qquad (4.4)

where (R, G, B) = T^{-1}(H, S, L) is the backward transformation from the HSL to the RGB color space. The optimal value of L is obtained from the luminance target Y = l_t as the solution of

L^* = \arg\min_L |f(L) - l_t|. \qquad (4.5)

This problem reduces to finding the root of a monotone function of one variable, which can be solved easily, for example, by the bisection method. Once the optimal value L^* is determined, the new RGB components of the pixel are obtained by applying the backward transformation from the HSL to the RGB color space, (R^*, G^*, B^*) = T^{-1}(H, S, L^*). Summarizing the different steps, the transformation that produces the RGB value of a modified pixel, given the target luminance value l_t, is described in Algorithm 1.

Algorithm 1 Color modification to achieve luminance target
Require: (R, G, B) pixel and target value Y = l_t
  (H, S, L) ← T(R, G, B)
  find L^* = argmin_L |f(L) − l_t|
  (R^*, G^*, B^*) ← T^{-1}(H, S, L^*)
  return (R^*, G^*, B^*)
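A runnable sketch of Algorithm 1 can be built on Python's standard colorsys module, whose rgb_to_hls/hls_to_rgb functions play the roles of T and T^{-1} (note that colorsys orders the components as H, L, S); the bisection exploits the monotonicity of f(L):

import colorsys

W = (0.2989, 0.5870, 0.1140)  # luminance weights

def luminance(rgb):
    return sum(w * c for w, c in zip(W, rgb))

def match_luminance(rgb, lt, tol=1e-4):
    """Algorithm 1: hold H and S fixed and bisect on L until the
    luminance of the back-transformed color reaches the target lt.
    Bisection applies because Y = f(L) is monotone in L."""
    h, _, s = colorsys.rgb_to_hls(*rgb)  # colorsys returns (H, L, S)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if luminance(colorsys.hls_to_rgb(h, mid, s)) < lt:
            lo = mid
        else:
            hi = mid
    return colorsys.hls_to_rgb(h, 0.5 * (lo + hi), s)

For example, match_luminance((0.8, 0.2, 0.3), 0.25) returns a color with the same hue and saturation whose luminance is approximately 0.25.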


Chapter 5

BLUE NOISE MASK DESIGN ALGORITHM

This chapter presents a procedure to design blue noise masks, which can be used to select the distribution of modified pixels in any of the embedding methods. Blue noise masking is a method to generate halftones from a gray scale image by thresholding the image with a pre-designed dither array. Early blue noise mask design techniques [45, 69, 77, 2, 14, 65, 35] introduced different methods in which consecutive thresholds are distributed in a pseudo-random fashion, to create the illusion that the resulting halftone was produced by error diffusion. As blue is the high frequency component of visible white light, Ulichney [70] coined the term “blue noise” to describe these patterns. Since the human visual system is less sensitive to high frequency random patterns and finds isolated dots harder to see, the resulting texture is less visible overall.

5.1 Blue noise mask design constraints

The masking process is a point process by which each pixel of the continuous

tone input image Y[m, n], of G gray levels, is compared to the corresponding pixel of a threshold matrix S[m, n] of size M × N, called the mask or screen:

I_b[m, n] =
\begin{cases}
1 & \text{if } Y[m, n] > S[m, n]\\
0 & \text{if } Y[m, n] \le S[m, n].
\end{cases}
\qquad (5.1)
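As a minimal sketch of (5.1), assuming Y and S share the same gray-level range and the screen is replicated periodically to cover the image (names illustrative):

import numpy as np

def halftone_by_masking(Y, S):
    """Eq. (5.1): compare each pixel of the continuous-tone image Y
    against the corresponding pixel of the tiled screen S."""
    rows, cols = Y.shape
    reps = (rows // S.shape[0] + 1, cols // S.shape[1] + 1)
    S_tiled = np.tile(S, reps)[:rows, :cols]
    return (Y > S_tiled).astype(np.uint8)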

In masking processes, there is a unique output binary pattern I_i corresponding to each constant input of gray level i, with i ∈ {0, 1, . . . , G − 1}. The number of pixels set to one in each binary pattern corresponds to the gray level it represents. For instance, I_0 is the all-zero matrix, while I_{G−1} has all its entries equal to 1. In general,


an intermediate binary pattern I_i has ⌊MNi/(G − 1) + 0.5⌋ pixels set to 1, so that the average of all the pixels equals the desired gray level. If a position in I_{i_0} is set to 1 at gray level i_0, that position remains set to 1 for any other I_{i_1} with i_1 > i_0. Conversely, if a position in I_{i_2} is set to 0 at gray level i_2, then I_{i_1} is set to 0 in the same position for any i_1 < i_2 (see Fig. 5.1). These restrictions are called stacking constraints and are given by

\text{if } I_{i-1}[n] = 1 \Rightarrow I_i[n] = 1
\text{if } I_{i+1}[n] = 0 \Rightarrow I_i[n] = 0.
\qquad (5.2)

Having all the binary patterns corresponding to each of the input gray levels, the threshold matrix S[n] is given by

S[n] = \sum_{i=0}^{G-1} I_i[n]. \qquad (5.3)
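A small sketch of (5.3), which also checks the stacking constraints of (5.2) before accumulating the patterns (patterns is assumed to be the list I_0, …, I_{G−1} as NumPy arrays):

import numpy as np

def screen_from_patterns(patterns):
    """Eq. (5.3): S[n] = sum_i I_i[n], after checking the stacking
    constraints of Eq. (5.2) between consecutive gray levels."""
    for i in range(1, len(patterns)):
        # Every pixel set to 1 at level i-1 must remain 1 at level i
        # (the second constraint in (5.2) is then implied).
        if not np.all(patterns[i][patterns[i - 1] == 1] == 1):
            raise ValueError(f"stacking constraint violated at level {i}")
    return np.sum(patterns, axis=0)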


Figure 5.1: Stacking constraints for gray levels i_0 < i_1 < i_2. White pixels represent a 1, black pixels represent a 0, and gray pixels represent an undetermined value in I_{i_1}. These are the values that must be set to either 1 or 0 by the binary pattern design algorithm.

5.2 Centroidal Voronoi Tessellations

The mask design algorithm presented here is based on the use of centroidal Voronoi tessellations, and allows fine tuning of the distribution of minority pixels


and fine control over the quality of the binary patterns at different gray levels. Centroidal Voronoi tessellations (CVTs) have been used in many applications, ranging from data compression to optimal quadrature rules, optimal representation, quantization, and clustering [16]. In general, the usefulness of CVTs relies on their minimization properties and on their capacity to create uniform distributions of dots. In the mask design, we take advantage of these properties to optimize the distributions of minority pixels in the binary patterns of the mask. By introducing a modification of Lloyd's algorithm, this optimization can be controlled as a function of the gray level, yielding binary patterns of even quality across the complete gray scale.
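For intuition, the following is a plain (unmodified) Lloyd iteration on a discrete pixel grid; the dissertation's gray-level-dependent modification is not reproduced here, and the sketch ignores the wrap-around boundary normally used for periodic masks:

import numpy as np

def lloyd_iterations(points, shape, iters=20):
    """Plain Lloyd's algorithm: assign every pixel site to its nearest
    point (a discrete Voronoi tessellation), then move each point to
    the centroid of its region. Iterating drives the points toward a
    centroidal Voronoi tessellation, i.e. a spatially uniform dot
    distribution."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    sites = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    points = np.asarray(points, dtype=float)
    for _ in range(iters):
        # Squared distances from every site to every point.
        d2 = ((sites[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(len(points)):
            region = sites[labels == j]
            if len(region) > 0:
                points[j] = region.mean(axis=0)
    return points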

5.2.1 Blue noise pattern design based on centroidal Voronoi tessellations

The blue noise model defined by Lau and Ulichney in [33] established that the

mean distance between minority pixels in an ideal blue noise pattern is determined by the principal wavelength

\lambda_b =
\begin{cases}
1/\sqrt{g} & \text{for } 0 < g \le 1/4\\
2 & \text{for } 1/4 < g \le 3/4\\
1/\sqrt{1-g} & \text{for } 3/4 < g \le 1
\end{cases}

where g denotes the gray level of the pattern.

P\left(Y^{out}_{i,j} > t_{i,j} \mid q_{i,j} = 0\right) = P(\alpha > t_{i,j})\,p_c + P(Y_{i,j} > t_{i,j})(1 - p_c) \qquad (B.6)

and

P\left(Y^{out}_{i,j} < t_{i,j} \mid q_{i,j} = 1\right) = P(\beta < t_{i,j})\,p_c + P(Y_{i,j} < t_{i,j})(1 - p_c). \qquad (B.7)

As before, Y_{i,j} is not considered a random variable here, but rather the average value of the image luminance over the local window. Using (B.6) and (B.7) and substituting into (7.1), we obtain the probability of binarization error

P_{err} = p_c\left[p_0 P(t_{i,j} < \alpha) - p_1 P(t_{i,j} < \beta)\right] + (1 - p_c)(p_0 - p_1)P(t_{i,j} < Y_{i,j}) + p_1 \qquad (B.8)

as a function of the cumulative distribution of t_{i,j}. Substituting the expression in (B.5), it is possible to write the probability of error as a function of F(y), which includes the distribution of the noise and the average image values:

P_{err} = p_c \sum_{k=0}^{n} w_k\left(p_0 F(\alpha - t_k) - p_1 F(\beta - t_k)\right) + (1 - p_c)\sum_{k=0}^{n} w_k (p_0 - p_1) F(Y_{i,j} - t_k) + p_1

where for compactness we denote w_k = \binom{n}{k} p_1^k p_0^{n-k}. The probability of error depends on the distribution of the image, the QR code, the transformation parameters α, β, p_c, and the noise level. This dependence is reflected in the location of the discrete shifting parameters t_k, which are a function of α and n, as well as in the binomial weights w_k, which also depend on n = p_c(64 × 25).
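A hedged numerical sketch of this expression: it assumes zero-mean Gaussian noise (so F is the normal CDF with standard deviation sigma), assumes p_0 = 1 − p_1 (so that w_k is a binomial pmf), and takes the shift parameters t_k as given inputs, since their definition via (B.5) is not reproduced here:

import numpy as np
from scipy.stats import binom, norm

def perr(alpha, beta, pc, Y_avg, tk, p0, p1, sigma):
    """Evaluates the closed-form Perr above for Gaussian noise.
    tk: array of shift parameters t_0, ..., t_n (taken as given)."""
    tk = np.asarray(tk, dtype=float)
    n = len(tk) - 1
    k = np.arange(n + 1)
    wk = binom.pmf(k, n, p1)   # = C(n,k) p1^k p0^(n-k) when p0 = 1 - p1
    F = lambda y: norm.cdf(y, loc=0.0, scale=sigma)
    modified = np.sum(wk * (p0 * F(alpha - tk) - p1 * F(beta - tk)))
    unmodified = np.sum(wk * (p0 - p1) * F(Y_avg - tk))
    return pc * modified + (1 - pc) * unmodified + p1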
