Image Specific Color Representation: Line Segments in the RGB Histogram


M.Sc. Thesis submitted by: Ido Omer

School of Computer Science and Engineering, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel

This work has been carried out under the supervision of Professor Michael Werman at the School of Computer Science and Engineering, The Hebrew University of Jerusalem.

Abstract

Color representation is a problem of significant importance in the fields of computer vision and image processing. Traditionally, many computer vision and image processing tasks were developed for gray level images. In recent years, as both the means of obtaining digital color images and the computational power needed to process such images have become more available, these algorithms have been adapted for color images. Unlike gray level images, where the way of representing a pixel's gray level is very intuitive, a good representation for a pixel's color is a problem yet to be solved. Throughout the years, many color spaces have been suggested as most appropriate for computer vision tasks, yet none of them has proven to be so. The major drawback of all of these methods is that they all hold an implicit assumption that color is preserved, up to a linear transformation, through the process of image capturing. In practice, this assumption is not valid, and different color sensors distort the color information in various nonlinear ways.

Most images used for computer vision and image processing tasks are captured using digital cameras. In this work we investigated the color distortion created by digital camera sensors and attempted to compensate for it by using our color representation. Our research revealed that one general model for this distortion does not exist, and that for each combination of camera, scene and illumination conditions a different distortion takes place. Nevertheless, we did manage to model the nature of the color distortion and, given an image taken by a digital camera, we created a method of recovering the color model that best describes the image.

First we describe the nature of the color distortion introduced by CCD sensors and construct a method for recovering a color model given an image. Then the advantages of this image specific color model over traditional general color models are discussed in the context of image segmentation, image compression and other computer vision and image processing applications. Finally, we summarize the work done and its contribution to computer vision, followed by suggestions for further work in this area.

Contents

1 Overview
  1.1 Thesis Organization
  1.2 Motivation
  1.3 Means

2 Color
  2.1 Background
  2.2 Color Lines Model
    2.2.1 Image Capturing
    2.2.2 Modeling Color

3 Applications
  3.1 Segmentation
    3.1.1 Background
    3.1.2 Implementation
    3.1.3 Segmentation Results
  3.2 Compression
  3.3 Saturated Color Correction
  3.4 Color Correction
  3.5 Noise Reduction
  3.6 Color Editing

4 Conclusion and Future Work
  4.1 Summary and Conclusions
  4.2 Future Work

Chapter 1

Overview

1.1 Thesis Organization

The thesis is divided into the following chapters:

Overview: This chapter describes the thesis structure, the motivation for researching color and the means by which this work has been carried out.

Color: The color chapter gives a general survey of color and its use in computer vision, and then describes the physical justification for the newly proposed color model and the model's details.

Applications: The applications chapter describes several applications that utilize the image specific color representation and shows the advantage of this model in the context of these applications.

Conclusion and Future Work: The last chapter summarizes the work done and its importance to computer vision. It then suggests possible future work in this area.

1.2 Motivation

The problem of color representation affects almost every field of computer vision. Many ways have been suggested for modeling and representing colors, yet none of them has proven to be better than all others. Many computer vision applications depend on a good color representation in order to achieve good results; of these applications, the most important is probably image segmentation. Throughout the years, many works in computer vision have addressed the geometric distortion an image undergoes due to the camera's optical system [1, 2]. Nevertheless, when considering the color information, previous works addressing the problem of color representation all held the implicit assumption that colors in the image had, at most, undergone a linear transformation by the capturing device.


In practice, this assumption is not valid, and different means of image capturing usually distort color information in a nonlinear way. In this work we focused on images taken by typical consumer digital cameras and tried to model the color distortion created by their sensors, in order to create a color representation that would be more suitable for images taken by such cameras.

1.3 Means

The work took place at the Computer Vision Lab in the School of Computer Science and Engineering at the Hebrew University of Jerusalem. The test images were mostly taken using Canon Optura and Sony TRV10E video camcorders, with the exception of a few images taken by the Sony DCR-VX2000 high-end video camcorder. The software development was done in Matlab and Java. The code for the software can be found at www.cs.huji.ac.il/~idom along with some of the test images used.

Chapter 2

Color

2.1 Background

The color reflected from an object's surface under a specific light is a continuous function that can be arbitrarily complicated and is hard to model. The human visual system, however, samples this function using three types of 'sensors' called cones; each type of cone is sensitive to different wavelengths. The different cones have peak responses at 560nm, 530nm and 420nm, and they are called either red, green and blue cones, or long, medium and short cones, respectively. The different sensors' responses can be seen in figure 2.1.

Figure 2.1: The cones' response to different wavelengths.

The way we experience color is caused by the vision system responding differently to different wavelengths of light. The response of each sensor can be weak or strong, but it can't provide information about the wavelength of the light falling on it. This experimental result is called the principle of univariance, and it provides us with a simple and useful model of the human reaction to color. As a result, in order to simulate the response of the human visual system to the light reflected from a certain object under certain conditions, one only has to provide a stimulus which would produce the same response in all three sensors, regardless of the actual wavelength of the light falling upon them. If our goal is to provide the viewer with an image that would look like a good reconstruction of a scene from the world, we don't have to sample the scene in a perfect way. There is no need to describe the complex reflection function of each object in detail; instead, we need to sample the scene using sensors that are as similar as possible to the sensors of the human visual system. We should then display the image, reconstructing the colors in the ranges of these three wavelengths alone. It is mainly for this reason that the RGB color model is widely used for image capturing and displaying. Although the RGB color space is highly suitable for image capture and display, it is not considered an appropriate color space for computer vision and image processing tasks.


This is mainly due to the strong correlation between the three coordinates in that space. This correlation creates a coordinate system in which dark green is much 'closer' to dark red than it is to light green. Throughout the years, the problem of finding a good color representation for computer vision and image processing tasks has drawn a lot of attention. Many ways have been suggested for modeling and representing colors. Different color spaces have been suggested in order to separate color from intensity and create a more intuitive color representation. These color spaces can be divided into two main groups.

The first group consists of color spaces that are a linear transformation of the RGB color space, and are therefore usually called linear color spaces. The most commonly used among these is the YUV color space and its variants, which try to linearly separate the information into an intensity coordinate (Y) and two color coordinates (U & V); one common variant is written out below. These color spaces are widely used for television broadcasting and for compression; the JPEG file format, for example, converts the image to YUV and then applies different compression to each channel. Another widely used linear color space is the CIE-XYZ color space, which uses different basis functions than the RGB color space and is thereby able to describe a larger portion of the visible spectrum without the need to use negative values (color subtraction). This color space was suggested by the CIE (Commission Internationale de l'Eclairage) as a colorimetry standard, and is widely used in vision and graphics textbooks; however, it is usually regarded as out of date, with the exception of being used as a basis for the CIE-LAB and CIE-LUV color spaces.
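For concreteness, the linear separation mentioned above for YUV can be written explicitly; the ITU-R BT.601 coefficients (given here purely as an illustration of the family, not necessarily the exact variant referred to above) are:

    Y = 0.299 R + 0.587 G + 0.114 B
    U = 0.492 (B - Y)
    V = 0.877 (R - Y)

Intensity is carried by Y alone, while U and V encode only the color differences.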


The second group consists of color spaces which are a nonlinear transformation of the RGB color space, and are therefore usually called nonlinear color spaces. The CIE-LAB and CIE-LUV color spaces separate color into one illumination coordinate and two color coordinates, and try to create a perceptually uniform color system [3, 4]. These color spaces provide a good separation between color and intensity information, and are widely used for image segmentation and image processing. The HSV color space and its variants try to decorrelate the information by separating it into hue, saturation and value [3]. The hue stands for the color, the saturation stands for color saturation (how far the color is from gray), and the value represents the intensity. This color space is widely used for image segmentation and other vision and graphics tasks. It provides a reasonably good separation between intensity and color at a lower computational cost than the CIE-LAB and CIE-LUV color spaces. Attempts have also been made to find application specific color spaces. A good example of this is the I1I2I3 color space suggested by Ohta et al. [5] as most appropriate for segmentation tasks.

All these color representations do not take into account the color distortion caused by digital cameras, and assume perfect color preservation by the camera. Unlike lens and geometric distortion, digital camera color distortion is a topic that has rarely been researched. In my thesis I suggest a way of modeling color that is image specific. This method tries to minimize the effect of the color distortion caused by the camera and is therefore more suitable for real images. I outline the advantages of this model for image segmentation and present different applications that make use of such segmentation. I show that this model is both simple and strong, and thus yields very good results for segmentation while enabling simple and efficient color manipulation.


2.2 Color Lines Model

Until recently, digital camera sensors had a very nonlinear response to illumination. The histogram of an image taken by such a camera was usually very noisy, and it was hard to identify clear structures in it. The advancement of digital technology has given camera sensors a better and more linear response. When looking at the RGB histogram of an image taken by a modern camera, it is possible to see clear structures, as in figure 2.2. In this section I explain the nature of the structures that can be found in the RGB histogram of an image and create a model that describes these structures.

Figure 2.2: An RGB histogram.

2.2.1 Image Capturing

A color image is a function of many parameters, the most important of which are: source light color, source geometry, scene and object geometry, object albedo and camera parameters (both geometric and light sensitivity parameters). Assuming our world consists only of Lambertian objects, we can divide these parameters into two groups. The first group consists of geometric parameters, which have no effect on the wavelength of the light which leaves the object's surface; the geometric parameters affect only the intensity of the light reaching the camera and not its R, G, B ratio. The second group includes the source light color, the object's surface color and the camera sensors' response to light, which are the main factors affecting color. If we ignore the first group, which changes the intensity alone and has no effect on color, we are left with the second group of parameters.


Assuming we have a camera with a linear sensor response, all pixels belonging to the same object with homogeneous color lie on a straight line through the origin in the RGB histogram. This is in accordance with the laws of colorimetry [6]. This model yields the following formulation, which is widely used for describing the RGB values of each pixel:

    R = \int \rho_R(\lambda) L(\lambda) \, d\lambda
    G = \int \rho_G(\lambda) L(\lambda) \, d\lambda        (2.1)
    B = \int \rho_B(\lambda) L(\lambda) \, d\lambda

where \rho_R, \rho_G and \rho_B are the sensors' responses to the incoming radiance L(\lambda), and \lambda is the light's wavelength. According to this model, an efficient and accurate way of separating color information from intensity is very simple: instead of representing each point in the RGB cube using its R, G and B coordinates, one only has to change the representation to a vector through the origin, using two spherical angles and a norm. This model is called the Normalized RGB (Nrgb) color model. In practice the Nrgb color space doesn't yield good results for image segmentation and other computer vision and image processing tasks. Some of the reasons for this are found within the scene itself: specularity, reflectance and inconsistent light and object colors. However, another important reason is camera inaccuracy. Camera sensor inaccuracy can be divided into three main types:

- In low intensities the sensors tend to be nonlinear.
- In high intensities the sensors reach saturation.
- The camera samples the world in a noisy manner.

Despite these inaccuracies, modern cameras have an almost linear response to light in a wide range of intensities. In this range, pixels belonging to an object with homogeneous color align roughly to the same line in the RGB color space, although this line doesn't necessarily pass through the origin. The reason for this is the sensors' response to light, which can't be modelled using a linear function [7]. Two different phenomena break the linear model: saturation and cutoff. As a rule of thumb, the better the camera, the larger its linear response range and the better it fits this model. A typical sensor response is shown in figure 2.3.

Figure 2.3: A typical CCD sensor response.


An immediate result of this phenomenon is the fact that we can't model the color of a pixel using equation 2.1, and we should change the equations as follows:

    R = \min(\max(\int \rho_R(\lambda) L(\lambda) \, d\lambda, \; c_R), \; s_R)
    G = \min(\max(\int \rho_G(\lambda) L(\lambda) \, d\lambda, \; c_G), \; s_G)        (2.2)
    B = \min(\max(\int \rho_B(\lambda) L(\lambda) \, d\lambda, \; c_B), \; s_B)

where c_R, c_G and c_B are the cutoff values of the R, G and B sensors, and s_R, s_G and s_B are their saturation values. Figure 2.4 shows two images of the same scene (objects and lighting) taken by two different cameras, and the lines best describing their histograms. We see that not only do the lines not intersect the origin, but different lines are produced by different cameras, depending on the camera's sensors and the scene (even two images from the same camera under different lighting conditions will qualitatively differ). As a result, images taken by such cameras don't preserve the laws of colorimetry [6]. We therefore can't create a single representation of color that is accurate for every image; instead, we should analyse each individual image in order to find the color representation that fits it best.
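To make the effect of equation 2.2 concrete, here is a minimal simulation sketch (not code from this work; the albedo, cutoff and saturation numbers are made up):

// A minimal sketch of the sensor model in equation 2.2: a single Lambertian
// surface color is scaled by a shading factor, and each channel's ideal
// linear response is then clamped by per-channel cutoff and saturation.
// All numbers below are made-up, illustrative values.
public class SensorModelDemo {

    // Clamp an ideal linear response into the sensor's working range (eq. 2.2).
    static double clamp(double linear, double cutoff, double saturation) {
        return Math.min(Math.max(linear, cutoff), saturation);
    }

    public static void main(String[] args) {
        double[] albedo     = {0.9, 0.6, 0.2};   // hypothetical surface color
        double[] cutoff     = {12, 10, 14};      // c_R, c_G, c_B (assumed)
        double[] saturation = {250, 252, 248};   // s_R, s_G, s_B (assumed)

        // Vary only the shading (geometric) factor. Under the ideal model of
        // equation 2.1 the samples would lie on a line through the origin.
        for (int step = 0; step <= 14; step++) {
            double shading = step / 10.0;
            double[] rgb = new double[3];
            for (int ch = 0; ch < 3; ch++) {
                double linear = 255 * shading * albedo[ch];
                rgb[ch] = clamp(linear, cutoff[ch], saturation[ch]);
            }
            System.out.printf("shading %.1f -> RGB (%.0f, %.0f, %.0f)%n",
                              shading, rgb[0], rgb[1], rgb[2]);
        }
        // At low shading every channel sticks at its cutoff, and at high
        // shading the red channel saturates first, so the samples trace line
        // segments that do not pass through the origin.
    }
}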


Figure 2.4: Figure (a) is an image taken with a Sony TRV10E. Figure (b) is an image of the same scene taken with a Canon Optura. Figures (c) and (d) show 2D views of the best lines describing the histogram of the above images.


Figure 2.5: Color line consisting of 5 color segments.

2.2.2 Modeling Color

According to the physical camera model, color should be modelled using a general affine line in 3D rather than a line through the origin. In the case of an image with saturated colors, we link two or even more color segments together in order to create a single model that describes the saturated, non saturated and low intensity pixels of the same object, which has a single color in the world. We call a group of one or more linked color segments a color line. Up to two color segments can be used to describe the saturated object's color (once the third sensor becomes saturated, we lose all color information and can no longer recover any color information from the histogram). Theoretically, two more color segments can be used to describe the intensities below the cutoff values; in practice though, these regions of the histogram are usually very dense and it is difficult to separate the different segments. Figure 2.5 shows a color line consisting of 5 color segments. Cases where a saturated color segment is linked to the main one in order to describe an object's color are quite common, whereas cases with more than two segments are rare.

We suggest that modeling color as color lines in the RGB color space is a better color model than other linear or nonlinear color models for images taken by digital cameras. The model is useful for overcoming digital camera inaccuracy. In order to compare different color models, we manually segmented an image into its components. We then chose the best model for each component's color: in the I1I2I3, HSV and CIE-LAB color spaces the model is a point in 2D; for the Nrgb color space it is a line through the origin in 3D (or a point in 2D); for our model it is a color line. We then attributed each pixel to its closest model according to the different color representations. In this way we tried to use color information alone for the segmentation task and neglect all other segmenter specific parameters.


The results, seen in figure 2.6, show that our color representation modeled the color better than the other color representations. In practice, the recovery of the color model from an image is done in one of two ways.

Figure 2.6: (a) Original image; (b-f) segmentation according to (b) our color lines model, (c) HSV model, (d) CIE-LAB model, (e) I1I2I3 model, (f) Nrgb model.

The first way is through image segmentation; this method is discussed in detail in the next chapter. The second way is recovering the model directly from the histogram. Although searching the RGB histogram for general 3D lines without making any assumptions is computationally very expensive and the process is prone to errors due to local minima, the fact that we have strong knowledge of the expected nature of these lines makes it feasible, and good results can be achieved at a very low computational cost. I implemented a tool for recovering the color lines directly from the histogram by slicing the histogram at different distances from the origin, then searching for the local maxima in each slice; these maxima are points along the model lines.


Finally, the points are matched together to form lines according to the color lines model. This algorithm works well even for complicated, highly textured scenes. The results of this algorithm can be seen in figure 2.7. Although this color representation is stronger than the more common representations, as shown above, it is still a very simple one: it is very easy to manipulate a pixel's color by manipulating the color line, as can be seen in the following sections.
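As an illustration of the slicing procedure described above, here is a minimal sketch (not my actual implementation; the bin size, shell width and matching radius are made-up parameters). It slices a quantized RGB histogram into shells of increasing distance from the origin, finds the local maxima in each shell, and chains nearby maxima into candidate color lines:

import java.util.ArrayList;
import java.util.List;

public class ColorLineRecovery {
    static final int BINS = 64;              // histogram bins per channel (assumed)
    static final double SHELL_WIDTH = 8.0;   // thickness of each histogram slice
    static final double MATCH_RADIUS = 12.0; // max jump between consecutive maxima

    // Recover candidate color lines as chains of slice maxima.
    public static List<List<double[]>> recoverLines(int[][][] hist) {
        List<List<double[]>> lines = new ArrayList<>();
        int shells = (int) (Math.sqrt(3) * 255 / SHELL_WIDTH) + 1;
        for (int s = 0; s < shells; s++) {
            for (double[] peak : shellMaxima(hist, s)) {
                // Extend the existing chain whose tail is nearest to this peak.
                List<double[]> best = null;
                double bestDist = MATCH_RADIUS;
                for (List<double[]> line : lines) {
                    double d = dist(line.get(line.size() - 1), peak);
                    if (d < bestDist) { bestDist = d; best = line; }
                }
                if (best == null) { best = new ArrayList<>(); lines.add(best); }
                best.add(peak);
            }
        }
        return lines; // each chain can then be fit with 3D line segments
    }

    // Local maxima of the histogram whose bin centers fall in a given shell.
    static List<double[]> shellMaxima(int[][][] hist, int shell) {
        List<double[]> maxima = new ArrayList<>();
        double scale = 256.0 / BINS;
        for (int r = 0; r < BINS; r++)
            for (int g = 0; g < BINS; g++)
                for (int b = 0; b < BINS; b++) {
                    double[] c = {(r + 0.5) * scale, (g + 0.5) * scale, (b + 0.5) * scale};
                    double d = Math.sqrt(c[0] * c[0] + c[1] * c[1] + c[2] * c[2]);
                    if ((int) (d / SHELL_WIDTH) != shell || hist[r][g][b] == 0) continue;
                    if (isLocalMax(hist, r, g, b)) maxima.add(c);
                }
        return maxima;
    }

    static boolean isLocalMax(int[][][] h, int r, int g, int b) {
        for (int dr = -1; dr <= 1; dr++)
            for (int dg = -1; dg <= 1; dg++)
                for (int db = -1; db <= 1; db++) {
                    int rr = r + dr, gg = g + dg, bb = b + db;
                    if (rr < 0 || gg < 0 || bb < 0 || rr >= BINS || gg >= BINS || bb >= BINS)
                        continue;
                    if (h[rr][gg][bb] > h[r][g][b]) return false;
                }
        return true;
    }

    static double dist(double[] a, double[] b) {
        double dr = a[0] - b[0], dg = a[1] - b[1], db = a[2] - b[2];
        return Math.sqrt(dr * dr + dg * dg + db * db);
    }
}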

Figure 2.7: (a) Original image containing 84825 different colors. (b) Segmented image containing 38 different colors. (c) Original image containing 77170 different colors. (d) Segmented image containing 43 different colors. We can see that, in spite of the complex nature of the images, we achieved a segmentation containing a small number of different colors.

Chapter 3

Applications

This newly proposed color representation approach has several advantages over other common ones: representing the color using color lines results in a model that is strong enough to define color and at the same time is very compact and easy to manipulate. Many applications can utilize this representation for compression or for various types of color manipulation, such as color correction, enhancing the illumination, changing color saturation and even replacing the colors of entire objects.

3.1 Segmentation

3.1.1 Background

Image segmentation is the process of dividing an input image into different homogeneous regions. It is the first step in almost every image analysis and pattern recognition process, and is of importance for many image processing tasks as well. Color image segmentation has been drawing an increasing amount of attention in recent years, mainly due to the following reasons: (1) color images contain a larger amount of information than gray level images and can therefore yield better segmentation results; (2) computing power is increasing rapidly and PCs can handle this larger amount of information in a reasonable time. It is because of these reasons that many attempts have been made throughout the years to find the best color space for this task, yet no one model has proven to be superior [11, 12]. In my work I focused on image segmentation using clustering techniques; however, I would like to begin by mentioning a few other segmentation approaches:

Edge Detection: These techniques were very popular for gray level images and have been extended to color images as well. In this approach an edge map of the image is calculated, and segments are then constructed according to it.


Two pixels belong to the same segment iff there is a path between them that doesn't cross an edge. Examples of using edge detection techniques for color image segmentation can be found in [14, 15].

Histogram Thresholding: Like edge based techniques, histogram thresholding is an approach widely used for gray level image clustering. It assumes the image to be composed of regions with different gray level (color) ranges, and splits the histogram into a number of peaks, each corresponding to one region. The extension of this approach to color is not as trivial as in the case of edge detection techniques and is more computationally expensive, yet several attempts at doing so have been made [16].

Region Based: In the region growing approach, a seed for a region is first selected and then expanded to include all homogeneous neighbors. This process is repeated until no unclassified pixels are left. In region splitting, the whole image is selected as a seed; it is then repeatedly divided to form homogeneous regions (the division is usually into four equal parts). Some region based techniques combine region growing and region splitting in order to yield a better result. Examples of region based segmentation algorithms can be found in [17, 18].

Fuzzy and Probabilistic Techniques: Some image segmentation techniques allow some degree of freedom in the pixel classification using a fuzzy or probabilistic approach; the uncertainty can then be resolved using neighbor and region information. Image segmentation using fuzzy techniques has been suggested by [19, 20].

I will now focus on segmentation using clustering techniques. Color image segmentation is often viewed as a clustering problem in a 5 dimensional space or higher, where the pixels are projected into a feature space containing spatial features (the pixel's location in the image), color features (color space coordinates) and possibly other features (for example, the gradient value) [13]. Many different approaches have been suggested for the actual clustering method. The two most natural paradigms are the divisive algorithms and the agglomerative ones. The divisive approach treats the whole data set (image) as a single cluster and then recursively splits the data to yield a good clustering. The agglomerative approach treats each data unit (pixel) as a cluster and then tries to merge similar clusters until a good clustering is achieved. An example of a widely used agglomerative segmentation algorithm is the mean shift algorithm. A popular segmentation-by-clustering technique is K-means (a minimal sketch is given below). The K-means algorithm assumes the data can be divided into K different clusters, where K is known and each cluster has a center. It starts by randomly choosing K cluster centers; it then allocates each point to its nearest cluster center and recomputes new cluster centers. The algorithm is guaranteed to converge to a local minimum.


Other clustering algorithms include graph theory based algorithms. These algorithms build a full graph where each node represents an image pixel, and set the weight of each edge to a similarity measure between the features of the two pixels it connects (a high edge weight indicates similar pixels). The graph is then separated into strongly connected components.
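For reference, a minimal K-means sketch over per-pixel feature vectors (for example x, y and three color coordinates, as in the 5 dimensional clustering described above); K, the iteration count and the random seed are assumed parameters:

import java.util.Random;

public class KMeans {
    public static int[] cluster(double[][] points, int k, int iters, long seed) {
        int n = points.length, dim = points[0].length;
        double[][] centers = new double[k][];
        Random rnd = new Random(seed);
        for (int c = 0; c < k; c++)
            centers[c] = points[rnd.nextInt(n)].clone();   // random initial centers
        int[] label = new int[n];
        for (int it = 0; it < iters; it++) {
            // Assignment step: attach each point to its nearest center.
            for (int i = 0; i < n; i++) {
                double best = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = 0;
                    for (int j = 0; j < dim; j++) {
                        double diff = points[i][j] - centers[c][j];
                        d += diff * diff;
                    }
                    if (d < best) { best = d; label[i] = c; }
                }
            }
            // Update step: recompute each center as the mean of its points.
            double[][] sum = new double[k][dim];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[label[i]]++;
                for (int j = 0; j < dim; j++) sum[label[i]][j] += points[i][j];
            }
            for (int c = 0; c < k; c++)
                if (count[c] > 0)
                    for (int j = 0; j < dim; j++) centers[c][j] = sum[c][j] / count[c];
        }
        return label;   // each entry is the cluster index of the corresponding pixel
    }
}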

3.1.2 Implementation

In order to show the advantages of our color model, we implemented a simple agglomerative image segmenter. The segmenter does not depend on a previous analysis of the image and in fact creates the color model during the segmentation process. Searching the RGB color space for line-like clusters without any assumptions about the lines' structure is a difficult (NP-complete) problem. Approximations like the Hough transform and RANSAC [8, 9] are also computationally expensive. The large amount of noise in the histogram domain does not make the search any easier. This noise is the result of various causes, some of which are easy to handle. A good example is the areas along edges between different objects, where the camera interpolates the colors of the objects. As a result of this interpolation we have a thin membrane connecting the clusters, creating a planar cluster rather than two separated line-like clusters. Either ignoring pixels along edges or cleaning the histogram using filtering can handle this. Specularity is yet another problem that changes the object's color, which can also be handled through special techniques [10]. Another difficulty in finding the best lines describing the histogram is due to the fact that in most images the colors tend to be very grayish, and the histogram points are grouped in a small region along the (0,0,0) - (255,255,255) line. For these reasons, and in order to minimize the effects of noise within the histogram space, we decided to begin by searching the image space locally for line-like clusters in the histogram, and then combine clusters from different image regions. We also used prior knowledge about the expected color line orientation and ruled out candidate lines with a low orientation score. The segmentation algorithm is outlined in figure 3.1.

Theoretically, it is possible to adjust almost any given color segmenter to work with this new representation of color: instead of using a color point, or a restricted color line as in the case of Nrgb, the color line representation can be used to achieve better results. The drawback is that searching for line-like clusters in 3D is computationally more expensive than searching for clusters in 2D.

3.1.3 Segmentation Results

We compared the results produced by our color lines segmenter to those of a mean shift image segmenter working in the LUV color space, downloaded from [21]. Our segmenter was found to be much less sensitive to illumination changes while still not grouping differently colored pixels.

1.  I <- I \ (E ∪ S);
2.  for each p ∈ I {
3.      c = Cluster(p);
4.  }
5.  for each c ∈ C {
6.      for each n ∈ N(c) {
7.          if (similarity(c,n) < T) {
8.              merge(c,n);
9.          }
10.     }
11. }
12. globalMerge();

where:
I - the image's pixels.
E - the edge pixels.
S - the saturated pixels.
C - the group of clusters.
N(c) - the neighbours of cluster c.
T - the similarity threshold.

Figure 3.1: The segmentation algorithm.

As a result, although it segmented the image into a similar number of segments as the mean-shift segmenter, the segments produced by our segmenter are superior, as can be seen in figures 3.2 and 3.3.


Figure 3.2: (a) An image captured using a digital camera. (b) Using a mean shift segmenter working in the LUV color space - 9 segments found. (c) Using our color lines segmenter - 10 segments found. It is easy to see that our segmenter is more sensitive to changes in color and less sensitive to changes in illumination.


Figure 3.3: (a) An image captured using a digital camera. (b) Using a mean shift segmenter working in the LUV color space - 8 segments found. (c) Using our color lines segmenter - 3 segments found. Here again we can see that our segmenter managed to ignore intensity changes and to segment the image according to its colors (another segment was created due to specularity).


3.2 Compression

Modeling a pixel's color as a point along a color line in RGB creates a compact representation of the color information in the image. Rather than using three numbers to represent color, it can now be represented using only two: a pointer to a color line and a norm along this line. This can be used to create an image format that uses a lookup table, like the familiar GIF or PNG image formats [22, 23], but stores color lines in this table instead of color points. Two numbers are stored to represent each pixel: a pointer to an entry in the table (a color line), and a norm along the line. Thus, we are able to store a small table of color lines that nevertheless represents a large number of different RGB triplets. A typical scene is usually segmented into 6-20 different color lines, so the table's size is usually very small. An experimental implementation of this new image format yielded good results and made clear its main advantage over the PNG and GIF formats: better and more efficient usage of the local properties of an image. Neighboring pixels that are classified as having the same color, but are of different intensities, have an identical color line pointer and differ only in their norm, according to the differences in their intensities. This difference is usually small, since illumination changes are typically gradual. In the PNG or GIF formats only one number is needed to represent a pixel's color; however, since two similar shades of the same color can be assigned any two numbers, local similarity in color between pixels cannot be used to full advantage. We compared this image format with the GIF image format. For natural images our format usually gives results with a larger number of colors (RGB triplets), uses fewer bytes and is closer to the original image (LSE), as can be seen in figure 3.4. For synthetic images, where we have no noise or sampling errors, this method can be exploited to its fullest: we can compress files up to 4 times better than GIF, sometimes even obtaining smaller files than JPEG, and still get better visual results. Nevertheless, this format only compresses the color representation, and for most natural images gives results which are poorer than those of the JPEG format, which uses a more advanced approach [22].
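The per-pixel encoding can be sketched as follows (a minimal illustration, not the actual file format; the ColorLine record, the clamping and the 8-bit quantization of the norm are my illustrative assumptions):

public class ColorLineCodec {
    // A color line segment from p0 to p1 in RGB space.
    record ColorLine(double[] p0, double[] p1) {}

    // Encode one pixel: given the line chosen by the segmentation, store the
    // line's table index and the pixel's projected position along it in 0..255.
    static int[] encode(double[] rgb, ColorLine line, int lineIndex) {
        double[] d = new double[3];
        double len2 = 0;
        for (int i = 0; i < 3; i++) {
            d[i] = line.p1()[i] - line.p0()[i];
            len2 += d[i] * d[i];
        }
        double t = 0;                        // projection parameter in [0, 1]
        for (int i = 0; i < 3; i++) t += (rgb[i] - line.p0()[i]) * d[i];
        t = Math.min(1, Math.max(0, t / len2));
        return new int[]{lineIndex, (int) Math.round(t * 255)};
    }

    // Decode one pixel back to RGB from (line index, quantized norm).
    static double[] decode(int[] code, ColorLine[] table) {
        ColorLine line = table[code[0]];
        double t = code[1] / 255.0;
        double[] rgb = new double[3];
        for (int i = 0; i < 3; i++)
            rgb[i] = line.p0()[i] + t * (line.p1()[i] - line.p0()[i]);
        return rgb;
    }
}

Because neighboring pixels on the same object share a line index and have slowly varying norms, both stored numbers are locally smooth and compress well with standard entropy coding.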


Figure 3.4: (a) Original image. (b) Details of the GIF format. (c) Details of our format. In our format the total number of colors in the image is larger and the result is in general visually better (and the LSE is better), yet our file size is smaller. It is also possible to see that in our format the errors are concentrated along edges and we can reduce the errors by interpolating pixels’ color along edges between different segments without storing any additional information.

Figure 3.5: (a) Original image. (b) Segmentation according to our affine line segments model. (c) Segmentation according to our model; this time saturated pixels were colored according to their saturation color.

3.3 Saturated Color Correction

Another application of this method is correcting the color of saturated image pixels. The dynamic range of a typical natural or artificial scene is usually larger than the dynamic range that the camera's sensors can capture. As a result, in many pictures some of the pixels have at least one saturated color component. In the histogram domain this phenomenon appears in the form of a knee in the color cluster's line; the line looks as if it has been projected onto the RGB bounding box. By modeling the color as color lines, it is easy to classify saturated and non saturated pixels as belonging to the same object, as shown in figure 3.5. This not only achieves a better segmentation by classifying saturated pixels into the correct clusters, but also allows us to correct the color of these pixels. We correct the saturated component so that the two segments are collinear (we can even use one non saturated component to correct the other two). The color correction results are shown in figures 3.6 and 3.7. This method works when we model the color as lines through the origin as well (Normalized RGB segmentation), but the results will only be as good as the segmentation, and currently the Normalized RGB color model is not an accurate model for images taken by digital cameras. By correcting the saturated pixels we in fact increase the dynamic range of the image, therefore making it unsuitable for direct presentation on typical display devices. In order to readjust the dynamic range we can use gamma correction or other methods for high dynamic range compression [24]. Simply rescaling the colors will usually create an image that is significantly darker than the original image and therefore yields poor results.
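The collinearity correction can be sketched as follows (a minimal illustration, not the thesis code; the saturation threshold of 250 and the point-plus-direction line parametrization are assumptions):

public class SaturationCorrection {
    // p and dir describe the recovered (non-saturated) color line segment.
    static double[] correct(double[] rgb, double[] p, double[] dir) {
        // Estimate the position t along the line using only the channels
        // that are not saturated (least squares over those channels).
        double num = 0, den = 0;
        for (int i = 0; i < 3; i++) {
            if (rgb[i] >= 250) continue;        // skip saturated channels
            num += (rgb[i] - p[i]) * dir[i];
            den += dir[i] * dir[i];
        }
        double t = den > 0 ? num / den : 0;
        double[] out = rgb.clone();
        for (int i = 0; i < 3; i++)
            if (rgb[i] >= 250)                  // extrapolate the clamped channel
                out[i] = p[i] + t * dir[i];     // may exceed 255: a higher dynamic range
        return out;
    }
}

The extrapolated values can exceed 255, which is exactly why the corrected image needs gamma correction or another dynamic range compression step before display.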


Figure 3.6: (a) Saturated image. (b) Using Gamma correction. (c) Correcting saturated pixels and rescaling the colors. (d) Correcting saturated pixels and using gamma correction. It is possible to see that in figures (c) and (d) the saturated (yellowish) pixels in the left part of Pinocchio are corrected but the intensity range has increased from 255 to 305 and the image in (c) is too dark. The intensity in image (d) has been corrected using gamma correction.


Figure 3.7: (a) Synthetic image. (b) Histogram of (a). (c) Same image after correcting the saturation and cutoff. (d) Histogram of (c).

3.4 Color Correction

People in the digital imaging industry are trying hard to manipulate the camera sensors' output in a way that will make the color lines pass through the origin of the RGB axes. Using this segmenter, the color lines can be corrected as a post-process by shifting each color line to the best line through the origin describing its color points, as can be seen in figure 3.7.
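A minimal sketch of one such post-process (an assumption-laden illustration, not necessarily the thesis method: it assumes the best origin-line has already been fitted and simply replaces each pixel by its norm along the recovered line times that unit direction, which moves the whole line through the origin):

public class LineShiftCorrection {
    // p is a point on the recovered color line; dir is its unit direction,
    // here reused as the direction of the fitted line through the origin.
    static double[] correct(double[] rgb, double[] p, double[] dir) {
        double t = 0;                                    // norm along the line
        for (int i = 0; i < 3; i++) t += (rgb[i] - p[i]) * dir[i];
        double[] out = new double[3];
        for (int i = 0; i < 3; i++) out[i] = t * dir[i]; // line now through origin
        return out;
    }
}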


3.5 Noise Reduction

In many cases a camera's color sampling is noisy, and when zooming into an image we can see artifacts and minor color errors that are the result of sampling errors. Nevertheless, these errors are usually small. By projecting each pixel's color onto the color line of the pixel's cluster, we can reduce this noise and create an image with cleaner colors, as shown in figure 3.8.
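The projection used here is the standard point-to-line projection: writing the cluster's color line through a point p with unit direction \hat{d}, a pixel color c is replaced by

    c' = p + \langle c - p, \hat{d} \rangle \, \hat{d}

so only the component of c along the line survives, while the off-line component, attributed to sampling noise, is discarded.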

Figure 3.8: (b) Details of figure (a). (c) The same image after projecting each pixel's color onto its cluster's color line. Much of the noise from figure (b) is removed in figure (c), for example the yellow spots in the background (best seen on screen).


3.6 Color Editing

Creating a color representation using indices to color lines and norms enables us to manipulate color very efficiently and in a very intuitive way, as can be seen in figure 3.9. We can increase or decrease the color saturation of an object in an image by moving the object's color line away from or towards the central line ([0,0,0] - [255,255,255]). We can increase or decrease objects' intensities by moving their colors up or down along their representative color lines. We can increase an object's contrast by stretching its color line, and we can change an object's color completely while preserving all its intensities by moving the whole line, thereby applying changes to big regions of the image at a very low computational cost.
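Because every edit acts on a line rather than on individual pixels, each operation is a small vector manipulation. A minimal sketch (illustrative only; the gray axis handling and the operation parameters are assumptions):

public class ColorLineEditing {
    static final double[] GRAY = {1 / Math.sqrt(3), 1 / Math.sqrt(3), 1 / Math.sqrt(3)};

    // Saturation: move the line away from (factor > 1) or towards (factor < 1)
    // the central gray line [0,0,0] - [255,255,255].
    static void scaleSaturation(double[] p, double[] dir, double factor) {
        for (double[] v : new double[][]{p, dir}) {
            double along = v[0] * GRAY[0] + v[1] * GRAY[1] + v[2] * GRAY[2];
            for (int i = 0; i < 3; i++) {
                double offGray = v[i] - along * GRAY[i];  // chromatic component
                v[i] = along * GRAY[i] + factor * offGray;
            }
        }
    }

    // Intensity: slide all of the line's pixels up or down along the line
    // by editing their norms; shifting the base point is equivalent.
    static void shiftIntensity(double[] p, double[] dir, double delta) {
        for (int i = 0; i < 3; i++) p[i] += delta * dir[i];
    }

    // Contrast: stretch the line about its base point, spreading the norms.
    static void stretchContrast(double[] dir, double factor) {
        for (int i = 0; i < 3; i++) dir[i] *= factor;
    }
}

Since the pixels store only indices and norms, none of these edits touches the per-pixel data; re-rendering the edited line table recolors entire objects at once.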

Figure 3.9: (a) Original image (segmented into 3 color lines). (b) Decreasing the color saturation. (c) Increasing the color saturation. (d) Shifting the intensities making one object brighter and the others darker. (e) Stretching the lines. (f) Changing colors.

Chapter 4

Conclusion and Future Work

4.1 Summary and Conclusions

In my thesis I have proposed a new way of representing color and argued its strong relation to the physical model of capturing images using digital cameras. The work is unique in the sense that it suggests the new idea of an image specific color model, which can help overcome the color distortion created during the image capturing process. This idea has not been suggested before, and I believe it has a strong advantage over traditional color representations for many computer vision and image processing tasks, as shown in this work. I believe that image specific color representation is a topic yet to be researched, and that it can contribute dramatically to applications which rely upon an accurate color representation. The advantage of an image specific color model over traditional color models is obvious: it creates a color model which is insensitive to color distortion which may have occurred during image capturing, and which is therefore much less sensitive to noise and outliers. The main disadvantage of this color model is that, by nature, it depends upon an analysis of the image and is therefore prone to errors.

I have also shown several applications that utilize this simple yet strong representation of color for image segmentation, compression and color enhancement. In the examples of image segmentation, image compression and noise reduction, a large variety of color models are used by existing applications; in these cases the color lines model can only improve current results. In the case of color editing, existing tools do allow such image manipulation, but usually yield poor results and exact a large computational cost. In the cases of color correction and saturated color correction, there are no existing tools which allow such image manipulation; these applications are strongly dependent upon the color lines representation.



4.2 Future Work

Due to the novel nature of this thesis, the amount of possible future work in this field is vast. Primarily, the model itself and the means of obtaining it from an image should be researched further. The algorithm for recovering the lines model from an image is yet to be perfected, and there is still work to be done in this area. I also believe that the model can be generalized to allow a better representation of the image, including the recovery of the illumination in the scene; this work is in fact currently in progress. The model can also be extended to handle specularity, an important phenomenon that has not been addressed in this work. Another direction is the creation of similar image specific color models, created to match the color distortion caused by film and possibly other types of sensors and capturing devices. Finally, the adaptation of many existing computer vision and image processing applications to the color lines model can be researched. One example is the shape from shading algorithm, where the norm along the color line should give a better indication of the 3D structure of an object than the luminance of the point. Another application is the creation of a binary image from a given color image, a problem that is defined quite well for gray level images but has no natural extension to color images. Once the color lines have been recovered, each line segment can be treated as a gray level image and the extension of the gray level case becomes trivial.

Bibliography

[1] F. Devernay and O. Faugeras. Automatic calibration and removal of distortion from scenes of structured environments. SPIE Conference on Investigative and Trial Image Processing, San Diego, CA, 1995.
[2] H. Farid and A. C. Popescu. Blind removal of lens distortion. Journal of the Optical Society of America, 2001.
[3] Color FAQ. http://www.inforamp.net/poynton/ColorFAQ.html
[4] CIE web site. http://www.cie.co.at/cie/index.html
[5] Y. Ohta, T. Kanade and T. Sakai. Color information for region segmentation. Computer Graphics and Image Processing 13 (1980) 222-241.
[6] M. Chapron. A new chromatic edge detector used for color image segmentation. IEEE International Conference on Pattern Recognition, A, 1992, pp. 311-314.
[7] S. Chen and R. Ginosar. Adaptive sensitivity CCD image sensor. SPIE 2415: CCD and Solid State Optical Sensors V, San Jose, CA, Feb. 1995.
[8] P. V. C. Hough. Machine analysis of bubble chamber pictures. International Conference on High Energy Accelerators and Instrumentation, CERN, 1959.
[9] M. A. Fischler and R. C. Bolles. Random Sample Consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Comm. of the ACM, Vol. 24, pp. 381-395, 1981.
[10] K. Schluns and A. Koschan. Global and local highlight analysis in color images. CGIP 2000.
[11] W. Skarbek and A. Koschan. Colour image segmentation - a survey. Technical University of Berlin, Oct. 1994.
[12] H. D. Cheng, X. H. Jiang, Y. Sun and J. Wang. Color image segmentation: advances and prospects. Pattern Recognition 34 (2001) 2259-2281.
[13] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach.



[14] F. Perez and C. Koch. Toward color image segmentation in analog VLSI: algorithm and hardware. International Journal of Computer Vision 12 (1) (1994) 17-42.
[15] S. Ji and H. W. Park. Image segmentation of color image based on region coherency. IEEE International Conference on Image Processing, 1998, pp. 80-83.
[16] T. Uchiyama and M. A. Arbib. Color image segmentation using competitive learning. IEEE Trans. Pattern Analysis and Machine Intelligence 16 (12) (1994) 1197-1206.
[17] R. Ohlander, K. Price and D. R. Reddy. Picture segmentation using a recursive region splitting method. Computer Graphics and Image Processing 8 (1978) 313-333.
[18] M. Celenk. A color clustering technique for image segmentation. Graphical Models and Image Processing 52 (3) (1990) 145-170.
[19] S. K. Pal. Image segmentation using fuzzy correlation. Information Science 62 (1992) 223-250.
[20] D. N. Chun and H. S. Yang. Robust image segmentation using genetic algorithm with fuzzy measure. Pattern Recognition 29 (7) (1996) 1195-1211.
[21] Mean Shift color image segmenter. http://www.caip.rutgers.edu/~meer/RIUL/uploads.html
[22] The Graphics File Formats Page. http://www.dcs.ed.ac.uk/home/mxr/gfx/2d-hi.html
[23] PNG documentation. http://www.libpng.org/pub/png/pngdocs.html
[24] R. Fattal, D. Lischinski and M. Werman. Gradient domain high dynamic range compression. ACM Transactions on Graphics (Proc. ACM SIGGRAPH 2002), July 2002.