Image-Based Surface Detail Transfer

Ying Shan, Sarnoff Corporation, [email protected]
Zicheng Liu, Microsoft Research, [email protected]
Zhengyou Zhang, Microsoft Research, [email protected]

∗ Electronic version available: http://research.microsoft.com/~zliu
Abstract

We present a novel technique, called Image-Based Surface Detail Transfer, to transfer geometric details from one surface to another with simple 2D image operations. The basic observation is that, without knowing its 3D geometry, geometric details (local deformations) can be extracted from a single image of an object in a way independent of its surface reflectance, and furthermore, these geometric details can be transferred to modify the appearance of other objects directly in images. We show examples including surface detail transfer between real objects, as well as between real and synthesized objects.

1 Introduction

1.1 Overview

Changing the appearance of an object by adding geometric details is desirable in many real-world applications. For example, we may want to know what a wall might look like after adding some geometric bumps to it, or what a person might look like after adding or reducing wrinkles on his or her face. A direct method for adding geometric details to an object requires modeling both the object and the surface details. It is usually not trivial to build a 3D model of a real object, and it is tedious and labor intensive to model and create surface details with existing geometric modeling tools. Bump mapping [2] has been used as an alternative way to add geometric details to an otherwise smooth object, but constructing visually interesting bump maps requires practice and artistic skill. Computer vision techniques have been very helpful for modeling real-world objects as well as their surface details. These techniques include laser scanning, stereo algorithms, shape from lighting variation [6, 14], shape from shading [8, 7], etc. However, some of these techniques require specialized equipment, many require at least two images for each object, and it may be difficult to capture high-resolution geometric details robustly. Although shape from shading requires only a single image, it usually requires knowledge of the lighting condition and the reflectance functions. We observe that in cases where we are only interested in transferring geometric details from one object to another, it may not be necessary to explicitly compute 3D structure. In particular, we present a novel technique to capture the geometric details of an object from a single image in a way that is independent of its reflectance property. The captured geometric details can then be transferred to another surface to produce the appearance of the new surface with added geometric details while its reflectance property is preserved. The advantage of our method is that it is simple to implement, reliable, and requires only a single image for each object.

1.2 Related work

Our surface detail transfer method is image-based. The idea of changing object appearance with only image information has been explored by various other researchers in both the computer vision and graphics communities. Given a face under two different lighting conditions and another face under the first lighting condition, Riklin-Raviv and Shashua [13] used an image ratio technique (called the quotient image) to generate an image of the second face under the second lighting condition. Stoschek [15] combined this technique with image morphing to re-render a face under continuously varying pose or lighting. Marschner and Greenberg [11] used image ratios between synthesized image pairs under the old and new lighting conditions to modify photographs taken under the old lighting condition into photographs under the new lighting condition. In a similar spirit, Debevec [4] used the color difference between synthesized image pairs, with and without a synthetic object, to modify the original photograph. Burson and Schneider [3] computed the difference of the aligned images of a young face and an old face; given the image of a new person's face to be aged, the difference image is warped and added to the new face to make it look older. One problem with this technique is that the difference image contains the skin color information of the original two faces, so the skin color of the new face may be modified by the aging process (dark skin becomes light, etc.). Liu et al. [10] used the image ratio between a neutral face and an expression face of the same person (called the expression ratio image) to modify a different person's neutral face image and generate facial expression details. Our work is closely related to [10] and [3] in that all three deal with surface deformations. However, our method differs from these two works, and from all the related work mentioned above, in that we require only one source image. Our key observation is that smoothing in the image domain corresponds to smoothing in the geometric domain when the surface reflectance is smooth. This point is detailed mathematically in Section 2.

1.3 Paper organization

The remainder of this paper is organized as follows. We review related work on image ratios in Section 1.2. We describe the technique of surface detail transfer in Section 2. The implementation is discussed in Section 3. Results are shown in Section 4. Finally, we conclude in Section 5 with a discussion of the limitations of our method and future directions.

2 Image-Based Surface Detail Transfer

In this section, we describe the technique of Image-Based Surface Detail Transfer (IBSDT), which transfers geometric details between the images of two surfaces without their 3D information.

2.1 Notation and Problem Statement

For any point $P$ on a surface $S$, let $n(P)$ denote its normal. Assume there are $m$ point light sources. Let $l_i(P)$, $1 \le i \le m$, denote the light direction from $P$ to the $i$-th light source, and $l_i$ its intensity. Suppose the surface is diffuse, and let $\rho(P)$ be its reflectance coefficient at $P$. Under the Lambertian model, the recorded intensity of point $P$ in the image $I$ is

$$I(p) = \rho(P) \sum_{i=1}^{m} l_i\, n(P) \cdot l_i(P) \qquad (1)$$

where $p = C(P)$ is the 2D projection of $P$ onto the image, and $C(\cdot)$ is the camera projection function. Two surfaces $S_1$ and $S_2$ are said to be aligned if there exists a one-to-one mapping $F$ such that for all $P_1 \in S_1$ and $P_2 = F(P_1) \in S_2$,

$$\|P_1 - P_2\| \le \epsilon \qquad (2)$$

where $\epsilon$ is a small positive number, and furthermore, there exist neighborhoods $\Theta(P_1)$ of $P_1$ and $\Theta(P_2)$ of $P_2$ such that

$$\|\bar{n}(P_1) - \bar{n}(P_2)\| \le \delta \qquad (3)$$

where $\delta$ is a small positive number, and $\bar{n}(P_1)$ and $\bar{n}(P_2)$ are the mean normals defined in the neighborhoods $\Theta(P_1)$ and $\Theta(P_2)$, respectively.

The problem can then be stated as follows. Given images $I_1$ and $I_2$ of two aligned surfaces $S_1$ and $S_2$, respectively, what is the new image $\tilde{I}_2$ of $S_2$ if we modify its surface normals such that

$$\tilde{n}_2(P_2) = n_1(P_1) \qquad (4)$$

where $P_1$ and $P_2$ are the corresponding points defined by the mapping $F$?
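To make the imaging model concrete, here is a minimal Python sketch (our illustration, not code from the paper) that evaluates Eq. (1) for distant point light sources, using a synthetic height field as a stand-in for real normal data:

```python
import numpy as np

def lambertian_image(albedo, normals, light_dirs, light_intensities):
    """Evaluate Eq. (1): I(p) = rho(P) * sum_i l_i * (n(P) . l_i(P)).

    albedo:            (H, W) reflectance coefficients rho(P)
    normals:           (H, W, 3) unit surface normals n(P)
    light_dirs:        (m, 3) unit light directions; lights are assumed
                       distant, so l_i(P) is the same for every P
    light_intensities: (m,) intensities l_i
    """
    image = np.zeros(albedo.shape)
    for direction, intensity in zip(light_dirs, light_intensities):
        # n(P) . l_i, clamped at zero for points facing away from the light
        image += intensity * np.clip(normals @ direction, 0.0, None)
    return albedo * image

# Toy usage: a gently bumpy plane lit by one overhead light.
H, W = 64, 64
height = np.random.default_rng(0).normal(scale=0.05, size=(H, W))
gy, gx = np.gradient(height)
normals = np.dstack([-gx, -gy, np.ones((H, W))])
normals /= np.linalg.norm(normals, axis=2, keepdims=True)
I = lambertian_image(np.full((H, W), 0.8), normals,
                     np.array([[0.0, 0.0, 1.0]]), np.array([1.0]))
```

Shape-from-shading methods attempt to invert this map from $I$ back to $n$; the point of IBSDT is that the transfer below never needs that inversion.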

2.2 A Geometric Viewpoint

The following discussion assumes a single point light source to simplify the derivation; the extension to multiple light sources is straightforward. Because the distance between $P_1$ and $P_2$ is small according to Eq. (2), it is reasonable to assume that the light is far enough away that $\epsilon \ll d_l$, where $d_l$ is the average distance from the light to the points. This leads to the approximation $l(P_1) \approx l(P_2)$. From Eqs. (1) and (4), we then have

$$\begin{aligned} \tilde{I}_2(p_2) &\equiv \rho(P_2)\, l\, \tilde{n}_2(P_2) \cdot l(P_2) \\ &= \rho(P_2)\, l\, n_1(P_1) \cdot l(P_2) \\ &\approx \frac{\rho(P_2)}{\rho(P_1)}\, \rho(P_1)\, l\, n_1(P_1) \cdot l(P_1) \end{aligned} \qquad (5)$$

where $\rho$ has the same meaning as in Eq. (1), $p_1 = C_1(P_1)$, $p_2 = C_2(P_2)$, and $I_1$, $I_2$, and $\tilde{I}_2$ have the same meaning as in the problem statement. Notice that the $C(\cdot)$ functions are different for the two surfaces, because the images $I_1$ and $I_2$ could be taken by two different cameras. This leads to

$$\tilde{I}_2(p_2) \approx \frac{\rho(P_2)}{\rho(P_1)}\, I_1(p_1) \qquad (6)$$

In order to compute the ratio of $\rho(P_1)$ and $\rho(P_2)$, let us define the smoothed image of $I$ as

$$\bar{I}(p) \equiv \sum_{q \in \Omega(p)} w(q)\, I(q) \qquad (7)$$

where $\Omega(p) = C(\Theta(P))$ is the neighborhood of $p$, and $w(\cdot)$ is the kernel function of a smoothing filter, say, a Gaussian filter or an averaging filter. Assuming that the size of $\Theta(P)$ is relatively small compared with its distance to the light source, we have $l(P) \approx l(Q)$ for all $Q \in \Theta(P)$. Also assuming that $\rho(P) \approx \rho(Q)$ for all $Q \in \Theta(P)$, it follows from Eqs. (7) and (1) that

$$\bar{I}(p) \approx \rho(P)\, l \left( \sum_{Q \in \Theta(P)} w(C(Q))\, n(Q) \right) \cdot l(P) \qquad (8)$$

where $\sum_{Q \in \Theta(P)} w(C(Q))\, n(Q) = \bar{n}(P)$ is the mean normal mentioned in the problem statement. For surfaces $S_1$ and $S_2$, we then have

$$\frac{\bar{I}_2(p_2)}{\bar{I}_1(p_1)} \approx \frac{\rho(P_2)\, l\, \bar{n}(P_2) \cdot l(P_2)}{\rho(P_1)\, l\, \bar{n}(P_1) \cdot l(P_1)} \qquad (9)$$

Since the two surfaces are aligned, we have $l(P_1) \approx l(P_2)$ and $\bar{n}(P_2) \approx \bar{n}(P_1)$. Equation (9) can then be rewritten as

$$\frac{\rho(P_2)}{\rho(P_1)} \approx \frac{\bar{I}_2(p_2)}{\bar{I}_1(p_1)} \qquad (10)$$

Substituting Eq. (10) into Eq. (6) leads to

$$\tilde{I}_2(p_2) \approx \frac{I_1(p_1)}{\bar{I}_1(p_1)}\, \bar{I}_2(p_2) \qquad (11)$$

Eq. (11) shows that the transfer of surface normals can be approximated by simple operations on the images of the surfaces.
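Since Eq. (11) involves only per-pixel ratios and smoothing, it can be implemented in a few lines. The sketch below assumes the two grayscale images are already aligned and uses a Gaussian kernel for $w(\cdot)$; the eps guard and the clipping to [0, 1] are our additions to keep the output a valid image, not part of the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def surface_detail_transfer(I1, I2, sigma, eps=1e-6):
    """Eq. (11): I2_new(p) = I1(p) / I1_bar(p) * I2_bar(p).

    I1, I2: aligned grayscale images as float arrays in [0, 1]
    sigma:  Gaussian smoothing scale, which sets the scale of the
            geometric details transferred from I1 to I2
    eps:    guard against division by zero in dark regions (our addition)
    """
    I1_bar = gaussian_filter(I1, sigma)  # smoothed source image, Eq. (7)
    I2_bar = gaussian_filter(I2, sigma)  # smoothed target image, Eq. (7)
    return np.clip(I1 / (I1_bar + eps) * I2_bar, 0.0, 1.0)
```

The paper does not spell out the treatment of color images; applying the formula independently per channel is the obvious choice, but treat that as an implementation detail rather than part of the method.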

2.3 An Intuitive Signal Processing Viewpoint

We now rewrite Eq. (11) as

$$\tilde{I}_2(p) \approx \frac{I_1(p)}{\bar{I}_1(p)}\, \bar{I}_2(p) \equiv \left( 1 + \frac{I_1(p) - \bar{I}_1(p)}{\bar{I}_1(p)} \right) \bar{I}_2(p) \qquad (12)$$

From a signal processing viewpoint, Eq. (12) simply substitutes the high frequency components of $I_2$ with those from $I_1$. The high frequency component $I_1 - \bar{I}_1$ of $I_1$ is normalized by $\bar{I}_1$ in order to cancel the intensity scale difference between the low frequency components of $I_2$ and $I_1$. In general, $I_1$ could be any image, regardless of the conditions given in the previous section, but the resulting image could be meaningless because of the inconsistency between the detail components transferred from $I_1$ and the native low frequency components of $I_2$. This happens when $I_1$ and $I_2$ are images of two surfaces that are not aligned.
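The ratio form of Eq. (11) and the high-frequency form of Eq. (12) are algebraically identical; the following quick numerical check (our own illustration, with arbitrary random images) makes the substitution reading explicit:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)
I1 = rng.uniform(0.2, 1.0, size=(32, 32))  # arbitrary source image
I2 = rng.uniform(0.2, 1.0, size=(32, 32))  # arbitrary target image
I1_bar = gaussian_filter(I1, 3.0)
I2_bar = gaussian_filter(I2, 3.0)

ratio_form = I1 / I1_bar * I2_bar        # Eq. (11)
detail = (I1 - I1_bar) / I1_bar          # normalized high frequencies of I1
hf_form = (1.0 + detail) * I2_bar        # Eq. (12)
assert np.allclose(ratio_form, hf_form)  # identical up to rounding
```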

3 Implementation

Given images $I_1$ and $I_2$ of similar shapes, we first need to align the two images to perform surface detail transfer. For simple geometric shapes such as rectangles and spheres, we usually only need to perform global transformations including rotation, translation, and scaling. For more complicated shapes such as human faces, we first manually place markers on the boundaries and the feature points, and then obtain pixel alignment through image warping [16, 1, 9]. In our implementation, we use a simple triangulation-based image warping method. Once the alignment is done, we run a Gaussian filter with a user-specified $\sigma$ on $I_1$ and $I_2$ to obtain $\bar{I}_1$ and $\bar{I}_2$. Finally, we apply Equation (11) to obtain $\tilde{I}_2$. Intuitively, the $\sigma$ of the Gaussian filter controls how much geometric smoothing we perform on the surface in $I_1$, so it determines the scale of the surface details to be transferred: a small $\sigma$ transfers fine geometric details, while a large $\sigma$ transfers only large-scale geometric deformations. A sketch of this pipeline appears below.
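A minimal sketch of the pipeline, reusing the Eq. (11) transfer from Section 2. We stand in for the global alignment with scipy.ndimage rotation and translation, since the paper's triangulation-based warp is not given in code; the loader and filenames in the usage comment are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate, shift

def align_global(src, angle_deg=0.0, offset=(0.0, 0.0)):
    """Rough global alignment (rotation plus translation) for simple
    shapes; scaling would additionally need a resize and crop/pad, and
    faces need the marker-based triangulation warp described above."""
    out = rotate(src, angle_deg, reshape=False, mode="nearest")
    return shift(out, offset, mode="nearest")

def ibsdt(I1, I2, sigma, eps=1e-6):
    """Eq. (11) on pre-aligned images (same as the Section 2 sketch)."""
    return np.clip(I1 / (gaussian_filter(I1, sigma) + eps)
                   * gaussian_filter(I2, sigma), 0.0, 1.0)

# Hypothetical usage with placeholder images:
# I1 = load_gray("orange.png")      # source photograph (hypothetical loader)
# I2 = load_gray("nectarine.png")   # target photograph
# result = ibsdt(align_global(I1, angle_deg=2.0), I2, sigma=8)
```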

4 Results

Figure 1 shows the results of transferring the geometric details of a synthetic sphere to a nectarine. The bumps on the synthetic sphere are generated with bump mapping, and the surface reflectance of the sphere is set to be uniform. We put a point light source on top of the sphere so that its lighting condition is close to that of the nectarine. The bumps on the synthetic sphere transfer nicely to the nectarine, except at the bottom where the synthetic sphere is essentially dark. The images are 614 by 614 pixels, and $\sigma$ is 8.

Figure 2 shows the results of transferring the geometric details of a real orange to the same nectarine as in Figure 1. The bumps on the orange are transferred faithfully to the nectarine. The image dimensions and $\sigma$ are the same as in Figure 1. This example also reveals a limitation of our algorithm: the highlights on the orange are transferred to the nectarine, because the highlights are treated as if they were caused by geometric variations.

Figure 3 shows the results of transferring the geometric details of a tissue to a synthetic rectangle. Only the geometric bumps on the tissue are transferred to the rectangle, while the material color of the rectangle is preserved. Figure 4 shows the results of transferring geometric detail from the same tissue to the image of a piece of wood; both pictures were taken under the same lighting conditions. The small bumps on the tissue are transferred to the wood while the wood texture is preserved. Figure 5 shows the result of transferring the geometric details of the same tissue to a table surface, which has a different texture pattern than the wood in Figure 4. It is interesting to compare the results (the images on the right) in Figure 4 with those in Figure 5, and notice that they have the same geometric bumps but different material properties.

One interesting application of IBSDT is aging effect synthesis. Geometrically, the difference between an old person's skin and a young person's skin is that the old person's skin has more bumps. If we transfer the bumps of an old person's skin to a young person's face, the young person's face becomes bumpy and looks older. Conversely, we can replace the bumps of an old person's skin with those of a young person's face, so that the old person's face becomes smoother and looks younger. We can therefore apply the surface detail transfer technique of the previous section to human faces to generate aging effects. The alignment is done by first marking face boundaries and face features such as the eyes, nose, and mouth, and then using triangulation-based image warping to warp $I_1$ toward $I_2$. We apply IBSDT only to pixels inside the face boundary; in addition, the pixels in the regions of the two brows, the two eyeballs, the nose tip, and the mouth are not modified by IBSDT (see the masked sketch below).

Figure 6 shows the aging effect synthesis results between the faces of a young male (a) and an old male (d). For each face, we experiment with different values of $\sigma$ for the Gaussian filter during the surface detail transfer. The images in the middle ((b) and (e)) are the results with $\sigma = 3$, and those on the right ((c) and (f)) with $\sigma = 8$. Varying $\sigma$ produces reasonable in-between aging effects such as (b) and (e). Surface detail transfer obviously plays an important role when making a young person look older; it is less apparent why it is necessary for making an old person look younger. To clarify this point, we simply smooth Fig. 6(d) without transferring surface details from Fig. 6(a), while masking out the facial features as before. Figure 7 shows the results with $\sigma = 3$ (left image) and $\sigma = 8$ (right image). Compared with the images in Fig. 6(e) and (f) at the same $\sigma$ values, the images in Fig. 7 are much less sharp and convincing.
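A sketch of the masked application (our illustration; the manual feature marking and warping steps are assumed to have already produced $I_1$ aligned to $I_2$ and a boolean mask):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ibsdt_masked(I1, I2, mask, sigma, eps=1e-6):
    """Apply Eq. (11) only where mask is True (e.g., inside the face
    boundary but excluding the brows, eyeballs, nose tip, and mouth);
    elsewhere the target image I2 is left untouched.

    I1:   source face image, already warped toward I2
    I2:   target face image
    mask: boolean array marking pixels eligible for detail transfer
    """
    transferred = I1 / (gaussian_filter(I1, sigma) + eps) * gaussian_filter(I2, sigma)
    return np.where(mask, np.clip(transferred, 0.0, 1.0), I2)
```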

Figure 1: Left: synthetic sphere with many small bumps generated by bump mapping. Middle: photograph of a nectarine. Right: the synthesized image obtained by transferring the geometrical details of the sphere to the nectarine.

Figure 2: Left: photograph of an orange. Middle: photograph of a nectarine. Right: the synthesized image obtained by transferring the geometrical details of the orange to the nectarine.

Figure 3: Left: photograph of a tissue. Middle: synthesized image of a rectangle. Right: the image obtained by transferring the geometrical details of the tissue to the rectangle.

Figure 4: Left: photograph of a tissue. Middle: image of a piece of wood. Right: the synthesized image obtained by transferring the geometrical details of the tissue to the wood.

Figure 5: Left: image of a tissue. Middle: image of a table surface. Right: the synthesized image obtained by transferring the geometrical details of the tissue to the table surface.

Figure 6: Young adult vs. senior adult. (a) The face of a young adult. (b) The simulated old face of (a) with a small σ. (c) The simulated old face of (a) with a large σ. (d) The face of a senior adult. (e) The simulated young face of (d) with a small σ. (f) The simulated young face of (d) with a large σ.

Figure 7: Senior adult to young adult without IBSDT. Left: result with σ = 3. Right: result with σ = 8. The input face image is the same as Fig. 6(d).

5 Conclusion and Future Directions

We have developed a technique called Image-Based Surface Detail Transfer (IBSDT) to transfer geometric details from one surface to another without knowing the actual geometry of the surfaces. This technique is particularly useful for adding geometric details to a real-world object for which only a single image is available. It also provides a simple way to capture the geometric details of a real-world object and apply them to other synthetic or real-world objects.

One limitation of this method is that it requires the lighting conditions of the two images to be similar. For images taken under completely different lighting conditions, one may use relighting techniques such as those reported in [13, 5, 12]. Another limitation is that it assumes the surface reflectance is smooth. For objects with abrupt reflectance changes, such as small color spots, our algorithm may confuse the color spots with geometric details. It may be possible to separate such color variations from geometric variations, perhaps through learning or other approaches; we plan to pursue this in the future.

References

[1] T. Beier and S. Neely. Feature-based image metamorphosis. In Computer Graphics, pages 35–42. SIGGRAPH, July 1992.

[2] J. Blinn. Models of light reflection for computer synthesized pictures. In Computer Graphics, pages 192–198. SIGGRAPH, July 1977.

[3] N. Burson and T. D. Schneider. Method and apparatus for producing an image of a person's face at a different age. United States Patent 4276570, 1981.

[4] P. E. Debevec. Rendering synthetic objects into real scenes: Bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Computer Graphics, Annual Conference Series, pages 189–198. SIGGRAPH, July 1998.

[5] P. E. Debevec, T. Hawkins, C. Tchou, H.-P. Duiker, W. Sarokin, and M. Sagar. Acquiring the reflectance field of a human face. In Computer Graphics, Annual Conference Series, pages 145–156. SIGGRAPH, July 2000.

[6] R. Epstein, A. Yuille, and P. Belhumeur. Learning object representations from lighting variations. In ECCV 96 International Workshop, pages 179–199, 1996.

[7] G. Healey and T. Binford. Local shape from specularity. Computer Vision, Graphics, and Image Processing, pages 62–86, April 1988.

[8] B. Horn and M. J. Brooks. Shape from Shading. MIT Press, 1989.

[9] P. Litwinowicz and L. Williams. Animating images with drawings. In Computer Graphics, pages 235–242. SIGGRAPH, August 1990.

[10] Z. Liu, Y. Shan, and Z. Zhang. Expressive expression mapping with ratio images. In Computer Graphics, Annual Conference Series, pages 271–276. SIGGRAPH, August 2001.

[11] S. R. Marschner and D. P. Greenberg. Inverse lighting for photography. In IS&T/SID Fifth Color Imaging Conference, November 1997.

[12] S. R. Marschner, B. Guenter, and S. Raghupathy. Modeling and rendering for realistic facial animation. In Rendering Techniques, pages 231–242. Springer Wien New York, 2000.

[13] T. Riklin-Raviv and A. Shashua. The quotient image: Class based re-rendering and recognition with varying illuminations. In IEEE Conference on Computer Vision and Pattern Recognition, pages 566–571, June 1999.

[14] H. Rushmeier, G. Taubin, and A. Gueziec. Applying shape from lighting variation to bump map capture. In Eurographics Workshop on Rendering, pages 35–44, 1997.

[15] A. Stoschek. Image-based re-rendering of faces for continuous pose and illumination directions. In IEEE Conference on Computer Vision and Pattern Recognition, pages 582–587, 2000.

[16] G. Wolberg. Digital Image Warping. IEEE Computer Society Press, 1990.
