4D Geometry Compression based on Lifting Wavelet Transform

Proceedings of Industrial Engineering Research Conference (IERC’06), May 20-24, 2006, Orlando, Florida, Paper No. 1769
Author: Justin Woods

Yan Wang and Heba Hamza
NSF Center for e-Design, University of Central Florida, Orlando, FL 32816-2993, USA

Abstract
3D geometry is one of the most extensively used types of data in military, engineering, and multimedia communication. In a distributed environment, efficient visual data exchange is desirable for real-time collaboration. Geometry compression is an effective way to distribute high-volume geometry data within limited bandwidth and storage capacity. In this paper, a new time-varying 3D geometry compression method based on 4D lifting wavelet transform is presented. In this hybrid approach, geometric information and animation are compressed based on volume grid values. Surfaces are reconstructed from decompressed grid values. With rescaling and integer-to-integer lifting, the compression ratio is significantly increased without compromising the quality of surfaces.

Keywords: 3D/4D Geometry Compression, Discrete Wavelet Transform, Integer-to-Integer Lifting, Isosurface

1 INTRODUCTION

3D geometry is one of the most extensively used types of data in military, engineering, and multimedia communication. Examples include terrain surfaces (roads, mountains), artifacts (buildings, machines, devices, vehicles), and natural objects (humans, animals). In collaborative environments such as joint tasks and remote exploration, large amounts of 3D data need to be stored locally and transmitted over networks. Given limited resources, this calls for faster transmission and reduced storage space.

Compared to still images, audio, and video, which are widely used in compressed formats, 3D geometry data compression is relatively new. There has been extensive research on static 3D geometry compression in the past decade [1, 2]. Research results have been included in emerging standards such as binary VRML and MPEG-4 [3]. However, only a few studies focus on dynamic geometry that changes over time. Just as 2D video complements 2D images, time-dependent 3D geometry can be regarded as 3D video and has great potential in various applications, including communication, entertainment, scientific and medical computing, computer-aided design and engineering, as well as simulation and visualization. One can expect that compressed 3D video formats will be standardized in the future and used as commonly as today’s audio and video.

In general, there are two approaches to geometry compression: mesh-oriented and image-oriented. In the mesh-oriented approach, both geometry (vertex coordinates in Euclidean space) and topology (connectivity among vertices) information is compressed. Accurate 3D meshes can be reconstructed, which is ideal for engineering applications. However, the connectivity of the mesh cannot be changed, which restricts its use in general applications. In image-oriented approaches, topology information is not considered. Volumetric geometry information is represented by voxels or points, and shapes can be compressed with methods similar to those used for 2D images. The dynamics of topology can be captured easily, and the compression methods are general. However, rendering a visually recognizable and appealing surface requires a large amount of data. A balanced approach between these two ends may offer a general solution with acceptable performance.

In this paper, a new time-varying 3D geometry compression method based on 4D discrete wavelet transform (DWT) is presented. This method focuses on dynamic volume geometry compression with isosurface construction. Topology is not coded as in mesh compression, since in many cases the connectivity exists purely for rendering purposes and is not as essential as geometry information. Isosurface representation can reduce the data size for surface boundary reconstruction. With compressed geometry as well as related surface information such as normal directions and colors, relatively low-density volume information is sufficient to reconstruct surfaces for visualization.

In the rest of the paper, Section 2 gives an overview of related work. Section 3 presents the 4D volume compression scheme based on 4D lifted DWT, where both floating-point and integer lifting are compared.

2 BACKGROUND

As computing power grows, 3D animation has become an important component of visual communication, and 3D animation compression naturally attracts attention. There are two approaches to 3D animation compression. In the first approach, topology is assumed to be static, and there is little or no change in connectivity. In the second approach, topology may change arbitrarily between frames. Lengyel [4] first proposed a surface mesh compression algorithm based on encoding motion parameters. Subsequently, different dynamic mesh compression methods with little or no topology change have been proposed, including approaches based on principal component analysis (PCA) [5, 6, 7, 8], spatio-temporal prediction [9, 10], motion coding [11], clustering [12, 13], as well as wavelet coding [14, 15]. Compression of 3D volume animation with dynamic topology has been achieved by encoding quantized voxels [16], spatio-temporal coupling [17, 18], and wavelet transform [19, 20].

Different from the above coding methods, we propose a new 3D animation compression scheme combining volumetric data and isosurface construction. A 4D wavelet transform is used to compress volumetric data, which considers spatial and temporal coherence simultaneously. Isosurfaces are constructed from low-density volumetric data to either improve visual effects or reconstruct surfaces of natural and man-made objects. The intention is to create a generic 3D video coding mechanism to support a diverse range of applications.

3 PROPOSED COMPRESSION SCHEME

[Figure 1 diagram: blocks labeled Rescaling & Quantization, LWT, Encoding, Internet / Intranet, Decoding, and Isosurface Construction, with subbands marked High and Low]

The volume data to be compressed is assumed to be regularly sampled, as is common in scientific and medical visualization. Isosurface construction based on volume data can provide realistic rendering and can also be applied in other visualization environments with artifacts and natural objects. The framework we propose is based on the combination of volumetric data and isosurfaces, as illustrated in Figure 1. 4D volumetric data is divided into groups of frames (GOFs). Each GOF is decomposed and compressed with the 4D lifted wavelet transform at the server side. When received by the client, the compressed data is decoded and decompressed. Isosurfaces are constructed from the decompressed 4D volume data. The advantages of this wavelet-based approach include scalability and simultaneous redundancy reduction in the spatial and time domains. With their inherently scalable representation, wavelets provide a multi-resolution solution at no extra cost. Wavelets capture the local coherence of the spatial and temporal domains. There is no need to divide intraframe data into blocks, as in the MPEG-2 standard based on the discrete cosine transform (DCT).
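The grouping of frames into GOFs described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_into_gofs` and the GOF size of 16 are assumptions, and frames are stood in for by their frame numbers.

```python
from typing import Iterator, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def split_into_gofs(frames: Sequence[Frame], gof_size: int) -> Iterator[List[Frame]]:
    """Yield consecutive groups of frames (GOFs) of at most `gof_size` frames.

    The GOF size is a free parameter here; the paper does not state one.
    """
    if gof_size <= 0:
        raise ValueError("gof_size must be positive")
    for start in range(0, len(frames), gof_size):
        yield list(frames[start:start + gof_size])


# Hypothetical animation of 64 frames, represented only by frame numbers.
frame_numbers = list(range(1, 65))
gofs = list(split_into_gofs(frame_numbers, 16))
```

Each GOF would then be transformed independently. The last group may be shorter than the others when the frame count is not a multiple of the GOF size, which a real coder would need to handle, for example by padding.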

[Figure 1 diagram, continued: Server, ILWT, Client, with subbands marked Low]

Figure 1. 4D volume animation scheme

3.1 4D Lifting Wavelet Transform

The lifting scheme [21] is also called the second-generation wavelet transform. It is a two-step filtering process: prediction and update. In the prediction step, even sequences are used to predict odd sequences. The prediction error forms the corresponding high-pass subband. In the update step, an approximation subband is obtained by updating even sequences with the scaled high-pass subband samples, which forms a low-pass subband. The main advantage of lifting is its memory efficiency in computation. Different from traditional traversal DWT, wavelet coefficient calculation in the lifting scheme can be performed in place. The backward transform is easy to find and has the same complexity as the forward transform. A lossless integer-to-integer transform [22] can also be achieved. The two-step lifting transform can be generally described as

$$h_k[x] = f_{2k+1}[x] + \sum_i p_i \, f_{2(k-i)}[x]$$
$$l_k[x] = f_{2k}[x] + \sum_j u_j \, h_{k-j}[x] \qquad (1)$$

where $f_k[x]$ is the sequence of input data to be processed, $h_k$ and $l_k$ are the resulting high-pass and low-pass sequences respectively, and $p_i$ and $u_j$ are the prediction and update filter coefficients respectively.

An important issue associated with spatio-temporal decomposition is the choice of filters. Different filters exhibit varied signal characteristics in terms of energy compaction in the transform domain and coding gain. Long filters tend to exploit coherence over large regions or long periods of time. However, they may blur boundaries of occupation or movement.

Two decomposition approaches can be taken for the 4D wavelet transform. In dyadic decomposition, the wavelet transform is applied in the spatial X, Y, Z and time T dimensions alternately. In decoupled decomposition, the transform is first applied in cascade along the T dimension, then in the X, Y, and Z dimensions alternately. A dyadic decomposition approach is taken in this paper since it considers the coherence between time and space simultaneously.

An example is used to illustrate the visual effect of compression. Figure 2 shows a few frames of the original 3D animation of a polymer morphology change simulation, during which significant topology change is observed. A 2-level decomposition process is applied with the Haar lifting scheme. Compression is achieved by setting to zero the transformed coefficients whose values fall below a threshold. The highest coefficient magnitude in this example is M=89.944. If the threshold T is set to 0.11% of M, the decompressed surfaces are as shown in Figure 3. With a 7.25:1 compression ratio, the decompressed surfaces do not exhibit significant visual difference. As the threshold T increases to 0.55%, the reconstructed surfaces are as shown in Figure 4, with a 27.4:1 compression ratio. An interesting “voxel” effect occurs on the surfaces because the grid neighborhood with the same isovalues expands. One way to avoid the voxel effect is to reduce the resolution of surfaces by decreasing the density of grids at the client side.
Figure 5 shows the surfaces reconstructed from the same data as Figure 4 but with the grid density reduced by one half. Lower-resolution but smoother isosurfaces are rebuilt. The second way to reduce the voxel effect is to take advantage of the inherent multi-resolution property of the wavelet transform and transmit low-resolution data decomposed with multiple scales from the server side. If the previous 2-level data is compressed and decompressed at only one level, a compression ratio of 66:1 can be achieved. Figure 6 shows the reconstructed surfaces, in which each even frame is simply a copy of the odd frame immediately before it.
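The predict and update steps of the two-step lifting transform, specialized to the Haar filter (prediction coefficient -1, update coefficient 0.5), together with coefficient thresholding, can be sketched in one dimension as below. This is a minimal sketch under stated assumptions rather than the authors' code: all function names are hypothetical, the subbands are left un-normalized, thresholding is assumed to compare coefficient magnitudes against a fraction of the peak magnitude, and the 10% threshold is chosen only so that the effect is visible on an 8-sample toy signal (the paper uses 0.11% and 0.55% on 4D data).

```python
from typing import List, Tuple


def haar_lift_forward(f: List[float]) -> Tuple[List[float], List[float]]:
    """One lifting level with Haar coefficients p_0 = -1 and u_0 = 0.5."""
    if len(f) % 2 != 0:
        raise ValueError("input length must be even")
    even, odd = f[0::2], f[1::2]
    # Predict: the high-pass sample is the error of predicting odd from even.
    h = [o - e for o, e in zip(odd, even)]
    # Update: the low-pass sample is the even sample plus half the error.
    l = [e + 0.5 * d for e, d in zip(even, h)]
    return l, h


def haar_lift_inverse(l: List[float], h: List[float]) -> List[float]:
    """Undo the update step, then the predict step, and interleave."""
    even = [a - 0.5 * d for a, d in zip(l, h)]
    odd = [d + e for d, e in zip(h, even)]
    f = [0.0] * (2 * len(l))
    f[0::2], f[1::2] = even, odd
    return f


def threshold(coeffs: List[float], fraction: float, peak: float) -> List[float]:
    """Zero every coefficient whose magnitude is below fraction * peak."""
    cutoff = fraction * peak
    return [c if abs(c) >= cutoff else 0.0 for c in coeffs]


signal = [4.0, 4.0, 6.0, 6.5, 9.0, 1.0, 2.0, 2.0]
low, high = haar_lift_forward(signal)
peak = max(abs(c) for c in low + high)      # plays the role of M in the text
kept_low = threshold(low, 0.10, peak)
kept_high = threshold(high, 0.10, peak)
approx = haar_lift_inverse(kept_low, kept_high)
```

Without thresholding, the inverse reproduces the input exactly. With it, the small detail coefficient between 6.0 and 6.5 is dropped and that pair is reconstructed as two copies of its mean, which is the 1D analogue of the "voxel" effect. The compression ratios quoted in the text are consistent with the file sizes given in the figure captions: 108,153 / 14,924 ≈ 7.25, 108,153 / 3,944 ≈ 27.4, and 108,153 / 1,636 ≈ 66.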

Figure 2. Original isosurfaces of the animation with a file size of 108,153 KB. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.

Figure 3. Decompressed isosurfaces with a threshold of 0.11% and a file size of 14,924 KB. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.

Figure 4. Decompressed isosurfaces with a threshold of 0.55% and a file size of 3,944 KB. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.

Figure 5. Reconstructed surfaces from data in Figure 4 with 50% grid density. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.
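Reducing the grid density at the client side can be illustrated with a simple subsampling sketch. It assumes that "50% grid density" means keeping every second grid point along each spatial axis, which the paper does not spell out; the function name `halve_grid_density` and the toy volume are hypothetical.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]


def halve_grid_density(volume: Volume) -> Volume:
    """Keep every second sample along z, y, and x (an assumed strategy)."""
    return [[row[::2] for row in plane[::2]] for plane in volume[::2]]


# Toy 4x4x4 volume whose values encode their own grid position.
volume = [[[100.0 * z + 10.0 * y + x for x in range(4)]
           for y in range(4)]
          for z in range(4)]
halved = halve_grid_density(volume)
```

The isosurface would then be extracted from `halved` instead of the full grid, trading resolution for smoother surfaces as described for Figure 5.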

Figure 6. Decompressed smaller scale data with a threshold of 0.55% and a file size of 1,636 KB. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.

3.2 Rescaling and Integer-to-Integer Transform

Compared to the classical wavelet transform, in which transformed wavelet coefficients are floating-point numbers even if the original data are integers, the lifting scheme supports a lossless integer-to-integer transform. It transforms integer data to integer coefficients, and the original integer data can be reconstructed with the inverse transform. Another interesting feature of our volume-based isosurface compression scheme is that isosurface construction is not sensitive to the number of bits used in coding if the range of grid values is large. As a result, floating-point grid values can be rounded to integers and the integer-to-integer lifting wavelet transform can be used to increase the compression ratio without compromising quality. If the range of grid values is too small, the grid values can be rescaled before rounding. A good rescaling strategy is to bring the isosurface values close to zero and to distribute the overall grid values evenly between the positive and negative sides. This reduces the number of bits needed to code the values. In the example of Figure 2, the maximum and minimum grid values are ±7.8959. The grid values are multiplied by 100 before rounding. To reduce distortion, rounding towards zero is used. The result of integer lifting is depicted in Figure 7, where the size of the file containing integer coefficients is 6,318 KB, compared to 61,900 KB for the floating-point coefficients. The quality of the surfaces is very close to that of the original ones.
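The rescale, round, and integer-lift pipeline described above can be sketched as follows. The scale factor of 100 and rounding towards zero come from the text; the specific integer Haar variant (floor of half the difference in the update step, often called the S-transform) is an assumption, since the paper does not state which integer filter was used. Function names and sample grid values other than ±7.8959 are hypothetical.

```python
import math
from typing import List, Tuple


def rescale_to_int(values: List[float], scale: float = 100.0) -> List[int]:
    """Multiply by `scale` and round towards zero, as described in the text."""
    return [math.trunc(v * scale) for v in values]


def int_haar_forward(f: List[int]) -> Tuple[List[int], List[int]]:
    """Integer-to-integer Haar lifting (assumed S-transform variant)."""
    if len(f) % 2 != 0:
        raise ValueError("input length must be even")
    even, odd = f[0::2], f[1::2]
    h = [o - e for o, e in zip(odd, even)]          # predict
    l = [e + (d // 2) for e, d in zip(even, h)]     # update with floor(d / 2)
    return l, h


def int_haar_inverse(l: List[int], h: List[int]) -> List[int]:
    """Exact inverse: the same floor(d / 2) term is subtracted back out."""
    even = [a - (d // 2) for a, d in zip(l, h)]
    odd = [d + e for d, e in zip(h, even)]
    f = [0] * (2 * len(l))
    f[0::2], f[1::2] = even, odd
    return f


grid = [7.8959, -7.8959, 0.126, -0.034, 3.506, 3.497, -1.204, -1.257]
ints = rescale_to_int(grid)
low, high = int_haar_forward(ints)
restored = int_haar_inverse(low, high)
```

The transform itself is lossless on the integers: `restored` equals `ints`. The only loss relative to the floating-point grid comes from the truncation in `rescale_to_int`, which is bounded by one unit of the rescaled range (0.01 in the original units when the scale is 100).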

Figure 7. Decompressed isosurfaces from the integer lifting scheme with a file size of 6,318 KB. Panels: (a) frame 1, (b) frame 17, (c) frame 33, (d) frame 64.

3.3 Motion Compensation

Motion compensation (MC) further removes the temporal redundancy of the video signal in 4D subband coding. Block-based motion models are predominantly used in traditional motion-compensated MPEG coding. They can accurately represent very smooth motion fields but not complex ones. In contrast, a deformable mesh motion model can improve motion compensation by tracking expansions and contractions while sustaining a continuous motion field. Recently, motion-compensated temporal filtering through lifting has proven to be an effective temporal decomposition method in 3D DWT video coders [23, 24]. Motion compensation can increase the signal-to-noise ratio, and thus video quality, since it effectively reduces the energy associated with high-pass temporal subbands. In our application context, the motion of surfaces is represented by value changes of discrete grids, which is more resilient than direct pixel or voxel representation. Thus it is possible to create a general MC scheme. We develop a Control Grid MC model for the lifting transform. With the notation in (1),

$$h_k[x] = f_{2k+1}[x] + \sum_i p_i \, f_{2(k-i)}[MC_{2k \to 2k+1}(x)]$$
$$l_k[x] = f_{2k}[x] + \sum_j u_j \, h_{k-j}[MC_{2k+1 \to 2k}(x)] \qquad (2)$$

is the MC scheme in the spatial domain. In each frame, the 3D space is divided into small cells. The average value within each cell is taken as an element of the motion vectors. For example, the Haar lifting with MC is shown in Figure 8. Odd frames are predicted from even frames with motion vectors before the lifting process by which high-pass temporal subbands are created. Low-pass temporal subbands are generated with inverse motion vectors and lifting. Finally, high-pass and low-pass subbands are normalized.

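[Figure 8 diagram: inputs x_{2k} and x_{2k+1}; motion operator M_{2k→2k+1} and its inverse; lifting coefficients −1 and 0.5; normalization factors; outputs l_k and h_k]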

Figure 8. MC in Haar lifting scheme
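The structure of motion-compensated Haar lifting can be illustrated with a deliberately simplified 1D sketch. It does not implement the Control Grid model: as a stand-in, motion is assumed to be a single integer circular shift applied to the whole frame, and the final normalization step is omitted. All names are hypothetical. The point is only to show that the even frame is warped forward before prediction, the high-pass frame is warped back before the update, and the transform stays invertible for any shift.

```python
from typing import List, Tuple


def warp(frame: List[float], shift: int) -> List[float]:
    """Circularly move every sample `shift` positions (assumed motion model)."""
    n = len(frame)
    return [frame[(x - shift) % n] for x in range(n)]


def mc_haar_forward(even: List[float], odd: List[float],
                    shift: int) -> Tuple[List[float], List[float]]:
    """Haar lifting between two frames with motion compensation."""
    predicted = warp(even, shift)                    # MC from frame 2k to 2k+1
    h = [o - p for o, p in zip(odd, predicted)]      # predict, p_0 = -1
    back = warp(h, -shift)                           # MC from frame 2k+1 to 2k
    l = [e + 0.5 * b for e, b in zip(even, back)]    # update, u_0 = 0.5
    return l, h


def mc_haar_inverse(l: List[float], h: List[float],
                    shift: int) -> Tuple[List[float], List[float]]:
    """Undo the update, then the prediction, using the same shift."""
    back = warp(h, -shift)
    even = [a - 0.5 * b for a, b in zip(l, back)]
    predicted = warp(even, shift)
    odd = [d + p for d, p in zip(h, predicted)]
    return even, odd


frame_even = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0, 0.0]
frame_odd = warp(frame_even, 2)                      # pure translation by 2
low_mc, high_mc = mc_haar_forward(frame_even, frame_odd, 2)
low_plain, high_plain = mc_haar_forward(frame_even, frame_odd, 0)
```

When the assumed shift matches the true motion, the high-pass frame `high_mc` is identically zero, whereas the uncompensated `high_plain` carries substantial energy. This is the energy reduction in the high-pass temporal subband that the section attributes to motion compensation.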

4 CONCLUSION AND FUTURE WORK

In this paper, a new time-varying 3D geometry compression scheme based on 4D lifting wavelet transform is presented. It is demonstrated that a hybrid approach with volume grid values and isosurfaces is feasible for 3D geometry compression. Geometric information and animation are compressed based on volume grid values. Surfaces are reconstructed from grid values and isovalues. Rescaling and integer-to-integer lifting show a significantly improved compression ratio without compromising the quality of surfaces. A control grid motion compensation model for lifting is also developed. The proposed 4D geometry compression can be used in general applications such as scientific computing and visualization, collaborative engineering, modeling and simulation, teleconferencing, and entertainment. Future work includes efficient isosurface construction, content-based motion compensation, as well as filter selection.

REFERENCES
[1] P. Alliez and C. Gotsman, “Recent advances in compression of 3D meshes,” in Proc. Symp. on Multiresolution in Geometric Modeling, 2003
[2] J. Peng, C.-S. Kim, and C.-C. J. Kuo, “Technologies for 3D mesh compression: A survey,” J. Visual Communication & Image Representation, vol.16, pp.688-733, 2005
[3] G. Taubin, W. Horn, F. Lazarus, and J. Rossignac, “Geometry coding and VRML,” Proc. the IEEE, vol.86, no.6, pp.1228-1243, 1998
[4] J. Lengyel, “Compression of time dependent geometry,” Proc. 1999 ACM Symp. on Interactive 3D Graphics, pp.89-95, 1999
[5] M. Alexa and W. Müller, “Representing animations by principal components,” Proc. EUROGRAPHICS 2000, vol.19, no.3, 2000
[6] J.-H. Ahn, C.-S. Kim, C.-C. J. Kuo, and Y.-S. Ho, “Motion-compensated compression of 3D animation models,” IEE Electronics Letters, vol.37, no.24, pp.1445-1446, 2001
[7] M. Sattler, R. Sarlette, and R. Klein, “Simple and efficient compression of animation sequences,” Proc. EUROGRAPHICS 2005 / ACM SIGGRAPH Symp. on Computer Animation, pp.209-217
[8] Z. Karni and C. Gotsman, “Compression of soft-body animation sequences,” Computers & Graphics, vol.28, pp.25-34, 2004
[9] J.-H. Yang, C.-S. Kim, and S.-U. Lee, “Compression of 3-D triangle mesh sequences based on vertex-wise motion vector prediction,” IEEE Trans. Circuits and Systems for Video Technology, vol.12, no.12, pp.1178-1184, 2002
[10] L. Ibarria and J. Rossignac, “Dynapack: space-time compression of the 3D animations of triangle meshes with fixed connectivity,” Proc. EUROGRAPHICS 2003 / ACM SIGGRAPH Symp. on Computer Animation, pp.126-135, 2003
[11] J. Zhang and C.B. Owen, “Octree-based animated geometry compression,” Proc. 2004 IEEE Data Compression Conf., pp.508-517, 2004
[12] K. Müller, A. Smolic, M. Kautzner, P. Eisert, and T. Wiegand, “Predictive compression of dynamic 3D meshes,” Proc. 2005 Int. Conf. on Image Processing, Genova, Italy, 2005
[13] H.M. Briceño, P.V. Sander, L. McMillan, S. Gortler, and H. Hoppe, “Geometry videos: A new representation for 3D animations,” Proc. EUROGRAPHICS 2003 / ACM SIGGRAPH Symp. on Computer Animation, pp.136-146, 2003
[14] I. Guskov and A. Khodakovsky, “Wavelet compression of parametrically coherent mesh sequences,” Proc. EUROGRAPHICS 2004 / ACM SIGGRAPH Symp. on Computer Animation, pp.183-192, 2004
[15] F. Payan, Y. Boulfani, and M. Antonini, “Temporal lifting scheme for the compression of animated sequences of meshes,” Proc. IEEE Int. Workshop VLBV 2005, Sardinia, Italy, 2005
[16] K.-L. Ma, D. Smith, M.-Y. Shih, and H.-W. Shen, “Efficient encoding and rendering of time-varying volume data,” NASA/CR-1998-208424 ICASE Report No.98-22, 1998
[17] E.B. Lum, K.-L. Ma, and J. Clyne, “Texture hardware assisted rendering of time-varying volume data,” Proc. IEEE Visualization 2001, pp.263-270, 2001
[18] A. Shamir, V. Pascucci, and C. Bajaj, “Multi-resolution dynamic meshes with arbitrary deformations,” Proc. IEEE Visualization 2000, pp.423-430, 2000
[19] S. Guthe and W. Strasser, “Real-time decompression and visualization of animated volume data,” Proc. IEEE Visualization 2001, pp.349-356, 2001
[20] B.-S. Sohn, C. Bajaj, and V. Siddavanahalli, “Volumetric video compression for interactive playback,” Computer Vision and Image Understanding, vol.96, pp.435-452, 2004
[21] W. Sweldens, “The lifting scheme: A custom-design construction of biorthogonal wavelets,” Applied and Computational Harmonic Analysis, vol.3, no.2, pp.186-200, 1996
[22] A.R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, “Lossless image compression using integer to integer wavelet transforms,” Proc. IEEE Int. Conf. Image Processing, pp.596-599, 1997
[23] B. Pesquet-Popescu and V. Bottreau, “Three-dimensional lifting schemes for motion compensated video compression,” Proc. IEEE Int. Conf. Acoustics Speech Signal Processing, pp.1793-1796, 2001
[24] A. Secker and D. Taubman, “Lifting-based invertible motion adaptive transform (LIMAT) framework for highly scalable video compression,” IEEE Trans. Image Processing, vol.12, no.12, pp.1530-1542, 2003
