VISION-BASED CONTROL OF 3D FACIAL ANIMATION

Jin-xiang Chai · Jing Xiao · Jessica Hodgins
Carnegie Mellon University
Eurographics / SIGGRAPH Symposium on Computer Animation 2003

Presented by Yusuf Osmanlıoğlu, 2010

OUTLINE
•  Aim
•  Existing techniques
•  Proposed method and challenges
•  Related work
•  Overall system
•  Analysis of the system
•  Results
•  Future work

AIM
“Interactive avatar control”
•  Designing a rich set of realistic facial actions for a virtual character
•  Providing intuitive and interactive control over these actions in real time

EXISTING TECHNIQUES
•  Physically modeling the skin and muscles of the face
•  Motion capture techniques
   –  Vision-based
   –  Online


Motion Capture Techniques

  Technique                Control interface              Quality
  Vision-based animation   + Inexpensive, + easy to use   - Noisy, - low resolution
  Online motion capture    - Expensive, - troublesome     + High resolution

PROPOSED METHOD

  Vision-based interface + motion capture database → interactive avatar control

CHALLENGES
•  Map low-quality visual signals to high-quality motion data
•  Extract meaningful animation control signals from the video sequence of a live performer in real time
•  Deform the vertices of the face model into facial expressions from the displacements of a limited number of markers
•  Allow any user to control any 3D face model

RELATED WORK
•  Keyframe interpolation
•  Performance capture
•  Pseudo-muscle-based / muscle-based simulation
•  2D facial data for speech (viseme-driven approach)
•  Full 3D motion capture data

RELATED WORK
Motion capture
•  Making Faces [Guenter et al. 98]
•  Expression Cloning [Noh and Neumann 01]

Vision-based tracking for direct animation
•  Physical markers [Williams 90]
•  Edges [Terzopoulos and Waters 93; Lanitis et al. 97]
•  Optical flow with 3D models [Essa et al. 96; Pighin et al. 99; DeCarlo et al. 00]

Vision-based animation with blendshapes
•  Hand-drawn expressions [Buck et al. 00]
•  3D avatar model [FaceStation]

SYSTEM OVERVIEW

  Video analysis (performance capture) extracts the 3D head pose and the expression control signals.
  Expression control and animation maps the control signals onto the preprocessed motion capture data.
  Expression retargeting transfers the synthesized expression to the avatar, which is animated together with the head pose.

Video Analysis
•  Vision-based facial tracking
   –  Tracking 19 2D features on the face
   –  2 × lips, 2 × mouth, 4 × eyebrows, 8 × eyes, 3 × nose

•  Initialization
   –  Neutral face
   –  Positioning and initializing the parameters of the cylinder model to capture head pose
   –  Manually positioning the 19 feature points

•  Tracking the pose of the head
   –  6 DOF: yaw, pitch, roll, and 3D position
   –  Updating position and orientation per frame
   –  Resetting accumulated errors

•  Expression tracking
   –  Defining square windows centered at each feature’s position (see the sketch below)
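Tracking within square windows can be illustrated with simple template matching. The sketch below uses OpenCV; the window and search sizes are assumed values, and this is only a stand-in for the paper's actual tracker.

```python
import cv2
import numpy as np

WIN = 11      # template window size (assumed; the slides do not specify it)
SEARCH = 25   # search window size around the previous feature position (assumed)

def track_feature(prev_gray, cur_gray, pos):
    """Re-locate one 2D feature by matching its square template from the
    previous frame inside a search window in the current frame."""
    x, y = int(pos[0]), int(pos[1])
    h, s = WIN // 2, SEARCH // 2
    template = prev_gray[y - h:y + h + 1, x - h:x + h + 1]
    search = cur_gray[y - s:y + s + 1, x - s:x + s + 1]
    # Normalized cross-correlation is robust to uniform lighting changes.
    score = cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, best = cv2.minMaxLoc(score)
    # Convert the best match back to image coordinates (template center).
    return (x - s + best[0] + h, y - s + best[1] + h)
```

Image-border handling is omitted here for brevity; a real tracker would clamp the windows to the frame.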

Video Analysis
•  Expression control parameters
   –  15 parameters extracted automatically from the 2D tracking points
   –  Mouth (6), nose (2), eyes (2), eyebrows (5)
•  The parameters are computed as
   –  distances between two tracking points,
   –  distances between a line and a point,
   –  and the orientation and center of the mouth,
   which together form the expression control signal (see the sketch below).
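These three kinds of measurement reduce to a few lines of vector arithmetic. A sketch in Python/NumPy; which tracking points feed which parameter is the paper's choice, and the pairings in the comments are only plausible examples.

```python
import numpy as np

def point_distance(p, q):
    """Control parameter: distance between two tracking points
    (e.g., between upper- and lower-lip points for mouth opening)."""
    return np.linalg.norm(np.asarray(p, float) - np.asarray(q, float))

def point_line_distance(p, a, b):
    """Control parameter: distance from point p to the line through a and b
    (e.g., an eyebrow point relative to an eye axis)."""
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    d, r = b - a, p - a
    # Perpendicular distance via the 2D cross product.
    return abs(d[0] * r[1] - d[1] * r[0]) / np.linalg.norm(d)

def mouth_orientation_and_center(left_corner, right_corner):
    """Control parameters: orientation angle and center of the mouth,
    computed from the two mouth-corner tracking points."""
    left = np.asarray(left_corner, float)
    right = np.asarray(right_corner, float)
    center = (left + right) / 2.0
    angle = np.arctan2(right[1] - left[1], right[0] - left[0])
    return angle, center
```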


Motion Capture Data Preprocessing
•  Building the face model from a 3D laser scan
•  Motion capture
   –  76 reflective markers attached to the actor’s face
   –  The actor is allowed to move his head freely
•  Head and facial movements are therefore coupled
   –  Pose and expression must be decoupled

Motion Capture Data Preprocessing
•  Expression separation: the raw marker data is decoupled into 3D head poses and pure expressions (see the sketch below)
•  Expression control parameters are then extracted from the separated expressions
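Decoupling pose from expression amounts to estimating the rigid head transform in every frame and removing it. A hedged sketch using the standard Kabsch/SVD least-squares fit against the neutral-face markers; the paper's actual separation procedure may differ in detail.

```python
import numpy as np

def rigid_pose(neutral, frame):
    """Estimate the rigid head pose (R, t) that best maps the neutral
    marker set onto this frame's markers (least-squares Kabsch/SVD fit).
    Both inputs are 76x3 arrays of marker positions."""
    cn, cf = neutral.mean(axis=0), frame.mean(axis=0)
    H = (neutral - cn).T @ (frame - cf)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cf - R @ cn
    return R, t

def separate_expression(neutral, frame):
    """Remove the rigid head pose from one marker frame, leaving the pure
    expression displacement in the head's local coordinate frame."""
    R, t = rigid_pose(neutral, frame)
    return (R.T @ (frame - t).T).T - neutral
```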

Motion Capture Data Preprocessing
Motion capture database
•  70,000 frames with a 120 fps camera (~10 minutes of recording)
•  76 reference points on the face
•  6 basic facial expressions: anger, fear, surprise, sadness, joy, disgust
•  Other actions: eating, yawning, snoring
•  Each expression repeated 6 times during the mocap session
•  Very limited motion data related to speaking (6,000 frames), which does not cover all variations of facial movements during speech


Expression Control and Animation

  Vision-based interface:   2D tracking data (19 × 2 DOF)  →  facial expression control parameters (15 DOF)
  Motion capture database:  3D motion data (76 × 3 DOF)    →  facial expression control parameters (15 DOF)

Expression Control and Animation
•  Visual expression control signals are very noisy
•  The mapping from the expression control signal space (15 DOF) to the 3D motion space (76 × 3 DOF) is one-to-many

Expression Control and Animation

Data-driven filtering of the noisy control signal:
•  Nearest-neighbor search in the preprocessed motion capture database returns the K = 120 closest examples
•  Comparison is over a time window of W = 20 frames at 60 fps (0.33 s)
•  Online PCA over the neighbors yields eigen-curves; the 7 largest capture 99.5% of the energy
•  Projecting the noisy signal onto these eigen-curves gives the filtered control signal (see the sketch below)
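The filtering step can be sketched as follows. K = 120, W = 20, and the 7 eigen-curves come from the slides; the Euclidean window distance, the database layout, and the use of a batch SVD as a stand-in for the online PCA are assumptions.

```python
import numpy as np

K, W, N_EIG = 120, 20, 7   # slide values: 120 neighbors, 20-frame window, 7 eigen-curves

def filter_control_signal(noisy_window, database_windows):
    """Denoise a W-frame window of 15-DOF control signals by projecting it
    onto the principal components (eigen-curves) of its K nearest database
    windows.  Windows are flattened to vectors of length W*15."""
    q = noisy_window.reshape(-1)
    X = database_windows.reshape(len(database_windows), -1)
    # K nearest database windows under Euclidean distance (assumed metric).
    idx = np.argsort(np.linalg.norm(X - q, axis=1))[:K]
    nbrs = X[idx]
    # PCA over the neighbors: the top eigen-curves span the plausible motions.
    mean = nbrs.mean(axis=0)
    _, _, Vt = np.linalg.svd(nbrs - mean, full_matrices=False)
    basis = Vt[:N_EIG]                    # 7 largest eigen-curves (~99.5% energy)
    # Project the noisy window into this subspace and reconstruct.
    coeffs = basis @ (q - mean)
    return (mean + basis.T @ coeffs).reshape(noisy_window.shape)
```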

Expression Control and Animation
•  The filtered control signal is matched against the database with a nearest-neighbor search
•  The K closest examples, at distances d1, d2, …, dK, are blended with weights w(d1), w(d2), …, w(dK) (see the sketch below)
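The blend itself is a normalized weighted sum of the K nearest 3D examples. A sketch with inverse-distance weights, which is an assumed choice of w(d); the slides do not specify the weight function.

```python
import numpy as np

def synthesize_expression(distances, examples, eps=1e-6):
    """Blend the K nearest motion-capture examples (each a 76x3 marker frame)
    into one synthesized 3D expression, weighting closer matches more heavily.
    Inverse-distance weighting is an assumed form of w(d)."""
    d = np.asarray(distances, float)
    w = 1.0 / (d + eps)        # w(d_k): closer examples get larger weights
    w /= w.sum()               # normalize so the weights sum to one
    # Weighted sum over the K example frames.
    return np.tensordot(w, np.asarray(examples, float), axes=1)
```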


EXPRESSION RETARGETING

  Synthesized expression on the source model → expression on the avatar

EXPRESSION RETARGETING

A motion vector δxs at a source surface point xs must be mapped to a motion vector δxt at the corresponding target point xt:

•  Learn the surface mapping function f using Radial Basis Functions such that xt = f(xs)
•  Transfer the motion vector through the local Jacobian matrix Jf(xs): δxt = Jf(xs) δxs
•  The run-time computational cost is independent of the number of vertices of the head model (see the sketch below)
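A minimal sketch of both steps, using a Gaussian RBF kernel; the kernel and its width sigma are assumed choices. Because the Jacobian has a closed form, it can be precomputed at the source points, which is one way the run-time cost stays independent of the mesh resolution.

```python
import numpy as np

def fit_rbf(src_pts, tgt_pts, sigma=1.0):
    """Fit an RBF interpolant f with f(src_pts[i]) = tgt_pts[i], using a
    Gaussian kernel (assumed).  Returns one weight vector per output dim."""
    d2 = np.sum((src_pts[:, None] - src_pts[None]) ** 2, axis=-1)
    Phi = np.exp(-d2 / (2 * sigma ** 2))
    return np.linalg.solve(Phi, tgt_pts)          # (N, 3) weights

def rbf_jacobian(x, src_pts, weights, sigma=1.0):
    """Local Jacobian J_f(x) of the Gaussian RBF map, in closed form:
    d/dx exp(-|x - c|^2 / 2s^2) = -((x - c) / s^2) * exp(...)."""
    diff = x[None] - src_pts                      # (N, 3)
    phi = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma ** 2))
    dphi = -(diff / sigma ** 2) * phi[:, None]    # (N, 3) basis gradients
    return weights.T @ dphi                       # (3, 3) Jacobian

def retarget(dx_s, x_s, src_pts, weights, sigma=1.0):
    """Transfer a source motion vector: dx_t = J_f(x_s) @ dx_s."""
    return rbf_jacobian(x_s, src_pts, weights, sigma) @ dx_s
```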


RESULTS

CONCLUSIONS
Developed a performance-based facial animation system for interactive expression control:
•  Tracking facial movements in video in real time
•  Preprocessing the motion capture database
•  Transforming the low-quality 2D visual control signal into high-quality 3D facial expressions
•  An efficient online expression retargeting method

FUTURE WORK
•  A formal user study on the quality of the synthesized motion
•  Controlling and animating photorealistic 3D facial expressions
•  Enlarging the motion capture database
•  Speech as an input to the system

THANKS… QUESTIONS?
