VISION-BASED CONTROL OF 3D FACIAL ANIMATION
Jin-xiang Chai, Jing Xiao, Jessica Hodgins
Carnegie Mellon University
ACM SIGGRAPH / Eurographics Symposium on Computer Animation 2003
Yusuf OSMANLIOĞLU 2010
OUTLINE
• Aim
• Existing techniques
• Proposed method and challenges
• Related work
• Overall system
• Analysis of system
• Results
• Future work
AIM
"Interactive avatar control"
• Designing a rich set of realistic facial actions for a virtual character
• Providing intuitive and interactive control over these actions in real time
EXISTING TECHNIQUES
• Physically modeling the skin and muscles of the face
• Motion capture techniques
  – Vision based
  – Online
Motion Capture Techniques: comparison
• Vision-based animation
  – Control interface: + inexpensive, + easy to use
  – Quality: – noisy, – low resolution
• Online motion capture
  – Control interface: – expensive, – troublesome
  – Quality: + high resolution
PROPOSED METHOD
• Vision-based interface + Motion capture database → Interactive avatar control
CHALLENGES
• Map low-quality visual signals to high-quality motion data
• Extract meaningful animation control signals from the video of a live performer in real time
• Deform the vertices of the face model into the desired expression from the displacements of a limited number of markers
• Allow any user to control any 3D face model
RELATED WORK
• Keyframe interpolation
• Performance capture
• Pseudo-muscle-based / muscle-based simulation
• 2D facial data for speech (viseme-driven approach)
• Full 3D motion capture data
RELATED WORK
• Motion capture
  – Making Faces [Guenter et al. 98]
  – Expression Cloning [Noh and Neumann 01]
• Vision-based tracking for direct animation
  – Physical markers [Williams 90]
  – Edges [Terzopoulos and Waters 93, Lanitis et al. 97]
  – Optical flow with 3D models [Essa et al. 96, Pighin et al. 99, DeCarlo et al. 00]
• Vision-based animation with blendshapes
  – Hand-drawn expressions [Buck et al. 00]
  – 3D avatar model [FaceStation]
SYSTEM OVERVIEW (pipeline)
• Video analysis → expression control signal + 3D head pose
• Performance capture → preprocessed motion capture data
• Expression control and animation: control signal + database → synthesized 3D expression
• Expression retargeting → avatar animation
Video Analysis
• Vision-based facial tracking
  – Tracking 19 2D features on the face
  – 2×lip, 2×mouth, 4×eyebrow, 8×eye, 3×nose points
• Initialization
  – Neutral face
  – Positioning and initializing the parameters of the cylindrical head model used to capture head pose
  – Manually positioning the 19 feature points
• Tracking the head pose
  – 6 DOF: yaw, pitch, roll, 3D position
  – Updating position and orientation every frame
  – Resetting accumulated errors
• Expression tracking (a tracking sketch follows below)
  – Defining square search windows centered at each feature's position
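The slides do not give the exact tracking procedure; below is a minimal sketch of per-feature tracking by normalized cross-correlation template matching inside a square search window, using OpenCV. The window sizes and the helper name track_feature are illustrative assumptions, not the paper's implementation.

```python
import cv2
import numpy as np

def track_feature(prev_gray, cur_gray, pt, tmpl_size=11, search_size=31):
    """Track one 2D feature by template matching inside a square search
    window centered at its previous position (window sizes are assumptions)."""
    x, y = int(round(pt[0])), int(round(pt[1]))
    t, s = tmpl_size // 2, search_size // 2

    # Template around the old position, search window in the new frame
    # (assumes both windows lie fully inside the image).
    template = prev_gray[y - t:y + t + 1, x - t:x + t + 1]
    search = cur_gray[y - s:y + s + 1, x - s:x + s + 1]

    # Normalized cross-correlation; the best match gives the new position.
    scores = cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, best = cv2.minMaxLoc(scores)

    # Map the match location (top-left in the search window) back to
    # image coordinates of the template center.
    return np.array([x - s + best[0] + t, y - s + best[1] + t], dtype=float)

# Usage: features is a 19x2 array initialized on the neutral face, then
# features = np.array([track_feature(prev, cur, p) for p in features])
```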
Video Analysis
• Expression control parameters
  – 15 parameters extracted automatically from the 2D tracking points
  – Mouth (6), nose (2), eyes (2), eyebrows (5)
  – Measurements used: distance between two tracking points, distance between a line and a point, orientation and center of the mouth
  – Together these form the expression control signal (a sketch of these measurements follows below)
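As an illustration of how the 15 control parameters can be derived from the tracked points, here is a small sketch of the three measurement types named above: point-to-point distance, point-to-line distance, and the orientation and center of the mouth. The helper names and any specific point groupings are hypothetical.

```python
import numpy as np

def point_distance(p, q):
    """Distance between two tracking points (e.g., upper and lower lip)."""
    return np.linalg.norm(p - q)

def point_line_distance(p, a, b):
    """Distance from point p to the line through a and b
    (e.g., an eyebrow point to the line through the eye corners)."""
    d = (b - a) / np.linalg.norm(b - a)
    return np.linalg.norm((p - a) - np.dot(p - a, d) * d)

def mouth_center_and_orientation(mouth_pts):
    """Center and in-plane orientation of the mouth tracking points."""
    center = mouth_pts.mean(axis=0)
    # The leading principal axis of the points gives the mouth orientation.
    _, _, vt = np.linalg.svd(mouth_pts - center)
    angle = np.arctan2(vt[0, 1], vt[0, 0])
    return center, angle
```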
SYSTEM OVERVIEW (recap) → next: performance capture and motion capture data preprocessing
Motion Capture Data Preprocessing
• Building the face model from a 3D laser scan
• Motion capture
  – Attaching 76 reflective markers to the actor's face
  – The actor is allowed to move his head freely
• Head and facial movements are therefore coupled
  – Pose and expression must be decoupled
Motion Capture Data Preprocessing (pipeline)
• Expression separation: split each captured frame into 3D head pose and facial expression (a decoupling sketch follows below)
• Expression control parameter extraction from the separated expressions
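The decoupling of head pose and expression can be sketched as removing, per frame, the rigid transform that best aligns the 76 markers to a neutral reference frame (Kabsch/Procrustes alignment). This is an assumed formulation of the expression separation step, not necessarily the paper's exact one.

```python
import numpy as np

def separate_expression(frame, reference):
    """Split a marker frame into rigid head pose and residual expression.
    frame, reference: (76, 3) arrays; reference is e.g. the neutral face."""
    c_f, c_r = frame.mean(axis=0), reference.mean(axis=0)
    A, B = frame - c_f, reference - c_r

    # Kabsch algorithm: rotation R that best maps the frame onto the reference.
    U, _, Vt = np.linalg.svd(A.T @ B)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    expression = (R @ A.T).T + c_r      # markers with head motion removed
    pose = (R, c_r - R @ c_f)           # rigid head pose (rotation, translation)
    return expression, pose
```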
Motion Capture Data Preprocessing
• Motion capture database
  – 70,000 frames captured with a 120 fps camera (~10 minutes of recording)
  – 76 reference points on the face
  – 6 basic facial expressions: anger, fear, surprise, sadness, joy, disgust
  – Plus eating, yawning, and snoring
  – Each expression repeated 6 times during the mocap session
  – Very limited motion data related to speaking (6,000 frames)
  – Does not cover all variations of facial movements related to speech
SYSTEM OVERVIEW (recap) → next: expression control and animation
Expression Control and Animation
• Vision-based interface: 2D tracking data (19×2 DOF) → facial expression control parameters (15 DOF)
• Motion capture database: 3D motion data (76×3 DOF) → facial expression control parameters (15 DOF)
Expression Control and Animation
• Visual expression control signals are very noisy (plot: raw expression control signal vs. filtered expression control parameter)
• One-to-many mapping from the expression control signal space (15 DOF) to the 3D motion space (76×3 DOF)
Expression Control and Animation: data-driven filtering
• Segment the noisy control signal into time windows of W = 20 frames / 60 fps ≈ 0.33 s
• Nearest-neighbor search in the preprocessed motion capture database → K = 120 closest examples
• Online PCA over those examples → 7 largest eigen-curves (99.5% of the energy)
• Filter the noisy window with the eigen-curves → filtered control signal (a filtering sketch follows below)
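A minimal sketch of this data-driven filtering, assuming the control signal is handled in fixed-length windows: find the K nearest example segments in the database, run PCA over them, and project the noisy window onto the leading eigen-curves. The distance metric and projection details are simplified assumptions.

```python
import numpy as np

def filter_control_signal(noisy_window, database_segments, k=120, n_components=7):
    """Data-driven filtering of one window of noisy control parameters.
    noisy_window:      (W, 15) window extracted from video
    database_segments: (N, W, 15) example windows from the mocap database"""
    W, D = noisy_window.shape
    query = noisy_window.reshape(-1)
    examples = database_segments.reshape(len(database_segments), -1)

    # K nearest example segments (Euclidean distance on flattened windows).
    dists = np.linalg.norm(examples - query, axis=1)
    nearest = examples[np.argsort(dists)[:k]]

    # PCA over just these K examples ("online"); keep the leading eigen-curves.
    mean = nearest.mean(axis=0)
    _, _, Vt = np.linalg.svd(nearest - mean, full_matrices=False)
    basis = Vt[:n_components]            # e.g. the 7 largest eigen-curves

    # Project the noisy window onto the eigen-curve subspace to filter it.
    filtered = mean + basis.T @ (basis @ (query - mean))
    return filtered.reshape(W, D)
```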
Expression Control and Animation: motion synthesis
• Nearest-neighbor search with the filtered control signal → K closest motion examples at distances d1, ..., dK
• Blend the examples with distance-based weights w(d1), ..., w(dK) to synthesize the 3D facial motion (a blending sketch follows below)
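The slide does not specify the weighting function w(d); the sketch below uses a simple normalized inverse-distance weighting as a stand-in to blend the K nearest motion examples into one synthesized expression.

```python
import numpy as np

def blend_examples(distances, example_motions, eps=1e-6):
    """Blend K motion examples with distance-based weights w(d_i).
    distances:       (K,) distances d_1..d_K from the nearest-neighbor search
    example_motions: (K, 76, 3) corresponding 3D marker configurations"""
    d = np.asarray(distances, dtype=float)
    w = 1.0 / (d + eps)          # assumed weighting: closer examples count more
    w /= w.sum()                 # normalize so the weights sum to one
    # Weighted sum of the examples gives the synthesized 3D expression.
    return np.tensordot(w, example_motions, axes=1)
```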
SYSTEM OVERVIEW (recap) → next: expression retargeting
EXPRESSION RETARGETING
• Synthesized expression (on the captured face model) → avatar expression
EXPRESSION RETARGETING
(figure: source point xs with displacement δxs mapped to target point xt with displacement δxt)
• Learn the surface mapping function with radial basis functions such that xt = f(xs)
• Transfer each motion vector with the local Jacobian matrix Jf(xs): δxt = Jf(xs) δxs
• The run-time computational cost is independent of the number of vertices in the head model (a retargeting sketch follows below)
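A sketch of this retargeting step under stated assumptions: fit an RBF mapping f from corresponding source and target surface points with SciPy's RBFInterpolator, then push a source displacement through a finite-difference approximation of the local Jacobian. The kernel choice is illustrative, and the Jacobian of an RBF can also be written in closed form; finite differences are used here only to keep the sketch short.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def build_surface_mapping(source_pts, target_pts):
    """Learn x_t = f(x_s) from corresponding (n, 3) source/target vertices."""
    return RBFInterpolator(source_pts, target_pts, kernel='thin_plate_spline')

def transfer_displacement(f, x_s, dx_s, h=1e-4):
    """Transfer a source motion vector: dx_t = J_f(x_s) dx_s.
    The 3x3 Jacobian is approximated by central finite differences."""
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros(3)
        e[j] = h
        J[:, j] = (f((x_s + e)[None]) - f((x_s - e)[None]))[0] / (2 * h)
    return J @ dx_s
```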
SYSTEM OVERVIEW (recap) → next: avatar animation and results
RESULTS
CONCLUSIONS
Developed a performance-based facial animation system for interactive expression control
• Tracking facial movements in video in real time
• Preprocessing the motion capture database
• Transforming low-quality 2D visual control signals into high-quality 3D facial expressions
• An efficient online expression retargeting method
FUTURE WORK
• A formal user study on the quality of the synthesized motion
• Controlling and animating photorealistic 3D facial expressions
• Increasing the size of the database
• Speech as an input to the system