Video Processing and Compression
Video
• Temporal Sequence of Images
  – I(x,y,t)
  – t = time
• Terminology
  – Frame
    • an image in the sequence, I(x,y,t=c)
  – frames per second (fps)
    • number of frames displayed per second
  – Field
    • Frames are often composed of two fields: even and odd
    • This procedure is called interlacing
Interlacing
• Say it takes time t=1 to capture a frame
  – the odd field is captured in the first t/2
  – the even field is captured in the second t/2
• The frame composite of the odd and even fields shows the "interlace" effect
Interlacing
• Legacy from analogue video
  – Fields were used due to bandwidth limitations of analogue transmission
  – Trades off spatial resolution for temporal resolution
• 30 frames per second
  – means 60 fields per second
  – some monitors display video frames in interlaced format
Terminology
• Progressive Scan Video Cameras
  – Can capture a whole frame in time t
    • Although the output format can still be interlaced
  – You will not get the "interlace" effect
    • that is, the odd and even fields are captured at the same time
  – Progressive scan is common on new models of camcorders
• Topics in this lecture (other than de-interlacing) will assume no interlacing in the frame
Terminology
• Critical Fusion Frequency
  – The temporal frame rate at which your eye no longer sees "flicker" in a video sequence
  – Generally considered to be around 24 fps
• Temporal Aliasing
  – Effect of a low frame rate
    • Jerky image
    • "Flicker"

Some Simple Video Processing
• De-interlacing
• Optical Flow/Motion Estimation
• Object Tracking
De-interlacing
• Frame is interlaced
  – (suffers from the interlace effect)
• De-interlacing
  – Create two new frames (same size as the original)
  – One using only the even scan lines
    • Interpolate the "odd" scan lines
  – One using only the odd scan lines
    • Interpolate the "even" scan lines
• Produces 60 frames per second
  – De-interlaced frames have lower vertical resolution
• Common pre-processing technique
De-interlacing Example
[Figure: a frame with the interlace effect is split into its even and odd "fields", and the missing scanlines of each are interpolated.]
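The scheme above can be sketched in a few lines of Python/NumPy. The simple line-averaging interpolation is an assumption for illustration; real de-interlacers use smarter filters:

```python
import numpy as np

def deinterlace(frame):
    """Create two full frames from one interlaced frame: one keeps the
    even scan lines and interpolates the odd ones, the other keeps the
    odd scan lines and interpolates the even ones (line averaging)."""
    h = frame.shape[0]
    even = frame.astype(np.float64).copy()
    odd = frame.astype(np.float64).copy()

    def fill(img, rows):
        # Replace each row in `rows` with the average of its vertical
        # neighbors (clamped at the image border).
        for y in rows:
            lo = y - 1 if y - 1 >= 0 else y + 1
            hi = y + 1 if y + 1 < h else y - 1
            img[y] = 0.5 * (img[lo] + img[hi])

    fill(even, range(1, h, 2))   # even-field frame: interpolate odd lines
    fill(odd, range(0, h, 2))    # odd-field frame: interpolate even lines
    return even, odd
```

Applied to every frame of a 30 fps interlaced stream, this yields the 60 (lower-vertical-resolution) frames per second described above.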
Optical Flow
• Estimating the motion field from image sequences
• Idea: the change in intensity over time is related to motion in the scene such that:
  – I(x,y,t-1) = I(x+u, y+v, t)
• Assumes
  – a constantly illuminated scene
  – fairly simple scene motion and object properties
3D Motion → 2D Motion Field
[Figure: an object/scene point M(t) moves by dM to M(t+dt); its velocity V = dM/dt projects onto the image plane as a displacement (dx,dy).]
• u = dx/dt, v = dy/dt
• The set of (u,v) vectors is called the "optical flow"
Motion Field
[Figure: the motion vector (u,v) maps a point in I(x,y,t-1) to its new location in I(x,y,t).]
• I(x,y,t-1) = I(x+u, y+v, t)
• [u,v] represents the projection of the 3D motion
Optical Flow Example
[Figure: frames I(x,y,t-1) and I(x,y,t), with a "needle representation" of the motion vectors, often shown at a "sparse" resolution.]
Matching Approach to Optical Flow (Motion Estimation)
• For each (x,y) in I(t-1)
  – Construct a template about (x,y)
  – Search a neighborhood of size W about (x,y) in I(t)
  – Find the best match
    • using correlation, SAD, SSD, etc.
  – Best match is at location (x', y')
  – Record the motion vector
    • V(x,y) = (x' - x, y' - y)
  – Loop
Match-based Motion Estimation
[Figure: in I(x,y,t-1), create a template around (x,y); in I(x,y,t), search a window W about (x,y), perform template matching, and find the best match (x', y').]
• Motion vector: u = x' - x, v = y' - y
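The per-pixel loop above can be sketched for a single location as an exhaustive SAD search (the template half-size and window half-size are the "magic numbers" the slides mention; their values here are arbitrary):

```python
import numpy as np

def block_match(prev, curr, x, y, tsize=8, w=7):
    """For the template centered at (x, y) in the previous frame, search
    a (2w+1)^2 neighborhood in the current frame and return the motion
    vector (u, v) = (x'-x, y'-y) of the best match, scored by SAD."""
    t = tsize
    tmpl = prev[y-t:y+t+1, x-t:x+t+1].astype(np.int64)
    best, best_uv = None, (0, 0)
    for v in range(-w, w + 1):
        for u in range(-w, w + 1):
            yy, xx = y + v, x + u
            if (yy - t < 0 or xx - t < 0 or
                    yy + t + 1 > curr.shape[0] or xx + t + 1 > curr.shape[1]):
                continue  # candidate window falls off the image
            cand = curr[yy-t:yy+t+1, xx-t:xx+t+1].astype(np.int64)
            sad = np.abs(cand - tmpl).sum()  # sum of absolute differences
            if best is None or sad < best:
                best, best_uv = sad, (u, v)
    return best_uv
```

Running this at every (x,y) of I(t-1) produces the dense (u,v) flow field; running it only on a sparse grid produces the "needle" visualizations shown earlier.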
Aperture Problem
[Figure: the actual motion of an edge from I(t-1) to I(t), versus the motion observed through a small aperture.]
• The motion perceived is only the normal component
• The component parallel to the edge cannot be estimated
Optical Flow Uses
• Motion vectors give insight into the 3D scene
  – Used to help segment moving objects
  – Determine the direction of objects
  – Other high-level analysis (take computer vision)
• Describes camera motion
  – Often called camera "ego" motion
[Figure: characteristic flow fields for zoom out, zoom in, and translation (or rotation).]
Optical Flow
• Many of the assumptions of optical flow are invalid
  – Constant intensity
    • changing illumination is perceived as motion
    • objects do not have constant reflectance
  – Slow-moving objects
    • we often have small objects with lots of occlusion/disocclusion
  – Sufficient aperture
    • as demonstrated, we may encounter the aperture problem
• Even with invalid assumptions, it still gives reasonable results
• Note: "magic numbers" for the template size and search window
Object Tracking
1. Locate the object
2. Track this object over time
• Generally assumes a fixed camera
Two Phases
• "Lock on"
  – Find the object to track
  – Use recognition techniques, or
  – the user selects the initial starting point
• Tracking based on the initial template
  – Construct a template around the object
  – Find the template in the next image
  – Perform template matching within a window
Tracking
[Figure: the template T is matched in successive frames I(t), I(t+1), I(t+2).]
• Feature-based tracking
  – Construct a template around the object
    • This fixes the size of the template!
  – Find the template in the next image
    • Perform template matching within a window about the current center
  – Update the template using the best match in advancing frames
    • this allows the template to adapt over time
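The steps above can be sketched as a small tracking loop. This is a minimal sketch assuming grayscale frames and SAD matching; the template/window half-sizes are illustrative defaults:

```python
import numpy as np

def track(frames, cx, cy, tsize=8, w=12):
    """Follow an object through `frames` by template matching.

    The template is re-cut from each best match so it can adapt over
    time, as described in the slides. Returns the object's path as a
    list of (cx, cy) centers."""
    t = tsize
    tmpl = frames[0][cy-t:cy+t+1, cx-t:cx+t+1].astype(np.int64)
    path = [(cx, cy)]
    for frame in frames[1:]:
        best, bx, by = None, cx, cy
        for v in range(-w, w + 1):        # search window about current center
            for u in range(-w, w + 1):
                yy, xx = cy + v, cx + u
                if (yy - t < 0 or xx - t < 0 or
                        yy + t + 1 > frame.shape[0] or
                        xx + t + 1 > frame.shape[1]):
                    continue
                sad = np.abs(frame[yy-t:yy+t+1, xx-t:xx+t+1].astype(np.int64)
                             - tmpl).sum()
                if best is None or sad < best:
                    best, bx, by = sad, xx, yy
        cx, cy = bx, by
        # Update the template from the best match so it adapts over time.
        tmpl = frame[cy-t:cy+t+1, cx-t:cx+t+1].astype(np.int64)
        path.append((cx, cy))
    return path
```

A practical tracker would also threshold `best` to detect when the object is "lost" (see Tracking Issues below); that check is omitted here for brevity.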
Tracking Issues
• Occlusion
  – Object moves away or becomes occluded
  – Need to threshold matches to determine when the object has been "lost"
  – Requires a new "lock on" phase
• Multiple objects
  – what happens when two similar objects "cross over"?
• Object deforms over time
  – Object exhibits substantial change
    • Lighting effects, specularity, pose
• Tracking is a hard problem!
Tracking
• Presented a simple tracking skeleton
  – Often want tracking to be performed in real time
  – This is a fast and simple brute-force approach
• Extensions
  – Speed up using prediction of motion based on previous estimates
    • See Kalman filters
  – Use better models than just template matching
    • Training set, template set
    • Deformable templates
    • Affine templates
Other Video Processing
• Digital re-mastering
  – Film is converted to a digital format
  – Allows us the opportunity to process the pixels
• Some examples
  – Removing scratches in film
    • Median filters
    • Median filters over time (use the t axis)
  – Digital effects
    • Post-video processing
    • Digital artists
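The median-over-time idea is compact enough to show directly; a sketch assuming a stack of registered grayscale frames:

```python
import numpy as np

def temporal_median(frames):
    """Per-pixel median along the time axis of a (t, h, w) frame stack.

    A scratch or dust spot visible in only one frame is an outlier
    along t and is rejected by the median."""
    return np.median(frames, axis=0)
```

This only works well where the scene is static (or has been motion-compensated); on moving regions the temporal median blurs the motion.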
Video Compression (MPEG)
Redundancies
• Three basic types of redundancies
  – Coding redundancy
  – Interpixel redundancy
    • Spatial-temporal redundancy
    • Temporal frame coherence
  – Psycho-visual redundancy
• Image/video compression
  – Reduce one or more of these redundancies
Image Compression (Remember JPEG)
• f(x,y) is divided into 8x8 blocks
• f(x,y) - 128 (normalize to the range -128 to 127)
• FDCT (forward DCT) on each block
• Quantize the DCT coefficients via a quantization "table" T(u,v):
  – C'(u,v) = round(C(u,v)/T(u,v))
• DC component: differential coding
• AC coefficients: "zig-zag" order, then RLE
• Huffman encode → JPEG bitstream
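The quantization step, which is where JPEG discards information, is one line in each direction; a sketch with an arbitrary uniform table for illustration:

```python
import numpy as np

def quantize(C, T):
    """JPEG-style quantization of a block of DCT coefficients:
    C'(u,v) = round(C(u,v) / T(u,v))."""
    return np.round(C / T).astype(np.int64)

def dequantize(Cq, T):
    # Decoder side: multiply back. The rounding loss in quantize()
    # is what makes the scheme lossy (and compressive).
    return Cq * T
```

Larger table entries T(u,v) (typically used for high frequencies) drive more coefficients to zero, which is what the zig-zag/RLE stage then exploits.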
Video Compression
• Motion-JPEG
  – I(x,y,t)
  – Each frame (t) is encoded as a JPEG
  – Reduces spatial redundancies within each frame
• MPEG
  – Exploits temporal coherence
  – The image does not change much from frame to frame
  – If it does change, it is due to some small motion in the image
MPEG
• Three frame types
  – I frame
    • Intra-encoded frame
    • Fully encoded JPEG-style frame
  – P frame
    • Predictive frame
  – B frame
    • Bi-directional predictive frame
MPEG
• Each frame is divided into 8x8 blocks
  – The 8x8 blocks will be DCT encoded
• We also create 16x16 logical "macro-blocks"
  – These will be used for motion estimation
[Figure: 8x8 blocks grouped into 16x16 macro-blocks.]
Predictive Frame
• Determine a motion vector for each macro-block in the P-frame
  – The motion vector points to the most "similar" macro-block in a previous I-frame or P-frame
  – Often referred to as Motion Compensation (MC)
• Encode the motion vector (u,v) for each macro-block
• Calculate the residual error (difference of the macro-blocks)
  – Encode the residual as four 8x8 DCT blocks (like JPEG)
  – The residual DCT values are quantized (compression)
[Figure: a macro-block in I(t) matched against I(t-1).]
• NOTE: for static background you get a (0,0) motion vector and no residual [good compression!]
MPEG Sequence with P-frames
[Figure: I P P P P — each P-frame is predicted from the frame before it.]
B-Frames (Bi-directional Predictive Frames)
• Can use previous or future P or I frames for motion compensation
  – YES, it uses a future frame
  – Not for real-time encoding
• Requires the frames to be transmitted out of order
  – and/or buffering
[Figure: I P B P P — the B-frame is predicted from both a previous and a future reference frame.]
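The three B-frame prediction modes can be sketched for one motion-compensated block (mode names here are illustrative; the references are assumed to already be motion-compensated):

```python
import numpy as np

def b_frame_predict(prev_ref, next_ref, mode="bi"):
    """B-frame prediction sketch: predict a block from the past
    reference, the future reference, or their average (bidirectional).
    The encoder keeps whichever mode gives the smallest residual."""
    if mode == "forward":
        return prev_ref.astype(np.int64)
    if mode == "backward":
        return next_ref.astype(np.int64)
    # bidirectional: average the two motion-compensated references
    return (prev_ref.astype(np.int64) + next_ref.astype(np.int64)) // 2
```

Averaging two references often halves the residual for smoothly changing content, which is why B-frames compress so well despite needing out-of-order transmission.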
Group of Pictures (GOP)
• A sequence of I-, P-, and B-frames forms a logical GOP
  – A GOP always starts with an I-frame
• Note that if you lose the initial I-frame in "transmission"
  – The whole GOP is corrupted
• Common GOP formats
  – IPPPP
  – IBBPBBPBBPBBPBB
[Figure: two consecutive GOPs, e.g. I P B P P | I P B P P, each starting with an I-frame.]
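The out-of-order transmission that B-frames force can be sketched as a reordering of the display-order GOP (frame labels here are illustrative strings):

```python
def transmission_order(gop):
    """Reorder a display-order GOP for transmission: a B-frame needs
    both of its anchor frames before it can be decoded, so each run of
    B-frames is sent after the I/P frame that follows it in display
    order."""
    out, pending_b = [], []
    for f in gop:
        if f.startswith("B"):
            pending_b.append(f)      # hold until the next anchor is sent
        else:                        # I or P anchor frame
            out.append(f)
            out.extend(pending_b)
            pending_b = []
    return out + pending_b
```

The decoder buffers each anchor ("future frame store" in the decoder diagram) so the held-back B-frames can be reconstructed and displayed in the original order.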
MPEG Decoding
[Block diagram: Input Buffer → ZZ and DeQuant → IDCT → Adder → Display; the MC Decoder drives the Forward MC (previous frame store), Backward MC (future frame store), and Bidirectional MC paths that feed the Adder.]
Motion Compensation
• Provides most of MPEG's compression
• Relies on temporal coherence
• Finding a good motion vector is essentially a search problem
• Evaluating the motion vector can be a bit tricky
  – Often trade speed for accuracy
• MC is what makes MPEG asymmetric
  – Harder to encode than to decode
Exhaustive MC Search
• Brute-force calculation of MC
  – Using SAD, SSD
• The most obvious and easiest solution
• Encoding time is related to the size of the search window
• Although time consuming, it is also embarrassingly parallel
MPEG
• DCT quantization of I-frames and residuals controls frame fidelity
• Note that error propagates as we have more and more predictive frames (B- and P-frames)
• The resulting bit-rate is scene dependent!
• For more MPEG details see:
  – www.mpeg.org
Useful Tool
• Berkeley MPEG encoder on unix machines
• mpeg_encode
  – utility to convert pgm/ppm files to MPEG
  – usage:
    • mpeg_encode param_file
  – Param file
    • Describes the input files and MPEG format
    • See the class web page for an example
Summary: Video Processing
• Processing
  – De-interlacing
    • Pre-processing step to remove the interlacing effect
  – Optical flow
    • Motion vector estimation
  – Object tracking
    • Temporal template matching
    • Constrained search window
• Video compression
  – MPEG
    • Powerful compression exploiting temporal coherence
    • I, P, and B frames
      – P and B frames use motion estimation for macro-blocks
      – Encode a residual error
Active Research Areas
• Optical flow
  – Computationally expensive process
  – Research into faster techniques
  – Dense samples
• Tracking
  – Tracking is a hot topic
  – Recognition over temporal frames
  – Multiple objects
    • Especially multiple interacting objects
  – New approaches to finding the object
  – Staying locked on in the face of occlusions
Active Research Areas
• Video encoding hot topics
  – Video codecs
    • new techniques, often for non-real-time apps
  – Constant bit-rate for MPEG
  – Robustness to transmission errors
  – VOD (video on demand)
  – Streaming technology
  – Motion compensation strategies
  – MPEG-4
    • Wrapper for all types of data (not just video)