Overview: Video Coding Standards

Author: Thomas Jacobs

• Video coding standards: applications and common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.264/AVC

Bernd Girod: EE398B Image Communication II

Video Coding Standards: H.264/AVC no. 1

The JVT Project

• ITU-T Q.6/SG16 (VCEG, Video Coding Experts Group): formed for ITU-T standardization activity on video compression, since 1997
• August 1999: 1st test model (TML-1) of H.26L
• December 2001: formation of the Joint Video Team (JVT) between VCEG and ISO/IEC JTC 1/SC 29/WG 11 (MPEG) to establish a joint standard project: H.264 / MPEG-4 AVC
• ITU-T approval: May 2003
• ISO/IEC approval: October 2003

[source: G. Sullivan, VCEG]

JVT Goals

• Improved coding efficiency
  - Average bit rate reduction of 50% given fixed fidelity compared to any other standard
  - Trade-off complexity vs. coding efficiency
• Improved network friendliness
  - Anticipate error-prone transport over mobile networks and the wired and wireless Internet
  - Further improve robustness techniques in H.263 and MPEG-4
• Simple syntax specification
  - Avoid excessive quantity of optional features
  - Minimize number of "profiles" for distinct application areas

[source: G. Sullivan, VCEG]

H.264/JVT Applications

• Entertainment Video
  - Broadcast: Terrestrial / Satellite / Cable ...
  - Storage: DVD / HD-DVD / PVR ...
• Conversational Services
  - H.320 Conversational
  - 3GPP Conversational H.324/M
  - H.323 Conversational Internet/best effort IP/RTP
  - 3GPP Conversational IP/RTP/SIP
• Video Streaming
  - 3GPP Streaming IP/RTP/RTSP
  - Streaming IP/RTP/RTSP (without TCP fallback)
• Other Applications
  - 3GPP Multimedia Messaging Services
  - Digital camcorder

[source: G. Sullivan, VCEG]

Relationship to Other Standards

• Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
• In ITU-T / VCEG this is a new & separate standard
  - ITU-T Recommendation H.264
  - ITU-T Systems (H.32x) will be modified to support it
• In ISO/IEC / MPEG this is a new "part" in the MPEG-4 suite
  - Separate codec design from prior MPEG-4 visual
  - New Part 10 called "Advanced Video Coding" (AVC, similar to "AAC" in MPEG-2 as separate audio codec)
  - MPEG-4 Systems / File Format has been modified to support it
  - H.222.0 | MPEG-2 Systems also modified to support it
• IETF: RTP payload packetization

[source: G. Sullivan, VCEG]

H.264/AVC Profiles

• Baseline: core compression capabilities, plus error resilience, e.g., for videoconferencing, mobile video
• Main: high compression and quality, e.g., for broadcasting
• Extended: added features for efficient streaming

[source: G. Sullivan, VCEG]

H.264/AVC Coder

[Figure: block diagram of the H.264/AVC coder. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Scaling & Inv. Transform, Entropy Coding, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Deblocking Filter, Motion Estimation. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Motion Data, Output Video Signal. The decoder is embedded in the coder loop.]

[source: G. Sullivan, VCEG]

Input Video Signal

• Progressive and interlaced frames can be coded as one unit
• Progressive vs. interlace frame is signaled but has no impact on decoding
• Each field can be coded separately
• Dangling fields

[Figure: a progressive frame, and an interlaced frame (top field first) made of a top field and a bottom field separated in time by Δt]

[source: G. Sullivan, VCEG]
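
The relation between an interlaced frame and its two fields can be sketched in a few lines. This is only an illustration, assuming the frame is a list of rows with the top field on the even rows; the function names are hypothetical and not taken from the slides or the standard.

```python
def split_fields(frame):
    """Split a frame (list of rows) into its top and bottom fields."""
    top = frame[0::2]      # rows 0, 2, 4, ... (assumed top field)
    bottom = frame[1::2]   # rows 1, 3, 5, ...
    return top, bottom


def merge_fields(top, bottom):
    """Interleave two equally sized fields back into one frame."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


# Toy 6-row frame in which every sample of row r has the value r.
frame = [[r] * 4 for r in range(6)]
top, bottom = split_fields(frame)
```

Coding the two lists separately corresponds to field coding; coding `frame` as one unit corresponds to frame coding. A dangling field would be a field without its partner, which this sketch does not handle.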

Partitioning of the Picture

• Slices:
  - A picture is split into 1 or several slices
  - Slices are self-contained
  - Slices are a sequence of macroblocks
• Macroblocks:
  - Basic syntax & processing unit
  - Contains 16x16 luma samples and 2 x 8x8 chroma samples
  - Macroblocks within a slice depend on each other
  - Macroblocks can be further partitioned

[Figure: a picture divided into Slice #0, Slice #1, and Slice #2, with macroblocks numbered 0, 1, 2, ... in raster order and Macroblock #40 highlighted]

[source: G. Sullivan, VCEG]
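
As a rough illustration of the macroblock grid, the sketch below counts macroblocks and locates one by its raster-scan address. It assumes 4:2:0 sampling and a picture whose dimensions are rounded up to multiples of 16; the helper names are made up for this example.

```python
MB_SIZE = 16  # luma samples per macroblock side


def macroblock_grid(width, height):
    """Number of macroblocks per row and per column of a picture."""
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide, mbs_high


def macroblock_position(mb_addr, mbs_wide):
    """Top-left luma sample (x, y) of a macroblock in raster-scan order."""
    return (mb_addr % mbs_wide) * MB_SIZE, (mb_addr // mbs_wide) * MB_SIZE


def samples_per_macroblock():
    """16x16 luma samples plus two 8x8 chroma blocks."""
    return 16 * 16 + 2 * 8 * 8


mbs_wide, mbs_high = macroblock_grid(176, 144)   # QCIF picture
x, y = macroblock_position(40, mbs_wide)
```

For a QCIF picture this gives an 11 x 9 grid (99 macroblocks), and macroblock 40 starts at luma position (112, 48).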

Flexible Macroblock Ordering (FMO)

• Slice Group:
  - Pattern of macroblocks defined by a macroblock allocation map
  - A slice group may contain 1 to several slices
• Macroblock allocation map types:
  - Interleaved slices
  - Dispersed macroblock allocation
  - Explicitly assign a slice group to each macroblock location in raster scan order
  - One or more "foreground" slice groups and a "leftover" slice group

[Figure: three example allocation maps, showing pictures divided into Slice Group #0, #1, and #2 in different patterns]

[source: G. Sullivan, VCEG]
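
A dispersed allocation with two slice groups can be pictured as a checkerboard. The sketch below builds such a map and lists the macroblocks of each group. It is an illustrative pattern only, not the map derivation from the standard, and the function names are hypothetical.

```python
def checkerboard_map(mbs_wide, mbs_high):
    """Assign each macroblock to slice group 0 or 1 in a checkerboard."""
    return [[(x + y) % 2 for x in range(mbs_wide)] for y in range(mbs_high)]


def macroblocks_in_group(mb_map, group):
    """Raster-scan addresses of the macroblocks belonging to one slice group."""
    mbs_wide = len(mb_map[0])
    return [
        y * mbs_wide + x
        for y, row in enumerate(mb_map)
        for x, g in enumerate(row)
        if g == group
    ]


mb_map = checkerboard_map(4, 2)
group0 = macroblocks_in_group(mb_map, 0)
group1 = macroblocks_in_group(mb_map, 1)
```

With such a map, losing every slice of one group still leaves each missing macroblock surrounded by received neighbors, which is what makes dispersed allocation attractive for error concealment.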

Interlaced Processing

• Field coding: each field is coded as a separate picture using fields for motion compensation
• Frame coding:
  - Type 1: the complete frame is coded as a separate picture
  - Type 2: the frame is scanned as macroblock pairs; for each macroblock pair, switch between frame and field coding

[Figure: macroblock pair scan order, with vertically stacked pairs numbered (0, 1), (2, 3), (4, 5), ... and a macroblock pair labeled, e.g. (36, 37)]

[source: G. Sullivan, VCEG]

Scanning of a Macroblock

• Coded Block Pattern for Luma in 8x8 block order: signals which of the 8x8 blocks contains at least one 4x4 block with nonzero transform coefficients

      0  1
      2  3

• Intra_16x16 macroblock type only: Luma 4x4 DC block, index -1
• Luma 4x4 block order for 4x4 intra prediction and 4x4 residual coding:

      0  1  4  5
      2  3  6  7
      8  9 12 13
     10 11 14 15

• Chroma 4x4 block order for 4x4 residual coding, shown as 16-25, and intra 4x4 prediction, shown as 18-21 and 22-25:

     2x2 DC:  Cb 16      Cr 17
     AC:      Cb 18 19   Cr 22 23
                 20 21      24 25

[source: G. Sullivan, VCEG]
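
The luma coded block pattern described above can be sketched as a 4-bit mask. The example assumes the 16 luma 4x4 blocks are given in the block order of the slide, so that blocks 4k to 4k+3 form 8x8 block k, and that bit k of the mask stands for 8x8 block k. The function name is hypothetical.

```python
def luma_coded_block_pattern(blocks_4x4):
    """Return a 4-bit mask; bit k is set if 8x8 block k has a nonzero coefficient.

    blocks_4x4: 16 lists of quantized coefficients, in 4x4 block order.
    """
    cbp = 0
    for k in range(4):
        group = blocks_4x4[4 * k:4 * k + 4]   # the four 4x4 blocks of 8x8 block k
        if any(c != 0 for block in group for c in block):
            cbp |= 1 << k
    return cbp


empty = [[0] * 16 for _ in range(16)]

one_block = [list(b) for b in empty]
one_block[6][0] = 3            # 4x4 block 6 lies in 8x8 block 1

corners = [list(b) for b in empty]
corners[0][5] = -1             # 8x8 block 0
corners[15][2] = 7             # 8x8 block 3
```

A cleared bit lets the decoder skip the residual data of the whole 8x8 block.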

H.264/AVC Coder

[Figure: detailed block diagram of the H.264/AVC coder. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Scaling & Inv. Transform, Intra-frame Estimation, Intra-frame Prediction, Intra/Inter MB select, Motion Compensation, Motion Estimation, Deblocking Filter, Entropy Coding. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Intra Prediction Data, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Common Elements with other Standards

• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block-wise motion compensation
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types

[source: G. Sullivan, VCEG]

H.264 Motion Compensation Accuracy

• MB types (macroblock partitions): 16x16, 16x8, 8x16, 8x8
• 8x8 types (sub-macroblock partitions): 8x8, 8x4, 4x8, 4x4
• Motion vector accuracy 1/4 (6-tap filter)

[Figure: coder block diagram with the Motion Compensation block highlighted, overlaid with the partition shapes and their numbering]

[source: G. Sullivan, VCEG]
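
The slide names a 6-tap filter without listing its coefficients. The sketch below assumes the luma half-sample taps (1, -5, 20, 20, -5, 1)/32 of H.264 as standardized, works on a single row of 8-bit samples, and derives a quarter-sample value by averaging with rounding. The function names are hypothetical, and picture-boundary handling is left out.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed half-sample filter taps, sum = 32


def clip_sample(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1]; needs row[i-2 .. i+3]."""
    window = row[i - 2:i + 4]
    acc = sum(t * s for t, s in zip(TAPS, window))
    return clip_sample((acc + 16) >> 5)   # divide by 32 with rounding


def quarter_sample(row, i):
    """Quarter-sample value between row[i] and the half-sample to its right."""
    return (row[i] + half_sample(row, i) + 1) >> 1


step = [10, 10, 10, 10, 50, 50, 50, 50]
h = half_sample(step, 3)
q = quarter_sample(step, 3)
```

On the step edge above, the half-sample position evaluates to 30 and the quarter-sample position to 20. A flat row is reproduced unchanged, since the taps sum to 32.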

H.264 Multiple Reference Frames

• Multiple Reference Frames
• Generalized B Frames
• Weighted Prediction

[Figure: coder block diagram with the Motion Compensation and Motion Estimation blocks highlighted]

[source: G. Sullivan, VCEG]

H.264 Intra Prediction

• Directional spatial prediction (9 types for luma, 1 chroma)
• Neighboring samples (capital letters) and the samples of the 4x4 block to be predicted (lowercase letters):

      Q  A  B  C  D  E  F  G  H
      I  a  b  c  d
      J  e  f  g  h
      K  i  j  k  l
      L  m  n  o  p

• e.g., Mode 3: diagonal down/right prediction: a, f, k, p are predicted by (A + 2Q + I + 2) >> 2

[Figure: coder block diagram with the Intra-frame Prediction block highlighted, and arrows showing the prediction directions labeled with mode numbers 0 to 8]

[source: G. Sullivan, VCEG]
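
The diagonal down/right example can be extended to the whole 4x4 block. The slide only gives the formula for the main diagonal (a, f, k, p); the sketch below assumes that every other diagonal uses the same 3-tap (1, 2, 1) smoothing applied along the neighbor edge L K J I Q A B C D. The function name is hypothetical.

```python
def predict_diagonal_down_right(above, left, corner):
    """Predict a 4x4 block along the down/right diagonal.

    above  = [A, B, C, D]  samples above the block
    left   = [I, J, K, L]  samples left of the block
    corner = Q             sample above-left of the block
    """
    # Neighbors laid out as one edge: L K J I Q A B C D
    edge = left[::-1] + [corner] + above
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            c = 4 + x - y   # edge index that this diagonal points at
            pred[y][x] = (edge[c - 1] + 2 * edge[c] + edge[c + 1] + 2) >> 2
    return pred


pred = predict_diagonal_down_right(above=[8, 8, 8, 8], left=[0, 0, 0, 0], corner=4)
```

For the main diagonal, `c` is 4, so the expression reduces to (I + 2Q + A + 2) >> 2, the formula quoted on the slide.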

H.264 4x4 Transform

• 4x4 Block Integer Transform

\[
H = \begin{bmatrix}
1 & 1 & 1 & 1 \\
2 & 1 & -1 & -2 \\
1 & -1 & -1 & 1 \\
1 & -2 & 2 & -1
\end{bmatrix}
\]

• Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks

[Figure: coder block diagram with the Transform/Scal./Quant. block highlighted]

[source: G. Sullivan, VCEG]
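
A minimal sketch of the forward transform, assuming it is applied separably as Y = H X H^T to a 4x4 residual block. Scaling and quantization are left out, and the helper names are hypothetical.

```python
H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Apply the integer transform to rows and columns: H * block * H^T."""
    return matmul(matmul(H, block), transpose(H))


flat_block = [[1] * 4 for _ in range(4)]
coeffs = forward_transform(flat_block)
gram = matmul(H, transpose(H))
```

The rows of H are orthogonal but not of equal norm (H H^T is diagonal with entries 4, 10, 4, 10), which is why the normalization has to be folded into the scaling and quantization stage. Because only integer additions and multiplications are involved, encoder and decoder compute exactly the same values.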

Quantization of Transform Coefficients

• Scalar quantization
• Logarithmic step size control
• Smaller step size for chroma (per H.263 Annex T)
• Extended range of step sizes
• Can change to any step size at macroblock level
• Quantization reconstruction is one multiply, one add, one shift

[source: G. Sullivan, VCEG]
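
Logarithmic step size control means that the step size grows geometrically with the quantization parameter. The sketch below assumes the behavior of H.264 as standardized, where the step size doubles for every increase of 6 in QP and is 0.625 at QP 0. It is a smooth approximation, not the exact table of the standard, and the function name is hypothetical.

```python
def approx_step_size(qp):
    """Approximate quantizer step size for a given QP (assumed model)."""
    return 0.625 * 2 ** (qp / 6)


ratio_plus_6 = approx_step_size(30) / approx_step_size(24)
ratio_plus_2 = approx_step_size(26) / approx_step_size(24)
```

Under this model an increase of 2 in QP enlarges the step size by about 26%, which is in line with the "+25%" quoted for QPB = QPP + 2 in the comparison later in this deck.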

Deblocking Filter

• Improves subjective quality and PSNR of the decoded picture
• Significantly superior to post filtering
• Filtering affects the edges of the 4x4 block structure
• Adaptive filtering removes blocking artifacts, but does not unnecessarily blur the visual content
  - On slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence
  - On edge level, filtering strength is made dependent on inter/intra, motion, and coded residuals
  - On sample level, quantizer dependent thresholds can turn off filtering for every individual sample
  - Specially strong filter for macroblocks with very flat characteristics almost removes "tiling artifacts"

[source: G. Sullivan, VCEG]

Deblocking Filter

[Figure: one-dimensional visualization of an edge position, showing samples p2, p1, p0 on one side of a 4x4 block edge and q0, q1, q2 on the other]

Filtering of p0 and q0 only takes place if:

1. |p0 - q0| < α(QP)
2. |p1 - p0| < β(QP)
3. |q1 - q0| < β(QP)

where β(QP) is considerably smaller than α(QP).

Filtering of p1 or q1 takes place if additionally:

1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)

(QP = quantization parameter)

[source: G. Sullivan, VCEG]
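
The sample-level decisions above translate directly into a few comparisons. In this sketch the thresholds `alpha` and `beta` are passed in as plain numbers, since the QP-dependent tables are not given on the slide. It assumes that the p2 condition governs p1 and the q2 condition governs q1. The function names are hypothetical.

```python
def filter_p0_q0(p, q, alpha, beta):
    """p = [p0, p1, p2], q = [q0, q1, q2]: samples on each side of the edge."""
    return (abs(p[0] - q[0]) < alpha
            and abs(p[1] - p[0]) < beta
            and abs(q[1] - q[0]) < beta)


def filter_p1(p, q, alpha, beta):
    """p1 is filtered only if p0/q0 are filtered and the p side is smooth."""
    return filter_p0_q0(p, q, alpha, beta) and abs(p[2] - p[0]) < beta


def filter_q1(p, q, alpha, beta):
    """q1 is filtered only if p0/q0 are filtered and the q side is smooth."""
    return filter_p0_q0(p, q, alpha, beta) and abs(q[2] - q[0]) < beta


blocking_step = filter_p0_q0([60, 61, 62], [64, 65, 66], alpha=10, beta=4)
real_edge = filter_p0_q0([20, 20, 20], [200, 200, 200], alpha=10, beta=4)
```

A small step across the edge with smooth sides is treated as a blocking artifact and filtered, while a large step is assumed to be a real image edge and left untouched.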

Deblocking: Subjective Result for Intra

Highly compressed first decoded intra picture at 0.28 bit/sample

[Figure: decoded picture without filter, and with H.264/AVC deblocking]

[source: G. Sullivan, VCEG]

Deblocking: Subjective Result for Inter

Highly compressed decoded inter picture

[Figure: decoded picture without filter, and with H.264/AVC deblocking]

[source: G. Sullivan, VCEG]

Entropy coding

[Figure: coder block diagram with the Entropy Coding block highlighted]

[source: G. Sullivan, VCEG]

Variable length coding

• Exp-Golomb code for almost all symbols except for transform coefficients
• Context adaptive VLCs for coding of transform coefficients
  - Number of coefficients is decoded
  - Special treatment of values +1 and -1
  - Contexts are built dependent on transform coefficients

[source: G. Sullivan, VCEG]
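
The Exp-Golomb code mentioned above is easy to write down. The sketch below encodes and decodes unsigned code numbers, with bit strings represented as Python strings of "0" and "1" for readability, and includes the usual mapping of signed values to code numbers. The function names are chosen for this example.

```python
def ue_encode(code_num):
    """Unsigned Exp-Golomb: leading zeros, then the binary form of code_num + 1."""
    value = code_num + 1
    prefix_len = value.bit_length() - 1
    return "0" * prefix_len + format(value, "b")


def ue_decode(bits, pos=0):
    """Decode one codeword starting at pos; return (code_num, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    value = int(bits[pos + zeros:end], 2)
    return value - 1, end


def se_to_code_num(v):
    """Map a signed value to a code number: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * v - 1 if v > 0 else -2 * v


codewords = [ue_encode(k) for k in range(4)]
stream = ue_encode(3) + ue_encode(0)
first, pos = ue_decode(stream)
second, pos = ue_decode(stream, pos)
```

The first four codewords are 1, 010, 011, and 00100. Small code numbers get short codewords, and the decoder needs no table because the number of leading zeros gives the codeword length.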

Context-Adaptive Arithmetic Coding (CABAC)

• Context modeling: chooses a model conditioned on past observations
• Binarization: maps non-binary symbols to a binary sequence
• Adaptive binary arithmetic coder (probability estimation and coding engine): uses the provided model for the actual encoding and updates the model

[Figure: processing chain Context modeling, Binarization, Probability estimation, Coding engine, with a feedback path labeled "update probability estimation"]

[source: G. Sullivan, VCEG]
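
The interplay of binarization, context selection, and model update can be illustrated without a real arithmetic coder by adding up the ideal code length, -log2(p), of each bin. Everything in this sketch is a simplification made for illustration: unary binarization, a context chosen by bin position, and a counting-based probability estimate. CABAC itself uses its own binarization schemes, context selection rules, and a table-driven probability state machine.

```python
import math


def unary_binarize(value):
    """Map a non-negative integer to a run of 1s terminated by a 0."""
    return [1] * value + [0]


class AdaptiveBinModel:
    """Counting-based estimate of the probability of a bin being 0 or 1."""

    def __init__(self):
        self.counts = [1, 1]

    def probability(self, bin_value):
        return self.counts[bin_value] / sum(self.counts)

    def update(self, bin_value):
        self.counts[bin_value] += 1


def ideal_cost_bits(symbols):
    """Sum of -log2(p) over all bins, adapting one model per context."""
    models = {}
    total = 0.0
    for symbol in symbols:
        for index, bin_value in enumerate(unary_binarize(symbol)):
            context = min(index, 2)            # toy context: bin position
            model = models.setdefault(context, AdaptiveBinModel())
            total += -math.log2(model.probability(bin_value))
            model.update(bin_value)
    return total


cost = ideal_cost_bits([0] * 50)
```

Fifty zero-valued symbols produce fifty bins, yet the adaptive model brings the ideal cost down to log2(51), roughly 5.7 bits. A fixed one-bit-per-bin code cannot achieve this, and it is the reason for feeding the updated estimate back into the coding engine.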

S Pictures

• General description
  - Allows identical reconstruction of frames even when different reference frames are being used
  - SP pictures make use of motion-compensated prediction
  - SI pictures can exactly reproduce SP pictures
• Applications
  - Bitstream switching or splicing
  - Random access
  - Fast-forward, fast-backward
  - Error recovery and/or resiliency
  - Resynchronization such as in Video Redundancy Coding

[source: G. Sullivan, VCEG]

SP and SI Pictures

[Figure: block diagram of SP/SI picture decoding. Blocks: Entropy Decoding, Scaling, Transform, Quantization, Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Motion Estimation. Signals: Quant. Transf. coeffs, l rec, l pred, Control Data, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Comparison of H.264 to MPEG-4

• MPEG-4: Advanced Simple Profile (ASP)
  - Motion Compensation: 1/4 pel
  - Global Motion Compensation
• H.264:
  - Motion Compensation: 1/4 pel
  - Using CABAC entropy coding
  - 5 reference frames (News: 17)
• Both
  - Sequence structure IBBPBBP...
  - QPB=QPP+2 (step size: +25%)
  - Search range: 32x32 around 16x16 predictor
  - Lagrangian D+λR coder control

[source: ITU-T VCEG]
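
Lagrangian coder control picks, among the available coding options, the one with the smallest cost J = D + λR. The sketch below shows the decision for one macroblock. The candidate modes with their distortion and rate figures are invented for illustration and do not come from the test conditions above.

```python
def choose_mode(candidates, lam):
    """Return the candidate minimizing J = D + lam * R.

    candidates: list of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical candidates for a single macroblock.
candidates = [
    ("skip", 900, 1),
    ("inter16x16", 400, 30),
    ("inter8x8", 250, 90),
]

low_lambda = choose_mode(candidates, lam=1)[0]
mid_lambda = choose_mode(candidates, lam=5)[0]
high_lambda = choose_mode(candidates, lam=50)[0]
```

A small λ favors low distortion (here the 8x8 partitioning), while a large λ favors low rate (here the skip mode). The encoder typically ties λ to the quantization parameter so that all decisions trade rate against distortion at a consistent slope.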

RD Curves: Foreman (QCIF, 10Hz)

[Figure: average PSNR(Y) in dB (26 to 39) versus bit-rate in kbit/s (0 to 128) for MPEG-4 and H.26L; the gap between the curves is annotated ">30%"]

[source: ITU-T VCEG]


Performance: Streaming Application

Average bit-rate savings relative to:

Coder          MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC MP   37.44%       47.58%      63.57%
MPEG-4 ASP     -            16.65%      42.95%
H.263 HLP      -            -           30.61%

[Wiegand, et al. 2003]

Example Streaming Test Result

Tempete CIF 15Hz

[Figure: Y-PSNR in dB (24 to 38) versus bit-rate in kbit/s (0 to 1792) for MPEG-2, H.263 HLP, MPEG-4 ASP, and H.264/AVC MP, with test points marked]

[Wiegand, et al. 2003]

Example Streaming Test Result

Tempete CIF 15Hz

[Figure: rate saving relative to MPEG-2 (0% to 80%) versus Y-PSNR in dB (26 to 38) for H.264/AVC MP, MPEG-4 ASP, and H.263 HLP]

[Wiegand, et al. 2003]

Test Results for Real-Time Conversation

Average bit-rate savings relative to:

Coder          H.263 CHC   MPEG-4 SP   H.263 Base
H.264/AVC BP   27.69%      29.37%      40.59%
H.263 CHC      -           2.04%       17.63%
MPEG-4 SP      -           -           15.69%

[Wiegand, et al. 2003]

Example Real-Time Conversation Result

Paris CIF 15Hz

[Figure: Y-PSNR in dB (24 to 39) versus bit-rate in kbit/s (0 to 768) for H.263-Base, H.263 CHC, MPEG-4 SP, and H.264/AVC BP, with test points marked]

[Wiegand, et al. 2003]

Example Real-Time Test Result

Paris CIF 15Hz

[Figure: rate saving relative to H.263-Baseline (0% to 50%) versus Y-PSNR in dB (24 to 38) for H.264/AVC BP, H.263 CHC, and MPEG-4 SP]

[Wiegand, et al. 2003]

Test Results: Entertainment-Quality Applications

Average bit-rate savings relative to:

Coder          MPEG-2
H.264/AVC MP   45%

[Wiegand, et al. 2003]

Example Entertainment-Quality Applications Result

Entertainment SD (720x576i) 25Hz

[Figure: Y-PSNR in dB (24 to 39) versus bit-rate in Mbit/s (0 to 10) for MPEG-2 and H.264/AVC MP]

[Wiegand, et al. 2003]

Example Entertainment-Quality Applications Result

Entertainment SD (720x576i) 25Hz

[Figure: rate saving relative to MPEG-2 (0% to 60%) versus Y-PSNR in dB (26 to 38) for H.264/AVC MP]

[Wiegand, et al. 2003]

Further reading

IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.