Overview: Video Coding Standards
• Video coding standards: applications and common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.264/AVC
Bernd Girod: EE398B Image Communication II
Video Coding Standards: H.264/AVC no. 1
The JVT Project
• ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group) formed for ITU-T standardization activity for video compression since 1997
• August 1999: 1st test model (TML-1) of H.26L
• December 2001: Formation of the Joint Video Team (JVT) between VCEG and ISO/IEC JTC 1/SC 29/WG 11 (MPEG) to establish a joint standard project - H.264 / MPEG4-AVC
• ITU-T Approval: May 2003
• ISO/IEC Approval: October 2003
[source: G. Sullivan, VCEG]
JVT Goals
• Improved coding efficiency
  - Average bit rate reduction of 50% given fixed fidelity compared to any other standard
  - Trade-off complexity vs. coding efficiency
• Improved network friendliness
  - Anticipate error-prone transport over mobile networks and the wired and wireless Internet
  - Further improve robustness techniques in H.263 and MPEG-4
• Simple syntax specification
  - Avoid excessive quantity of optional features
  - Minimize number of “profiles” for distinct application areas
[source: G. Sullivan, VCEG]
H.264/JVT Applications
• Entertainment Video
  - Broadcast: Terrestrial / Satellite / Cable . . .
  - Storage: DVD / HD-DVD / PVR . . .
• Conversational Services
  - H.320 Conversational
  - 3GPP Conversational H.324/M
  - H.323 Conversational Internet/best effort IP/RTP
  - 3GPP Conversational IP/RTP/SIP
• Video Streaming
  - 3GPP Streaming IP/RTP/RTSP
  - Streaming IP/RTP/RTSP (without TCP fallback)
• Other Applications
  - 3GPP Multimedia Messaging Services
  - Digital camcorder
[source: G. Sullivan, VCEG]
Relationship to Other Standards
• Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
• In ITU-T / VCEG this is a new & separate standard
  - ITU-T Recommendation H.264
  - ITU-T Systems (H.32x) will be modified to support it
• In ISO/IEC / MPEG this is a new “part” in the MPEG-4 suite
  - Separate codec design from prior MPEG-4 visual
  - New Part 10 called “Advanced Video Coding” (AVC, similar to “AAC” in MPEG-2 as separate audio codec)
  - MPEG-4 Systems / File Format has been modified to support it
  - H.222.0 | MPEG-2 Systems also modified to support it
• IETF: RTP payload packetization
[source: G. Sullivan, VCEG]
H.264/AVC Profiles
• Baseline: core compression capabilities, plus error resilience, e.g., for videoconferencing, mobile video
• Main: high compression and quality, e.g., for broadcasting
• Extended: added features for efficient streaming
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Figure: block diagram of the H.264/AVC coder. Blocks: Split into Macroblocks (16x16 pixels), Coder Control, Transform/Scal./Quant., Decoder (Scaling & Inv. Transform, Deblocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection), Motion Estimation, Entropy Coding. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Motion Data, Output Video Signal.]
[source: G. Sullivan, VCEG]
Input Video Signal
[Figure: a Progressive Frame, and an Interlaced Frame (Top Field First) made up of a Top Field and a Bottom Field separated in time by Δt]
• Progressive and interlaced frames can be coded as one unit
• Progressive vs. interlace frame is signaled but has no impact on decoding
• Each field can be coded separately
• Dangling fields
[source: G. Sullivan, VCEG]
Partitioning of the Picture
Slices:
• A picture is split into 1 or several slices
• Slices are self-contained
• Slices are a sequence of macroblocks
Macroblocks:
• Basic syntax & processing unit
• Contains 16x16 luma samples and 2 x 8x8 chroma samples
• Macroblocks within a slice depend on each other
• Macroblocks can be further partitioned
[Figure: a picture divided into Slice #0, Slice #1, and Slice #2; macroblocks are numbered 0, 1, 2, … and Macroblock #40 is marked]
[source: G. Sullivan, VCEG]
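The macroblock numbering on the slide can be made concrete with a small sketch. It assumes raster-scan numbering (left to right, top to bottom) and, purely as an example, a QCIF picture of 176x144 luma samples; the function names are illustrative and not taken from the standard.

```python
MB_SIZE = 16  # macroblock width and height in luma samples


def mb_grid(width, height):
    """Return (columns, rows) of macroblocks for a picture of the given luma size."""
    # Round up so that partial macroblocks at the right/bottom edge are counted.
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def mb_position(addr, cols):
    """Map a raster-scan macroblock address to (column, row)."""
    return addr % cols, addr // cols


cols, rows = mb_grid(176, 144)   # QCIF, an assumed picture size
print(cols, rows, cols * rows)   # 11 9 99
print(mb_position(40, cols))     # (7, 3)
```

Under these assumptions, Macroblock #40 sits in the fourth macroblock row, eighth column.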
Flexible Macroblock Ordering (FMO)
Slice Group:
• Pattern of macroblocks defined by a Macroblock allocation map
• A slice group may contain 1 to several slices
Macroblock allocation map types:
• Interleaved slices
• Dispersed macroblock allocation
• Explicitly assign a slice group to each macroblock location in raster scan order
• One or more “foreground” slice groups and a “leftover” slice group
[Figure: three example pictures partitioned into Slice Group #0, #1, and #2; into Slice Group #0 and #1; and into Slice Group #0, #1, and #2]
[source: G. Sullivan, VCEG]
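A macroblock allocation map can be pictured as a 2-D array holding one slice-group number per macroblock. The sketch below builds two simplified maps: an interleaved one in which whole macroblock rows rotate through the groups, and a dispersed one that yields a checkerboard for two groups. Both patterns and all function names are assumptions chosen for illustration; the exact map syntax is defined in the standard and is not reproduced here.

```python
def interleaved_map(cols, rows, num_groups):
    """Simplified interleaved map: whole macroblock rows rotate through the groups."""
    return [[y % num_groups for _ in range(cols)] for y in range(rows)]


def dispersed_map(cols, rows, num_groups):
    """Simplified dispersed map; for two groups this gives a checkerboard."""
    return [[(x + (y * num_groups) // 2) % num_groups for x in range(cols)]
            for y in range(rows)]


def slice_group_members(mb_map, group):
    """Raster-scan addresses of the macroblocks that belong to one slice group."""
    cols = len(mb_map[0])
    return [y * cols + x
            for y, row in enumerate(mb_map)
            for x, g in enumerate(row) if g == group]


for row in dispersed_map(4, 2, 2):
    print(row)
print(slice_group_members(dispersed_map(4, 2, 2), 0))
```

With a dispersed map, losing one slice group still leaves every lost macroblock surrounded by received neighbours, which is what makes this layout attractive for error concealment.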
Interlaced Processing
Field coding: each field is coded as a separate picture using fields for motion compensation
Frame coding:
• Type 1: the complete frame is coded as a separate picture
• Type 2: the frame is scanned as macroblock pairs, for each macroblock pair: switch between frame and field coding
[Figure: macroblock pair scan; the top macroblocks of the pairs are numbered 0, 2, 4, … and the bottom macroblocks 1, 3, 5, …; a later pair is numbered 36, 37]
[source: G. Sullivan, VCEG]
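The macroblock-pair numbering of Type 2 frame coding (0, 2, 4, … on top, 1, 3, 5, … underneath) can be sketched as an address-to-position mapping. The function name and the picture width of 4 macroblocks are hypothetical; the sketch only assumes that pairs are visited in raster order and that the top macroblock of a pair precedes the bottom one.

```python
def pair_scan_position(addr, cols):
    """Map a macroblock address in pair-scan order to (column, row) in macroblocks."""
    pair = addr // 2        # index of the macroblock pair in raster order
    is_bottom = addr % 2    # 0 = top macroblock of the pair, 1 = bottom macroblock
    col = pair % cols
    row = 2 * (pair // cols) + is_bottom
    return col, row


# Hypothetical picture that is 4 macroblocks wide.
for addr in range(10):
    print(addr, pair_scan_position(addr, 4))
```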
Scanning of a Macroblock
[Figure: block numbering within a macroblock]
• Intra_16x16 macroblock type only: Luma 4x4 DC, shown as block -1
• Luma 4x4 block order for 4x4 intra prediction and 4x4 residual coding:
     0  1  4  5
     2  3  6  7
     8  9 12 13
    10 11 14 15
• Coded Block Pattern for Luma in 8x8 block order (0 1 / 2 3): signals which of the 8x8 blocks contains at least one 4x4 block with nonzero transform coefficients
• Chroma 4x4 block order for 4x4 residual coding, shown as 16-25, and intra 4x4 prediction, shown as 18-21 and 22-25
  - 2x2 DC: block 16 (Cb) and block 17 (Cr)
  - AC: blocks 18-21 (Cb) and blocks 22-25 (Cr)
[source: G. Sullivan, VCEG]
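The luma 4x4 block order follows a nested pattern: four 4x4 blocks inside each 8x8 block, and four 8x8 blocks inside the macroblock. A minimal sketch of that mapping, and of how a luma coded block pattern could be derived from it, is given below. The function names are illustrative, and assigning bit b of the pattern to 8x8 block b is an assumption made here for the example.

```python
def luma_block_position(idx):
    """(x, y) of luma 4x4 block idx, measured in units of 4x4 blocks."""
    x = (idx & 1) + 2 * ((idx >> 2) & 1)
    y = ((idx >> 1) & 1) + 2 * ((idx >> 3) & 1)
    return x, y


def luma_cbp(nonzero_flags):
    """4-bit pattern: bit b is set if 8x8 block b holds any nonzero 4x4 block.

    nonzero_flags: 16 booleans, one per 4x4 block in coding order.
    """
    cbp = 0
    for idx, flag in enumerate(nonzero_flags):
        if flag:
            cbp |= 1 << (idx // 4)   # blocks 4b .. 4b+3 form 8x8 block b
    return cbp


# Rebuild the 4x4 numbering grid from the mapping.
grid = [[0] * 4 for _ in range(4)]
for idx in range(16):
    x, y = luma_block_position(idx)
    grid[y][x] = idx
for row in grid:
    print(row)
```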
H.264/AVC Coder
[Figure: detailed block diagram of the H.264/AVC coder. Blocks: Split into Macroblocks (16x16 pixels), Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, Intra-frame Estimation, Intra-frame Prediction, Motion Compensation, Intra/Inter MB select, Deblocking Filter, Motion Estimation, Entropy Coding. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Intra Prediction Data, Motion Data, Output Video Signal.]
[source: G. Sullivan, VCEG]
Common Elements with other Standards
• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: Association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block-wise motion compensation
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types
[source: G. Sullivan, VCEG]
H.264 Motion Compensation Accuracy
[Figure: H.264/AVC coder block diagram, annotated with the macroblock partitions listed below]
• MB Types: 16x16 (1 partition), 16x8 (2), 8x16 (2), 8x8 (4)
• 8x8 Types: 8x8 (1 partition), 8x4 (2), 4x8 (2), 4x4 (4)
• Motion vector accuracy 1/4 (6-tap filter)
[source: G. Sullivan, VCEG]
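Quarter-sample motion vectors require interpolated reference samples. The sketch below shows the idea in one dimension, assuming the luma filter taps (1, -5, 20, 20, -5, 1)/32 of the final standard for half-sample positions and a rounded average for quarter-sample positions. Repeating edge samples outside the row is a simplification chosen here, and the function names are illustrative.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # 6-tap half-sample filter, taps sum to 32


def clip8(v):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, v))


def half_sample_row(row):
    """Half-sample values between row[i] and row[i + 1] for a 1-D row of samples."""
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(TAPS):
            j = min(max(i - 2 + k, 0), n - 1)   # repeat edge samples outside the row
            acc += tap * row[j]
        out.append(clip8((acc + 16) >> 5))       # round, then divide by 32
    return out


def quarter_sample(a, b):
    """Quarter-sample value as the rounded average of two neighbouring values."""
    return (a + b + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
halves = half_sample_row(ramp)
print(halves[3], quarter_sample(ramp[3], halves[3]))
```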
H.264 Multiple Reference Frames
[Figure: H.264/AVC coder block diagram, annotated at the Motion Compensation and Motion Estimation blocks with the features listed below]
• Multiple Reference Frames
• Generalized B Frames
• Weighted Prediction
[source: G. Sullivan, VCEG]
H.264 Intra Prediction
[Figure: H.264/AVC coder block diagram, annotated at the Intra-frame Prediction block]
• Directional spatial prediction (9 types for luma, 1 chroma)
Neighboring samples (upper case) and samples of the 4x4 block to be predicted (lower case):
    Q A B C D E F G H
    I a b c d
    J e f g h
    K i j k l
    L m n o p
[Figure: arrows showing the prediction directions, labeled with mode numbers 0 to 8]
• e.g., Mode 3: diagonal down/right prediction; a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
[source: G. Sullivan, VCEG]
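The diagonal down/right example generalizes to the whole 4x4 block: each predicted sample is a (1, 2, 1)/4 filtered value of the three edge samples lying on its diagonal. The sketch below assumes the sample naming of the slide (Q for the corner, A to D above, I to L to the left); the function name and the flattened edge array are illustrative choices, not the standard's notation.

```python
def predict_diagonal_down_right(above, left, corner):
    """4x4 diagonal down/right prediction.

    above = [A, B, C, D], left = [I, J, K, L], corner = Q.
    Returns pred[y][x] for the samples a..p in row-major order.
    """
    # Edge samples laid out along the prediction direction: L K J I Q A B C D
    ref = left[::-1] + [corner] + above
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            c = 4 + x - y   # index of the centre tap; c == 4 (Q) on the main diagonal
            pred[y][x] = (ref[c - 1] + 2 * ref[c] + ref[c + 1] + 2) >> 2
    return pred


pred = predict_diagonal_down_right([10, 20, 30, 40], [50, 60, 70, 80], 0)
for row in pred:
    print(row)
```

On the main diagonal (a, f, k, p) the centre tap is Q, so the expression reduces to (A + 2Q + I + 2) >> 2, as on the slide.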
H.264 4x4 Transform
[Figure: H.264/AVC coder block diagram, annotated at the Transform/Scal./Quant. block]
• 4x4 Block Integer Transform

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$$

• Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks
[source: G. Sullivan, VCEG]
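As a rough illustration of the integer transform, the sketch below applies H to a 4x4 block as Y = H X Hᵀ using integer arithmetic only. The helper names are illustrative. The rows of H are orthogonal but do not have equal norm, so a real codec folds the remaining scaling into quantization; that step is not shown here.

```python
H = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(r) for r in zip(*m)]


def forward_transform(block):
    """Y = H X H^T, computed exactly in integers (scaling left to quantization)."""
    return matmul(matmul(H, block), transpose(H))


flat = [[3] * 4 for _ in range(4)]
print(forward_transform(flat)[0][0])   # 48: a flat block has only a DC coefficient
print(matmul(H, transpose(H)))         # diagonal matrix: rows are orthogonal
```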
Quantization of Transform Coefficients
• Scalar quantization
• Logarithmic step size control
• Smaller step size for chroma (per H.263 Annex T)
• Extended range of step sizes
• Can change to any step size at macroblock level
• Quantization reconstruction is one multiply, one add, one shift
[source: G. Sullivan, VCEG]
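Logarithmic step size control means that the quantizer step size grows exponentially with the quantization parameter. The sketch below assumes the behaviour of the final H.264 design, in which the step size approximately doubles for every increase of 6 in QP; the base value of 0.625 at QP 0 and the function name are assumptions for illustration.

```python
def qstep(qp, base=0.625):
    """Approximate step size for quantization parameter qp.

    Assumes the step size doubles every 6 QP steps; base is the assumed
    step size at qp == 0.
    """
    return base * 2.0 ** (qp / 6.0)


for qp in (0, 6, 12, 26, 28):
    print(qp, round(qstep(qp), 3))

# Relative step-size increase for a QP offset of +2
print(round(qstep(28) / qstep(26) - 1.0, 3))
```

Under this assumption a QP offset of +2 raises the step size by roughly a quarter.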
Deblocking Filter
• Improves subjective quality and PSNR of the decoded picture
• Significantly superior to post filtering
• Filtering affects the edges of the 4x4 block structure
• Adaptive filtering removes blocking artifacts, but does not unnecessarily blur the visual content
  - On slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence
  - On edge level, filtering strength is made dependent on inter/intra, motion, and coded residuals
  - On sample level, quantizer dependent thresholds can turn off filtering for every individual sample
  - Specially strong filter for macroblocks with very flat characteristics almost removes “tiling artifacts”
[source: G. Sullivan, VCEG]
Deblocking Filter
[Figure: one-dimensional visualization of an edge position, with samples p2, p1, p0 on one side of the 4x4 block edge and q0, q1, q2 on the other]
Filtering of p0 and q0 only takes place if:
1. |p0 - q0| < α(QP)
2. |p1 - p0| < β(QP)
3. |q1 - q0| < β(QP)
Where β(QP) is considerably smaller than α(QP)
Filtering of p1 or q1 takes place if additionally:
1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)
(QP = quantization parameter)
[source: G. Sullivan, VCEG]
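The sample-level conditions above can be written as a small decision function. This is a sketch only: the thresholds α(QP) and β(QP) are passed in by the caller because their tables are not given here, the function name is hypothetical, and reading the additional condition as "p1 depends on |p2 - p0|, q1 depends on |q2 - q0|" is an interpretation of the slide.

```python
def filter_decisions(p, q, alpha, beta):
    """Decide which samples across a block edge are filtered.

    p = [p0, p1, p2] and q = [q0, q1, q2] are the samples on either side.
    alpha, beta: thresholds for the current QP, supplied by the caller.
    Returns (filter_p0_q0, filter_p1, filter_q1).
    """
    edge = (abs(p[0] - q[0]) < alpha and
            abs(p[1] - p[0]) < beta and
            abs(q[1] - q[0]) < beta)
    filter_p1 = edge and abs(p[2] - p[0]) < beta
    filter_q1 = edge and abs(q[2] - q[0]) < beta
    return edge, filter_p1, filter_q1


# Small step across the edge: likely a blocking artifact, so it is filtered.
print(filter_decisions([100, 101, 102], [104, 104, 105], alpha=10, beta=3))
# Large step across the edge: likely real image content, so it is left alone.
print(filter_decisions([50, 50, 50], [120, 120, 120], alpha=10, beta=3))
```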
Deblocking: Subjective Result for Intra
Highly compressed first decoded intra picture at 0.28 bit/sample
[Figure: the same decoded picture shown Without Filter and With H.264/AVC Deblocking]
[source: G. Sullivan, VCEG]
Deblocking: Subjective Result for Inter
Highly compressed decoded inter picture
[Figure: the same decoded picture shown Without Filter and With H.264/AVC Deblocking]
[source: G. Sullivan, VCEG]
Entropy coding
[Figure: H.264/AVC coder block diagram, annotated at the Entropy Coding block]
[source: G. Sullivan, VCEG]
Variable length coding
• Exp-Golomb code for almost all symbols except for transform coefficients
• Context adaptive VLCs for coding of transform coefficients
  - Number of coefficients is decoded
  - Special treatment of values +1 and -1
  - Contexts are built dependent on transform coefficients
[source: G. Sullivan, VCEG]
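An Exp-Golomb code word consists of a run of leading zeros followed by the binary representation of the value plus one, so short code words go to small values. The sketch below encodes and decodes unsigned values as bit strings and shows one common mapping of signed values onto unsigned code numbers; the function names are illustrative, and bit strings are used instead of a real bit writer to keep the example short.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb code word for value >= 0, as a string of bits."""
    code = bin(value + 1)[2:]              # binary representation of value + 1
    return "0" * (len(code) - 1) + code    # prefix with len(code) - 1 zeros


def ue_decode(bits, pos=0):
    """Decode one code word starting at pos; returns (value, next_pos)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


def se_to_ue(v):
    """Map signed values 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ..."""
    return 2 * v - 1 if v > 0 else -2 * v


for v in range(5):
    print(v, ue_encode(v))
```

Because the prefix length announces the length of the suffix, code words can be concatenated and decoded without separators.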
Context-Adaptive Arithmetic Coding (CABAC)
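[Figure: CABAC processing chain: Context modeling, Binarization, Probability estimation, Coding engine, with a feedback path labeled "update probability estimation"; probability estimation and coding engine together form the adaptive binary arithmetic coder]
• Context modeling: chooses a model conditioned on past observations
• Binarization: maps non-binary symbols to a binary sequence
• Adaptive binary arithmetic coder (probability estimation and coding engine): uses the provided model for the actual encoding and updates the model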
[source: G. Sullivan, VCEG]
S Pictures
• General description
  - Allows identical reconstruction of frames even when different reference frames are being used
  - SP pictures make use of motion-compensated prediction
  - SI pictures can exactly reproduce SP pictures
• Applications
  - Bitstream switching or splicing
  - Random access
  - Fast-forward, fast-backward
  - Error recovery and/or resiliency
  - Resynchronization such as in Video Redundancy Coding
[source: G. Sullivan, VCEG]
SP and SI Pictures
[Figure: block diagram of SP/SI picture decoding. Blocks: Entropy Decoding, Scaling, Transform, Quantization, Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Motion Estimation. Signals: Quant. Transf. coeffs, Control Data, Motion Data, l pred, l rec, Output Video Signal.]
[source: G. Sullivan, VCEG]
Comparison of H.264 to MPEG-4
• MPEG-4: Advanced Simple Profile (ASP)
  - Motion Compensation: 1/4 pel
  - Global Motion Compensation
• H.264:
  - Motion Compensation: 1/4 pel
  - Using CABAC entropy coding
  - 5 reference frames (News: 17)
• Both
  - Sequence structure IBBPBBP...
  - QPB=QPP+2 (step size: +25%)
  - Search range: 32x32 around 16x16 predictor
  - Lagrangian D+λR coder control
[source: ITU-T VCEG]
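Lagrangian coder control selects, among the candidate coding options, the one that minimizes the cost J = D + λR, where D is the distortion and R the rate. A minimal sketch of such a mode decision follows; the function name, the mode names, and the distortion and rate figures are hypothetical and only illustrate how λ shifts the choice.

```python
def best_mode(candidates, lam):
    """Pick the candidate with the lowest Lagrangian cost D + lam * R.

    candidates: iterable of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical distortion/rate figures for one macroblock.
modes = [
    ("skip",       900.0,   1),
    ("inter16x16", 400.0,  30),
    ("inter8x8",   250.0,  90),
    ("intra4x4",   200.0, 160),
]

for lam in (1.0, 10.0, 100.0):
    print(lam, best_mode(modes, lam)[0])
```

A small λ favours low distortion at the expense of bits, while a large λ pushes the decision toward cheap modes.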
RD Curves: Foreman (QCIF, 10Hz)
[Figure: Average PSNR(Y) [dB] (26 to 39) versus Bit-rate [kbit/s] (0 to 128) for MPEG-4 and H.26L; the gap between the curves is annotated ">30%"]
[source: ITU-T VCEG]
Performance Streaming Application
Average bit-rate savings relative to:

Coder          MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC MP   37.44%       47.58%      63.57%
MPEG-4 ASP     -            16.65%      42.95%
H.263 HLP      -            -           30.61%
[Wiegand, et al. 2003]
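A bit-rate saving of s relative to a reference coder means that the tested coder needs a fraction (1 - s) of the reference rate at equal quality. The sketch below uses that relation to derive the saving of one coder over another from their savings relative to a common reference; the function name is illustrative. Applied to the MPEG-2 column of the table, it gives a value close to, but not identical with, the directly reported 37.44%, presumably because the reported figures are averages and averages of ratios do not compose exactly.

```python
def implied_saving(saving_a_vs_ref, saving_b_vs_ref):
    """Saving of coder A relative to coder B, from each one's saving vs. a common reference."""
    rate_a = 1.0 - saving_a_vs_ref   # rate of A as a fraction of the reference rate
    rate_b = 1.0 - saving_b_vs_ref   # rate of B as a fraction of the reference rate
    return 1.0 - rate_a / rate_b


# H.264/AVC MP vs. MPEG-4 ASP, derived from their savings relative to MPEG-2.
s = implied_saving(0.6357, 0.4295)
print(f"{100 * s:.1f}%")
```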
Example Streaming Test Result
[Figure: Tempete CIF 15Hz; Y-PSNR [dB] (24 to 38) versus Bit-rate [kbit/s] (0 to 1792) for MPEG-2, H.263 HLP, MPEG-4 ASP, and H.264/AVC MP, with test points marked]
[Wiegand, et al. 2003]
Example Streaming Test Result
[Figure: Tempete CIF 15Hz; Rate saving relative to MPEG-2 (0% to 80%) versus Y-PSNR [dB] (26 to 38) for H.264/AVC MP, MPEG-4 ASP, and H.263 HLP]
[Wiegand, et al. 2003]
Test Results for Real-Time Conversation
Average bit-rate savings relative to:

Coder          H.263 CHC   MPEG-4 SP   H.263 Base
H.264/AVC BP   27.69%      29.37%      40.59%
H.263 CHC      -           2.04%       17.63%
MPEG-4 SP      -           -           15.69%
[Wiegand, et al. 2003]
Example Real-Time Conversation Result
[Figure: Paris CIF 15Hz; Y-PSNR [dB] (24 to 39) versus Bit-rate [kbit/s] (0 to 768) for H.263-Base, H.263 CHC, MPEG-4 SP, and H.264/AVC BP, with test points marked]
[Wiegand, et al. 2003]
Example Real-Time Test Result
[Figure: Paris CIF 15Hz; Rate saving relative to H.263-Baseline (0% to 50%) versus Y-PSNR [dB] (24 to 38) for H.264/AVC BP, H.263 CHC, and MPEG-4 SP]
[Wiegand, et al. 2003]
Test Results Entertainment-Quality Applications
Average bit-rate savings relative to:

Coder          MPEG-2
H.264/AVC MP   45%
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
[Figure: Entertainment SD (720x576i) 25Hz; Y-PSNR [dB] (24 to 39) versus Bit-rate [Mbit/s] (0 to 10) for MPEG-2 and H.264/AVC MP]
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
[Figure: Entertainment SD (720x576i) 25Hz; Rate saving relative to MPEG-2 (0% to 60%) versus Y-PSNR [dB] (26 to 38) for H.264/AVC MP]
[Wiegand, et al. 2003]
Further reading
• IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.