
MPEG-4 Video Compression - MPEG-4 Shape Coding, MPEG-4 Texture Coding, Scalability Coding Tool


Definition: The MPEG-4 multimedia compression standard was developed to provide the technological foundation for dealing with multimedia content in an object-based, interactive, and non-linear way.

Consequently, MPEG-4 video (Part 2 of the MPEG-4 series of standards) had to meet this need with shape coding for the representation of arbitrarily shaped video. Since MPEG-4 is a generic coder, the standard includes many algorithms/tools that can be used for a variety of applications under different operating conditions. Compared with the MPEG-2 standard, MPEG-4 additionally covers novel profiles and levels with shape coding and low-bit-rate tools. As a result, wireless phone standards such as 3GPP adopted MPEG-4 video as their video compression subsystem. This section discusses the MPEG-4 natural video coding tools, focusing on shape coding and texture coding.

MPEG-4 Shape Coding

There are two types of shape data in MPEG-4: grey-scale and binary shape information. The Context-based Arithmetic Encoding (CAE) technique is used to encode both types of shape data. The only difference is that grey-scale shape data needs an additional texture-compression process on top of the binary (i.e., support) compression. As the coding unit, a Binary Alpha Block (BAB) is defined as a block of 16×16 binary pixels.
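To make the BAB coding unit concrete, the following minimal Python sketch partitions a binary alpha plane into 16×16 blocks. The function name and data layout are illustrative, not taken from the standard.

```python
# Minimal sketch: split a binary alpha plane into 16x16 Binary Alpha Blocks (BABs).
# Function name and data layout are illustrative, not from the standard.

BAB_SIZE = 16

def extract_babs(alpha_plane):
    """alpha_plane: 2D list of 0/1 values whose dimensions are multiples of 16.
    Returns a dict mapping (block_row, block_col) -> 16x16 list of binary pixels."""
    height = len(alpha_plane)
    width = len(alpha_plane[0])
    babs = {}
    for by in range(0, height, BAB_SIZE):
        for bx in range(0, width, BAB_SIZE):
            block = [row[bx:bx + BAB_SIZE] for row in alpha_plane[by:by + BAB_SIZE]]
            babs[(by // BAB_SIZE, bx // BAB_SIZE)] = block
    return babs
```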

MPEG-4 shape coding is performed in two steps: (1) a size-conversion process and (2) the CAE process. The size-conversion process is a lossy step in which the Conversion Ratio (CR) takes one of the values 1, 1/2, or 1/4; a CR is chosen for each BAB by the encoder based on the targeted rate and distortion characteristics. CAE is then applied to the size-converted blocks for further lossless coding. A size-converted block can be encoded in one of seven modes: two copy-from-reference modes (MVDs == 0 && No Update, MVDs != 0 && No Update), three modes that need no reference (all_255, all_0, IntraCAE), and two modes that need a reference (MVDs == 0 && InterCAE, MVDs != 0 && InterCAE). The no-reference modes can be used in I-, P-, and B-VOPs, while the reference-based modes are used only in P- and B-VOPs. Three points are worth remembering: (1) the MVs are all in integer-pel precision, (2) motion compensation is carried out on the full 16×16 block, and (3) a B-VOP chooses only one reference, based on temporal geometry.
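The lossy nature of the size-conversion step can be pictured with the sketch below. The standard defines its own down-/up-sampling rules; the majority-vote filter and pixel-repeat up-sampling here are only illustrative stand-ins showing why a CR below 1 discards shape detail.

```python
# Illustrative sketch of the lossy size-conversion step for CR = 1/2.
# The standard specifies exact down/up-sampling filters; a simple majority
# vote is used here only to show why size conversion loses detail.

def downsample_bab(bab, cr_denominator=2):
    """Down-convert a binary block by a factor of cr_denominator (2 for CR = 1/2)."""
    n = cr_denominator
    size = len(bab)
    out = []
    for y in range(0, size, n):
        row = []
        for x in range(0, size, n):
            ones = sum(bab[y + dy][x + dx] for dy in range(n) for dx in range(n))
            row.append(1 if ones * 2 >= n * n else 0)   # majority of 1s -> 1
        out.append(row)
    return out

def upsample_bab(small, cr_denominator=2):
    """Pixel-repeat up-conversion back to the original BAB size."""
    n = cr_denominator
    return [[small[y // n][x // n] for x in range(len(small[0]) * n)]
            for y in range(len(small) * n)]
```

An encoder would compare the up-converted block against the original BAB and accept a smaller CR only if the resulting shape distortion stays within its rate-distortion target.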

The principal method applied to size-converted blocks is CAE with block-based motion compensation. Note that the two copy-from-reference modes apply motion compensation directly to the binary alpha pixels, whereas the two reference-based CAE modes use the motion vector only to build a spatio-temporal context (i.e., template) that indexes the CAE probability in the context-to-probability mapping table defined in the standard. The input to CAE is, in either case, binary alpha bits. The key issue is how the probability is adapted for better CAE performance: the probability follows the pattern of the spatio-temporal neighborhood, which is why the technique is called Context-based Arithmetic Encoding.
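The core of CAE is a binary arithmetic coder whose symbol probability is looked up by context index. The sketch below shows that idea only; the normative context-to-probability tables are not reproduced here, so `prob_zero` is an assumed stand-in supplied by the caller, and the floating-point interval is a teaching simplification of a real arithmetic coder.

```python
# Minimal sketch of context-driven binary arithmetic encoding, the core idea
# behind CAE. The real context -> probability table is defined in the standard
# and not reproduced here; 'prob_zero' is a stand-in supplied by the caller.

def cae_encode_bits(bits_with_context, prob_zero):
    """bits_with_context: iterable of (bit, context_index) pairs.
    prob_zero: mapping context_index -> probability that the bit is 0.
    Returns the final coding interval (low, high) as a simple illustration."""
    low, high = 0.0, 1.0
    for bit, ctx in bits_with_context:
        p0 = prob_zero[ctx]
        split = low + (high - low) * p0
        if bit == 0:
            high = split        # keep the sub-interval assigned to '0'
        else:
            low = split         # keep the sub-interval assigned to '1'
    return low, high
```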

In MPEG-4 shape coding, two types of contexts are defined: a spatial context for IntraCAE and a spatio-temporal context for InterCAE. Note that InterCAE is not mandatory for BABs in P/B-VOPs; IntraCAE can be used for P- and B-VOPs as well as I-VOPs. The mode decision is based on a rate-distortion policy; in other words, sending the additional MV data might not be worthwhile.
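To illustrate how a context index might be formed, the sketch below packs a template of previously coded pixels into an integer. The template offsets shown are placeholders, not the normative IntraCAE template; InterCAE would additionally sample pixels from the motion-compensated block in the reference VOP to build its spatio-temporal context.

```python
# Sketch of context formation. IntraCAE builds the context only from already
# coded pixels of the current (bordered) BAB; InterCAE additionally samples
# the motion-compensated block in the reference VOP. The offsets below are
# illustrative placeholders, not the template defined in the standard.

INTRA_TEMPLATE = [(-2, -1), (-2, 0), (-2, 1), (-1, -2), (-1, -1),
                  (-1, 0), (-1, 1), (-1, 2), (0, -2), (0, -1)]   # placeholder offsets

def intra_context(block, y, x):
    """Pack template pixels (out-of-block positions read as 0) into an integer index."""
    ctx = 0
    for dy, dx in INTRA_TEMPLATE:
        py, px = y + dy, x + dx
        bit = block[py][px] if 0 <= py < len(block) and 0 <= px < len(block[0]) else 0
        ctx = (ctx << 1) | bit
    return ctx
```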

The motion vectors are coded in differential form. The MPEG-4 video standard specifies two different VLC tables, chosen according to whether the x-axis differential MV is zero. In addition, there are two options for choosing the MV predictor (MVP). If the binary_only flag is on, or a B-VOP is being encoded, independent shape vectors are kept in the MV history buffers. Otherwise, the MVs of the corresponding texture data are used to derive the MVP.
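A rough sketch of this predictor selection and differential coding follows. The buffer and function names are illustrative, and the two-VLC-table rule is reduced to a boolean flag; the normative predictor derivation is more detailed than shown here.

```python
# Sketch of the shape-MV predictor (MVP) choice described above. Which buffer
# supplies the predictor depends on whether only shape is coded (binary_only)
# or a B-VOP is being encoded; otherwise texture MVs are reused.
# Names are illustrative, not the normative decoding process.

def select_shape_mvp(binary_only, is_b_vop, shape_mv_history, texture_mvs, bab_index):
    if binary_only or is_b_vop:
        # independent shape vectors tracked in their own history buffer
        return shape_mv_history.get(bab_index, (0, 0))
    # otherwise derive the predictor from the co-located texture motion vectors
    return texture_mvs.get(bab_index, (0, 0))

def encode_mv_differential(mv, mvp):
    """Differential MV actually transmitted; the VLC table choice depends on
    whether the x component of the differential is zero."""
    mvd = (mv[0] - mvp[0], mv[1] - mvp[1])
    use_alternate_table = (mvd[0] == 0)   # simplified stand-in for the two-table rule
    return mvd, use_alternate_table
```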

MPEG-4 Texture Coding

MPEG-4 natural video coding tools comprise a complete tool-set for different applications and use cases. These tools can be combined to code different sources of natural video; not all of the tools are required for every application.

Originally, MPEG-4 video standardization was initiated to cover the low-bit-rate applications that were missing in the development of MPEG-2. H.263 technology, which was optimized for low-bit-rate applications, became the basis for further MPEG-4 development at that time. Interesting technologies inherited from H.263 include 1MV/4MV modes, unrestricted MVs, overlapped motion compensation, 3D RLD (zero-run, level, last) triplets, direct-mode motion compensation (PB frames), the hybrid 8×8 DCT with half-pel resolution, PQUANT/DQUANT, DBQUANT (PB frames), etc.
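One of these inherited tools, the 3D run-length description, is easy to illustrate: each non-zero quantized coefficient in scan order is coded as a (zero-run, level, last) triplet, where "last" marks the final non-zero coefficient of the block. A minimal sketch, with an illustrative function name:

```python
# Sketch of the 3D run-length description (zero-run, level, last) inherited
# from H.263: each non-zero quantized coefficient in scan order becomes a
# triplet, and 'last' marks the final non-zero coefficient of the block.

def rld_triplets(scanned_coeffs):
    """scanned_coeffs: quantized DCT coefficients already in (zig-zag) scan order."""
    nonzero_positions = [i for i, c in enumerate(scanned_coeffs) if c != 0]
    triplets = []
    run = 0
    for i, c in enumerate(scanned_coeffs):
        if c == 0:
            run += 1
            continue
        last = 1 if i == nonzero_positions[-1] else 0
        triplets.append((run, c, last))
        run = 0
    return triplets

# The trailing zeros after the last non-zero coefficient are never coded:
print(rld_triplets([12, 0, 0, -3, 1, 0, 0, 0]))   # [(0, 12, 0), (2, -3, 0), (0, 1, 1)]
```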

On top of these tools, several techniques were added for MPEG-4 texture coding: adaptive intra DC/AC prediction (the DC coefficient, the first row, and the first column), non-linear intra DC quantization (optimized separately for luminance and chrominance), alternate scan modes, an optional weighting-matrix tool (not used in H.263, but present in other MPEG standards), SA-DCT (a boundary-padding method can be used instead; some visual object types support both algorithms), a generalized direct mode (multiple B-frames between two reference frames), quarter-pel motion compensation, global motion compensation, etc. Note that MPEG-4 natural video coding supports the use of 4 to 12 bits per pixel for luminance and chrominance values; this is called the N-bit tool.
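The adaptive intra DC prediction mentioned above can be sketched as follows: the gradient of neighbouring DC values decides whether the predictor comes from the block above or the block to the left, and the same direction is then reused for the first row or first column of AC coefficients. Function and argument names are illustrative.

```python
# Sketch of adaptive intra DC prediction direction selection in MPEG-4
# texture coding: the smaller neighbouring DC gradient picks the direction.

def dc_prediction(dc_left, dc_above_left, dc_above):
    """Return (predictor, direction). Missing neighbours are assumed to have
    already been replaced by the standard's default DC value (e.g. 1024)."""
    if abs(dc_left - dc_above_left) < abs(dc_above_left - dc_above):
        return dc_above, "vertical"     # predict from the block above
    return dc_left, "horizontal"        # predict from the block to the left
```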

Scalability Coding Tool

In MPEG-4 video, object-based scalability can be achieved, where multiple objects carry different levels of the basic scalabilities. The basic scalabilities are: (1) spatial scalability, (2) temporal scalability, and (3) SNR fine granularity scalability. Fine granularity scalability (FGS) was designed to provide very accurate bandwidth adaptation for streaming applications. To do so, the enhancement layer is partitioned bit-plane by bit-plane, so that the bitstream can be truncated at a fine granularity.
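A minimal sketch of the bit-plane partitioning behind FGS is given below; the variable-length coding of each plane is omitted and the function names are illustrative. The point is that the enhancement stream can be cut after any bit-plane and still refine the base layer.

```python
# Sketch of fine granularity scalability (FGS): the enhancement-layer residual
# is sent most-significant bit-plane first, so the stream can be truncated
# after any plane and still improve the base layer. VLC of planes is omitted.

def to_bitplanes(residuals, num_planes):
    """Split absolute residual values into bit-planes, MSB plane first."""
    planes = []
    for p in range(num_planes - 1, -1, -1):
        planes.append([(abs(r) >> p) & 1 for r in residuals])
    return planes

def reconstruct(planes_received, num_planes):
    """Rebuild the (absolute) residuals from however many planes arrived."""
    values = [0] * len(planes_received[0])
    for i, plane in enumerate(planes_received):
        shift = num_planes - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values

# Truncating after 2 of 4 planes still yields a coarse enhancement:
planes = to_bitplanes([9, 4, 15, 2], 4)
print(reconstruct(planes[:2], 4))   # [8, 4, 12, 0]
```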

Other MPEG-4 Coding Tools

MPEG-4 video was intended for applications such as delivery over best-effort and lossy networks. As a result, several tools were included to improve the error resilience of the video. The supported error-resilience tools are: video-packet-based resynchronization, data partitioning, reversible VLC, header extension code, and NEWPRED (new prediction). The MPEG-4 standard also includes further tools such as interlaced coding and sprite coding.

Visual Texture Coding Tool

A visual texture is a still image, most likely used as base material for 2D/3D graphics rendering. The VTC technique aims to achieve high quality, coding efficiency, and scalable textures. MPEG-4 VTC is based on the discrete wavelet transform (DWT) and zero-tree coding. Owing to the nature of wavelet transform coding for still images, the following characteristics are obtained: (1) efficient compression over a wide range of qualities, (2) easy spatial and SNR scalability coding, (3) robust transmission in error-prone environments, (4) random access, and (5) complexity scalability levels. The basic VTC compression tool was adopted in MPEG-4 Visual version 1, while the error-resilience, tiling, and shape-adaptive tools were adopted in MPEG-4 Visual version 2 [1, 2].
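To illustrate the sub-band structure that zero-tree coding operates on, here is a one-level 2D Haar-style decomposition. MPEG-4 VTC uses its own wavelet filters and a zero-tree coefficient coder, so this is only a conceptual sketch of the transform stage, with illustrative names.

```python
# Conceptual sketch: one-level 2D Haar-style decomposition into LL/HL/LH/HH
# sub-bands. MPEG-4 VTC uses its own wavelet filters plus zero-tree coding of
# the coefficients; only the sub-band idea is illustrated here.

def haar_2d_one_level(img):
    """img: 2D list with even dimensions. Returns (LL, HL, LH, HH)."""
    # transform rows: left half = averages (low-pass), right half = differences (high-pass)
    rows = [[(r[2 * i] + r[2 * i + 1]) / 2 for i in range(len(r) // 2)] +
            [(r[2 * i] - r[2 * i + 1]) / 2 for i in range(len(r) // 2)] for r in img]
    h, w = len(rows), len(rows[0])
    # transform columns the same way
    cols = [[0.0] * w for _ in range(h)]
    for x in range(w):
        col = [rows[y][x] for y in range(h)]
        lo = [(col[2 * i] + col[2 * i + 1]) / 2 for i in range(h // 2)]
        hi = [(col[2 * i] - col[2 * i + 1]) / 2 for i in range(h // 2)]
        for y, v in enumerate(lo + hi):
            cols[y][x] = v
    half_h, half_w = h // 2, w // 2
    LL = [row[:half_w] for row in cols[:half_h]]   # coarse approximation
    HL = [row[half_w:] for row in cols[:half_h]]   # horizontal detail
    LH = [row[:half_w] for row in cols[half_h:]]   # vertical detail
    HH = [row[half_w:] for row in cols[half_h:]]   # diagonal detail
    return LL, HL, LH, HH
```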
