H.263 Video Compression - Key Compression Tools for H.263 Video, H.263 Video Specific Semantics and Syntax

Definition: H.263 is a teleconferencing video compression standard developed by the ITU, which was designed for low bit rate conversational video services.

H.263 technology became the basis for the later MPEG-4 Part 2, since the MPEG-4 community initially intended to develop a technology optimized for very low bit rate applications. As a result, wireless phone standards such as 3GPP include H.263 as a video compression standard. In addition, MPEG-4 Part 2 requires any compliant decoder to be able to decode H.263B (Base).

Key Compression Tools for H.263 Video

The basic configuration of the H.263 video compression algorithm is based on H.261, also developed by the ITU. H.263 is a hybrid coder built on an 8×8 block DCT and 16×16/8×8 motion compensation with half-pel resolution. The source video formats are fixed to five: SubQCIF, QCIF, CIF, 4CIF and 16CIF. There are three frame types: I, P and PB. All macroblocks (MBs) in an I frame are Intra-coded, while MBs in a P frame are either Intra- or Inter-coded. Inter-coded MBs are predicted either with 1 motion vector (MV) or with 4 MVs and OBMC (overlapped block motion compensation). In PB frames, MBs in the P frame and the corresponding MBs in the B frame are jointly coded with a common MB header. An MB in the P frame can be either Intra-coded or Inter-coded, but MBs in the corresponding B frame are not allowed to be Intra-coded.

The quantizer step size (QS) ranges from 2 to 62, while the quantizer parameter Qp ranges from 1 to 31. Within an MB, the same QS is used for all coefficients except Intra DC, which is handled separately by a uniform quantizer with a step size of 8.
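The step-size rule above can be illustrated with a minimal sketch. It assumes QS = 2 × Qp, which is what the two ranges suggest, and it uses a plain truncating uniform quantizer. The exact rounding and reconstruction rules of the standard are not reproduced here, and the function names are hypothetical.

```python
INTRA_DC_STEP = 8  # fixed step size for the Intra DC coefficient


def step_size(qp: int) -> int:
    """Return the step size QS for a quantizer parameter Qp in 1..31.

    Assumption: QS = 2 * Qp, inferred from the ranges 2..62 and 1..31.
    """
    if not 1 <= qp <= 31:
        raise ValueError("Qp must be in the range 1..31")
    return 2 * qp


def quantize(coef: int, step: int) -> int:
    """Simplified uniform quantizer: divide and truncate toward zero."""
    return int(coef / step)


def quantize_block(coefs, qp: int, intra: bool):
    """Quantize a flat list of DCT coefficients; index 0 is the DC term.

    All coefficients share the same QS, except the DC term of an
    Intra-coded block, which always uses the fixed step of 8.
    """
    qs = step_size(qp)
    levels = []
    for index, coef in enumerate(coefs):
        step = INTRA_DC_STEP if (intra and index == 0) else qs
        levels.append(quantize(coef, step))
    return levels
```

For example, `quantize_block(coefs, qp=5, intra=True)` divides the DC term by 8 and every other coefficient by 10, whereas with `intra=False` the DC term is also divided by 10.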

After zig-zag scanning, 3D RLD (zero-run, level, Last) triplets are formed, and the binary pattern for each triplet is looked up in a pre-designed Huffman table. Note that 2D RLD (zero-run, level) pairs are used in MPEG-1 and MPEG-2.

PQUANT, GQUANT and DQUANT are used to represent Qp efficiently. PQUANT is carried in the Picture layer and indicates the quantizer QUANT to be used for the picture until it is updated by a subsequent GQUANT or DQUANT. GQUANT is present in the GOB layer and indicates the quantizer QUANT to be used for the GOB until it is updated by a subsequent DQUANT. DQUANT, a differential value, is included in an MB header only when necessary; MCBPC signals whether DQUANT is present.

There are four optional coding modes in H.263: Unrestricted MV mode, Syntax-based Arithmetic Coding (SAC) mode, Advanced Prediction mode and PB-frame mode.

In the default prediction mode of H.263, MVs are restricted so that all pixels they reference lie within the coded picture area. In Unrestricted MV mode, MVs are allowed to point outside the picture. When a pixel referenced by an MV lies outside the coded area, an edge pixel (a.k.a. extended padded pixel) is used instead.

In SAC mode, all VLC/VLD operations are replaced with arithmetic coding. As in the VLC table mode of H.263, the syntax is partitioned into 4 layers: Picture, GOB, MB and Block. The probability model used for arithmetic encoding/decoding changes according to the syntax element. The syntax of the top three layers remains exactly the same; the syntax of the Block layer is similar but slightly redefined. SAC applies mostly to syntax elements at the MB and Block levels. The first 3 RLD triplets in a block each use a different probability model, while the remaining triplets share the same model.

Advanced Prediction mode includes OBMC and the possibility of 4 MVs per MB. The use of this mode is indicated in PTYPE.
The Advanced Prediction mode is only used in combination with the Unrestricted MV mode. If MCBPC indicates 4MV mode, the first MV is carried in MVD and the other 3 MVs in MVD2-4. When MVs have half-pel resolution, the half-pel values are obtained by bilinear interpolation. Each 8×8 block undergoes OBMC to reduce blocking artifacts. Five MVs (the MV of the current 8×8 block and the MVs of its 4 neighboring 8×8 blocks) are used to extract five 8×8 predictors. The final 8×8 predictor is a weighted average of these five predictors, with each pixel position having its own weight for each predictor. Note that the neighboring MVs are applied at the location of the current 8×8 block, not at the locations of the neighboring blocks.

A PB-frame consists of two pictures coded as one unit: one P-picture, which is predicted from the previous decoded P-picture, and one B-picture, which is predicted from both the previous decoded P-picture and the P-picture currently being decoded. Note that additional MV data (MVDB) is used for the B-block. In a PB-frame, an MB comprises 12 blocks: first the data for the 6 P-blocks is transmitted as in the default H.263 mode, and then the data for the 6 B-blocks is added. MV computation in the B-frame part of a PB-frame (a.k.a. direct mode) relies on geometrically scaling the MV of the co-located MB in the P-frame part. The H.263 standard specifies how to derive the forward and backward MVs in half-pel units.

The prediction of a B-block has 2 modes, which are used for different parts of the block. For pixels where the backward MV (MVb) points inside the reconstructed P MB, bi-directional prediction is used: the predictor is the average of the forward and backward predictors. For all other pixels, forward prediction is used.
This is almost the same as the direct mode in MPEG-4 Part 2, except for the forward-prediction-only area. The difference comes from the fact that the P-MB and B-MB are combined into a single data unit in H.263; the situation would be different if the data were organized in separate frame-based units such as P frames and B frames.
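The geometric scaling of the co-located P MV in a PB-frame can be sketched as follows. The scaling rule is recalled from the PB-frames annex of the standard and should be checked against it before use. The names `trb` (temporal position of the B-picture), `trd` (temporal distance between the two P-pictures) and `mvd` (the delta carried for the B-block), as well as the rounding of the pixel average, are assumptions of this sketch.

```python
def div_trunc(a: int, b: int) -> int:
    """Integer division that truncates toward zero (Python's // floors)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def direct_mode_mvs(mv: int, trb: int, trd: int, mvd: int = 0):
    """Derive forward/backward MVs for one component, in half-pel units.

    mv  : component of the co-located P-MB motion vector
    trb : temporal position of the B-picture (assumed name)
    trd : temporal distance between the two P-pictures (assumed name)
    mvd : delta vector component for the B-block (0 if absent)
    """
    mv_forward = div_trunc(trb * mv, trd) + mvd
    if mvd == 0:
        mv_backward = div_trunc((trb - trd) * mv, trd)
    else:
        mv_backward = mv_forward - mv
    return mv_forward, mv_backward


def b_pixel(forward: int, backward: int, backward_inside_p_mb: bool) -> int:
    """Predict one B-block pixel.

    Bi-directional average where the backward MV points inside the
    reconstructed P MB, forward-only prediction everywhere else.
    The floor rounding of the average is an assumption.
    """
    if backward_inside_p_mb:
        return (forward + backward) // 2
    return forward
```

With a B-picture one third of the way between two P-pictures (`trb=1`, `trd=3`), a P MV component of 6 half-pels is split into a forward part pointing to the previous P-picture and a backward part of opposite sign pointing to the current one.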

H.263 Video Specific Semantics and Syntax

There are 4 levels of headers in the H.263 video bitstream syntax: Picture, GOB, MB and Block. Note that there is no Sequence header; some additional information that is needed is signaled by external means.

The Picture header contains PSC, TR, PTYPE, PQUANT, CPM, PLCI, TRb, DBQUANT, PEI, PSPARE, EOS, etc. PSC is the Picture startcode, a word of 22 bits. All picture startcodes are byte aligned. TR is the temporal reference, 8 bits long. In the optional PB-frames mode, TR only addresses P-pictures. PTYPE is a combined field that specifies Source Format, Split, Freeze Picture Release, Picture Intra or Inter, the status of the optional modes, etc. CPM is the Continuous Presence Multipoint flag. PLCI (Picture Logical Channel Indicator) is present if CPM is 1, and it gives the logical channel number for the picture header and all following information until the next Picture or GOB startcode. TRb is present if PTYPE indicates “PB-frame,” and indicates the number of non-transmitted pictures since the last P- or I-picture and before the B-picture. Note that H.263 supports variable frame rate coding, which makes TRb necessary. DBQUANT is present if PTYPE indicates “PB-frame.” QUANT for each P-MB is mapped to BQUANT for the B-MB based on this parameter.

The GOB header contains GBSC, GN, GLCI, GFID and GQUANT. GBSC is the Group of Blocks startcode. GN is the Group Number, 5 bits long. GLCI (GOB Logical Channel Indicator) is only present if CPM is set to 1. GFID (GOB Frame ID) has the same value in every GOB header of a given picture. Moreover, if PTYPE in a picture header is the same as for the previous picture, GFID has the same value as in that previous picture.

The MB header contains COD, MCBPC, MODB, CBPB, CBPY, DQUANT, MVD1-4, MVDB, etc. COD indicates whether the MB is coded; a non-coded MB is treated as an INTER MB with a zero MV and no coefficients. MCBPC gives the MB type and the coded block pattern for chrominance. MODB and CBPB give the MB mode and the coded block pattern for B-blocks, respectively. CBPY is the coded block pattern for luminance.
MVD1-4 are differential MV data for up to 4 MVs. MVDB is present if indicated by MODB; it is differential MV data that corrects the bi-directional MVs. The Block layer is composed of IntraDC and TCOEFs, as in other standards.
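Locating a picture and reading the first two Picture header fields can be sketched as below. The sketch relies only on the facts that PSC is a byte-aligned 22-bit word and that TR is the 8-bit field that follows it. The PSC bit pattern (sixteen zeros, a one, then five zeros) is recalled from the standard and should be treated as an assumption, and the class and function names are hypothetical.

```python
PSC = 0b0000_0000_0000_0000_1_00000  # assumed 22-bit picture startcode


class BitReader:
    """Read big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes, byte_pos: int = 0):
        self.data = data
        self.pos = byte_pos * 8  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def find_picture_start(data: bytes, start: int = 0) -> int:
    """Return the byte offset of the first byte-aligned PSC, or -1.

    The loop stops early enough that the byte holding the end of TR
    is guaranteed to exist after a match.
    """
    for i in range(start, len(data) - 3):
        if data[i] == 0 and data[i + 1] == 0 and (data[i + 2] & 0xFC) == 0x80:
            return i
    return -1


def read_psc_and_tr(data: bytes):
    """Return (byte_offset, TR) for the first picture, or None."""
    offset = find_picture_start(data)
    if offset < 0:
        return None
    reader = BitReader(data, offset)
    if reader.read(22) != PSC:
        return None
    temporal_reference = reader.read(8)  # TR: 8 bits after PSC
    return offset, temporal_reference
```

Because PSC is 22 bits long, TR is not byte aligned: its top 2 bits share a byte with the end of the startcode, which is why a bit-level reader is used rather than byte indexing.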
