# Motion Compensation for Video Compression


**J. Feng
City University of Hong Kong, Hong Kong**

**K.T. Lo
The Hong Kong Polytechnic University, Hong Kong**

**Definition:** Motion compensation is widely used in video compression because of its ability to exploit the high temporal correlation between successive frames of an image sequence.

Video compression plays an important role in modern multimedia applications. Digitized video contains a considerable amount of redundancy, and compression is achieved by exploiting it. The redundancy of video data is generally divided into two classes: statistical redundancy and subjective redundancy. Statistical redundancy arises because video information is highly correlated both spatially and temporally. For example, adjacent picture elements of a television picture are almost alike, and successive pictures often differ only slightly. The differences among these similar elements are therefore small, so the average bit-rate of the video data can be reduced by sending the differences between similar elements instead of each element individually.

On the other hand, video contains information that is not visually obvious to the human eye and can thus be removed without noticeable degradation of picture quality. This kind of information is known as subjective redundancy. Further bit-rate reduction is achieved by discarding this subjectively redundant information. For example, humans are less sensitive to the image details of a moving object, so a moving object can be reproduced at a coarser resolution than a static object in the same picture.

The aim of video compression is to remove redundancies in both the spatial and temporal domains. As illustrated in Figure 1, video compression is normally a two-stage process. Interframe coding techniques reduce the temporal redundancies between successive frames of a video sequence, while intraframe coding techniques reduce the spatial redundancies within the difference frame obtained by interframe coding.

Motion compensation is widely used in video compression because of its ability to exploit the high temporal correlation between successive frames of an image sequence. It mainly consists of two parts: motion estimation and prediction error coding. As shown in Figure 2, the displacement of objects between successive frames is first estimated (motion estimation). The resulting motion information is then exploited in efficient interframe predictive coding (motion compensation). Both the motion information and the prediction error, i.e. the difference between the current frame and the motion-predicted frame, must be encoded and transmitted to the decoder.

Motion compensated predictive techniques have been explored by many researchers over the past twenty years. One common solution is to partition each image into regular non-overlapping blocks and to assume that moving objects can be approximated reasonably well by these blocks. A single displacement vector is then determined to represent the movement of the entire image block, as illustrated in Figure 3.

With both the displacement vector and prediction error, the current video frame can be reconstructed from the reference video frames in the past. This block-based motion compensation method is widely adopted in current video compression systems including MPEG and H.26×.
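To make the decoder side concrete, the following is a minimal sketch in Python with NumPy of how one block of the current frame is rebuilt from the reference frame, its motion vector and the transmitted prediction error. The function name and data layout are illustrative, not taken from any standard:

```python
import numpy as np

def reconstruct_block(reference, k, l, mv, residual):
    """Rebuild one N x N block of the current frame from the reference
    frame, the block's motion vector (dx, dy) and its prediction error."""
    n = residual.shape[0]
    dx, dy = mv
    # Motion-compensated prediction: the matched block in the reference frame
    prediction = reference[k + dx : k + dx + n, l + dy : l + dy + n]
    # Adding back the transmitted prediction error recovers the block exactly
    return prediction + residual

# Toy data: the encoder matched block (0, 0) of the current frame to the
# reference block displaced by (1, 2), leaving a small brightness change
# as the prediction error.
reference = np.arange(64, dtype=np.int32).reshape(8, 8)
current_block = reference[1:5, 2:6] + 3
residual = current_block - reference[1:5, 2:6]   # what the encoder transmits
rebuilt = reconstruct_block(reference, 0, 0, (1, 2), residual)
```

Only the motion vector and the (typically small) residual need to be coded, which is where the bit-rate saving comes from.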

Depending on the choice of reference frame(s), three types of motion compensation (MC) are defined: forward MC, backward MC and bidirectional MC. As illustrated in Figure 4, forward MC uses the previous frame as the reference frame; hence the prediction for block X is block A, and the resulting motion vector (MV) is called the forward MV. In contrast, backward MC uses the future frame as the reference.

The resulting motion vector is called the backward MV. In the example, block B is used as the prediction for X. Bidirectional MC uses both the previous and the future frame as references. In this case, there are three candidates for the prediction of X: A, B, and the block interpolated from A and B. Two motion vectors, the forward MV and the backward MV, must be transmitted to the decoder if the interpolated block is selected.

Figure 5 shows the effects of forward MC on a particular video sequence. It is seen that the amplitude of the prediction error is greatly reduced when applying MC for interframe prediction.

## Motion Compensated Discrete Cosine Transform (DCT) Video Coder

Figure 6 shows the block diagram of a typical motion compensated DCT video codec, as currently adopted in various video coding standards such as MPEG and H.26×. In such a coder, motion compensated predictive coding is used for interframe coding to remove the redundancies between successive video frames. The DCT-based intraframe coder is then used to exploit the spatial redundancies that exist between pixels of the prediction error or the video frame.
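As a rough illustration of the intraframe stage, the sketch below applies a separable 2-D DCT to an 8×8 block, quantizes the coefficients uniformly, and reconstructs the block. This is a deliberate simplification: real MPEG/H.26× coders use standard quantization matrices, zig-zag scanning and entropy coding, none of which are modelled here, and the uniform step size `q` is an illustrative choice:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def code_block(block, q=16):
    """Toy intraframe stage: 2-D DCT of a block, uniform quantization,
    then decoder-side reconstruction."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T            # separable 2-D DCT
    quantized = np.round(coeffs / q)    # coarse uniform quantizer
    return d.T @ (quantized * q) @ d    # inverse DCT of dequantized coeffs

# A smooth (highly correlated) block: its energy compacts into a few
# low-frequency DCT coefficients, so coarse quantization costs little.
block = np.tile(np.arange(8, dtype=float) * 10, (8, 1))
rebuilt = code_block(block)
```

The point of the example is energy compaction: for correlated data only a handful of coefficients survive quantization, yet the reconstruction error stays small.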

## Block Matching Motion Estimation

The block matching algorithm is a relatively simple and effective motion vector estimation technique. It assumes that all the pixels within the block have a uniform motion. The process of block matching is to find a candidate block, within a search area in the previous frame, which is most similar to the current block in the present frame.

The full search block matching (FSBM) algorithm exhaustively examines all locations in the search area and provides an optimal solution. The FSBM algorithm is illustrated in Figure 7, in which the present frame of a video sequence is divided into rectangular or square blocks of pixels. For each $N \times N$ block in the current frame, we look for the block of pixels in the previous frame that is closest to it according to a predetermined criterion, commonly the mean absolute difference (MAD). For the present frame $n$, we denote the intensity of the pixel with coordinates $(i, j)$ by $F_n(i, j)$.

We refer to a block of $N \times N$ pixels by the coordinates $(k, l)$ of its upper-left corner. The MAD between block $(k, l)$ of the present frame and block $(k+x, l+y)$ of the previous frame can be written as

$$\mathrm{MAD}(x, y) = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| F_n(k+i,\, l+j) - F_{n-1}(k+x+i,\, l+y+j) \right|$$

The motion vector $v(k, l)$ of the block $(k, l)$ is given by

$$v(k, l) = \arg\min_{(x, y)} \mathrm{MAD}(x, y), \qquad -w \le x, y \le w$$

where the vectors $(k+x, l+y)$ are valid block coordinates.

The block in the previous frame must lie within a search area of size $(N+2w) \times (N+2w)$, where $w$ is the maximum displacement that an object can move between two adjacent frames. For each location in the previous frame to be tested, the calculation of the MAD requires $(2N^2 - 1)$ additions. The computational requirement of the FSBM algorithm is therefore quite high, and it becomes one of the bottlenecks in implementing the video encoder. The heavy computational load of FSBM hence motivates the development of various fast block matching algorithms.
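A direct transcription of FSBM with the MAD criterion might look as follows (Python/NumPy; `full_search` and its parameters are illustrative names, not part of any standard):

```python
import numpy as np

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    return np.mean(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)))

def full_search(current, previous, k, l, n, w):
    """Exhaustively test every displacement (x, y) with |x|, |y| <= w and
    return the motion vector minimising the MAD criterion."""
    block = current[k:k + n, l:l + n]
    best_mv, best_cost = (0, 0), float("inf")
    for x in range(-w, w + 1):
        for y in range(-w, w + 1):
            px, py = k + x, l + y
            # Skip candidate blocks that fall outside the previous frame
            if px < 0 or py < 0 or px + n > previous.shape[0] or py + n > previous.shape[1]:
                continue
            cost = mad(block, previous[px:px + n, py:py + n])
            if cost < best_cost:
                best_cost, best_mv = cost, (x, y)
    return best_mv, best_cost
```

The nested loops make the cost visible: $(2w+1)^2$ candidate positions per block, each requiring a full $N \times N$ MAD evaluation.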

Among the different fast searching approaches, many researchers have adopted the idea of limiting the number of search locations so as to reduce the computation. These algorithms generally assume that the distortion function increases monotonically as the searched point moves away from the position of minimum distortion. Based on this assumption, specific searching patterns are defined for the search to follow along the decrease of the distortion function until the motion vector is found. Well-known fast search algorithms in this category include two-dimensional logarithmic search, three-step search (TSS), cross search, new three-step search (NTSS), four-step search (FSS), block-based gradient descent search (BBGDS), diamond search and hexagonal search. Among these fast block matching algorithms, the three-step search (TSS) algorithm is still commonly used in various video codec implementations because of its simplicity and effectiveness. Compared with other methods, TSS has a larger searching pattern in the first step, so it is more efficient at finding the global minimum, especially for sequences with large motion.

Figure 8 shows an example of finding the motion vector using TSS for a maximum displacement of 7 (w = 7). TSS is based on a coarse-to-fine approach with a logarithmically decreasing step size. The detailed steps of TSS are as follows.

*First Step:*

- The center point $(x, y)$ and the 8 surrounding search points at locations $(x \pm w/2, y \pm w/2)$ are tested.
- The location of the minimum point is denoted $(x^{(1)}, y^{(1)})$.
- The step size is reduced by half.

*Second Step:*

- Using $(x^{(1)}, y^{(1)})$ as the center, 8 more points at locations $(x^{(1)} \pm w/4, y^{(1)} \pm w/4)$ are tested.
- The location of the minimum point is denoted $(x^{(2)}, y^{(2)})$.
- The step size is reduced by half.

*Third Step:*

- Using $(x^{(2)}, y^{(2)})$ as the center, 8 more points at locations $(x^{(2)} \pm w/8, y^{(2)} \pm w/8)$ are tested.
- The search is complete when the step size becomes 1.
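The steps above can be sketched as a loop (Python/NumPy). This sketch fixes the initial step size at 4 for w = 7, matching the Figure 8 example; the function name is illustrative:

```python
import numpy as np

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    return np.mean(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)))

def three_step_search(current, previous, k, l, n, w=7):
    """Coarse-to-fine search: test the centre and its 8 neighbours, recentre
    on the best point, then halve the step size until it reaches 1."""
    block = current[k:k + n, l:l + n]
    cx, cy = 0, 0                  # current best displacement
    step = (w + 1) // 2            # initial step size: 4 for w = 7
    while step >= 1:
        best_cost, best = float("inf"), (cx, cy)
        for dx in (-step, 0, step):
            for dy in (-step, 0, step):
                x, y = cx + dx, cy + dy
                px, py = k + x, l + y
                # Skip candidate blocks outside the previous frame
                if px < 0 or py < 0 or px + n > previous.shape[0] or py + n > previous.shape[1]:
                    continue
                cost = mad(block, previous[px:px + n, py:py + n])
                if cost < best_cost:
                    best_cost, best = cost, (x, y)
        cx, cy = best
        step //= 2
    return (cx, cy)
```

Only 9 + 8 + 8 = 25 candidate positions are tested, against 225 for a full search with w = 7, which is where the speed-up comes from.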

The algorithms mentioned above can easily be trapped in a local minimum of the distortion function, giving predictions with appreciably higher error than the full search method. Furthermore, their capability to adapt to local characteristics is generally poor, since they use the same searching approach for all image blocks. This limits the efficiency of these fast algorithms, because image data are highly non-stationary. As a result, considerable research has been carried out to find alternative approaches [23]-[27] for block matching motion estimation.
