
Compressed Video Spatio-Temporal Segmentation

Definition: Video spatio-temporal segmentation is used to detect and track moving objects; it can be performed on either uncompressed or compressed video sequences.

Most spatio-temporal segmentation approaches proposed in the literature operate in the uncompressed pixel domain. This gives them the potential to estimate object boundaries with pixel accuracy, but it requires that the processed sequence be fully decoded before segmentation can be performed. Motion features often have to be extracted as well, using block matching algorithms. As a result, the usefulness of such approaches is usually restricted to non-real-time applications. Real-time pixel-domain methods are usually applicable only to head-and-shoulder sequences (e.g. video-conference applications) or rely on restrictive assumptions (e.g. that the background is uniformly colored).
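To give a sense of the work that block matching involves, the following is a minimal sketch of exhaustive-search block matching using the sum of absolute differences (SAD) as the cost. The function names, the SAD criterion, and the frame representation (lists of lists of luminance values) are assumptions made for illustration; they are not taken from any of the methods discussed here.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (by, bx) and the block of `ref` displaced by (dy, dx)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, by, bx, n, search):
    """Exhaustive search over displacements in [-search, search] on both
    axes. Returns ((dy, dx), cost) for the lowest-SAD candidate that lies
    fully inside the reference frame."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = by + dy, bx + dx
            # Skip candidates that fall outside the reference frame.
            if top < 0 or left < 0 or top + n > h or left + n > w:
                continue
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


def make_frame(size, top, left):
    """Hypothetical test frame: zeros with a small 2 x 2 textured patch."""
    frame = [[0] * size for _ in range(size)]
    patch = [[10, 20], [30, 40]]
    for y in range(2):
        for x in range(2):
            frame[top + y][left + x] = patch[y][x]
    return frame


ref_frame = make_frame(6, 1, 1)   # patch at row 1, column 1
cur_frame = make_frame(6, 2, 2)   # same patch, moved down and right by one
vector, cost = block_match(cur_frame, ref_frame, 2, 2, 2, 2)
```

Each block requires (2 * search + 1)^2 candidate comparisons of n^2 pixels, which illustrates why estimating motion on decoded frames is costly compared with reusing vectors already present in a compressed stream.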

To counter these drawbacks, compressed-domain methods have been proposed for spatio-temporal segmentation. Most of them consider the prevalent MPEG-2 standard as the compression scheme and examine only the I- and P-frames of the compressed sequence, since these contain all the information necessary for detecting and tracking moving (and non-moving) objects. In one approach, the translational motion vectors that the MPEG-2 stream contains for P-frame macroblocks are accumulated over a number of frames, and the magnitude of the resulting displacement is calculated. Uniform quantization of this magnitude is then used to assign macroblocks to regions. In another approach, the motion vectors are again accumulated over a few frames and are subsequently interpolated spatially to obtain a dense motion vector field. The expectation maximization (EM) algorithm is applied to the dense motion vectors of each frame, and the resulting foreground regions are then tracked over time. A third approach proposes a real-time algorithm that uses the bilinear motion model to describe the motion of both the camera and the identified moving objects. An iterative rejection scheme and temporal consistency constraints are employed to cope with the fact that motion vectors extracted from the compressed stream may not accurately represent the actual object motion. Coarse color information (DC coefficients) is used to generate background spatio-temporal objects, whereas additional color information is used to refine the segmentation masks to pixel accuracy, if required.
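The first of these ideas, accumulating macroblock motion vectors and uniformly quantizing the displacement magnitude, can be sketched as follows. This is only an illustration under stated assumptions: the motion vectors are assumed to be already parsed from the stream into per-frame grids of (dx, dy) tuples, and the function names and the quantization step are hypothetical. The sketch labels each macroblock with its quantization bin; grouping equally labelled macroblocks into connected regions is not shown.

```python
import math


def accumulate_motion(frames):
    """Sum the (dx, dy) vector of each macroblock over a list of frames.

    `frames` is a list of grids (rows x cols) of (dx, dy) tuples,
    one grid per P-frame. Returns a grid of accumulated vectors."""
    rows, cols = len(frames[0]), len(frames[0][0])
    acc = [[(0.0, 0.0) for _ in range(cols)] for _ in range(rows)]
    for grid in frames:
        for r in range(rows):
            for c in range(cols):
                ax, ay = acc[r][c]
                dx, dy = grid[r][c]
                acc[r][c] = (ax + dx, ay + dy)
    return acc


def quantize_magnitudes(acc, step):
    """Assign each macroblock a label by uniformly quantizing the
    magnitude of its accumulated displacement with bin width `step`."""
    return [[int(math.hypot(dx, dy) // step) for (dx, dy) in row]
            for row in acc]


# Two hypothetical P-frames on a 2 x 2 macroblock grid: the left column
# is static, the right column moves at two different speeds.
p_frame = [[(0, 0), (3, 4)],
           [(0, 0), (6, 8)]]
accumulated = accumulate_motion([p_frame, p_frame])
labels = quantize_magnitudes(accumulated, step=8)
```

Here the static macroblocks fall into bin 0, while the two moving macroblocks, with accumulated magnitudes of 10 and 20, fall into bins 1 and 2 respectively.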
