
H.263 Video Codec

H.263 is an ITU-T standard designed for low-bitrate communication such as video-conferencing and video-telephony applications. It is not limited to low bitrates, however, and can be used across a wide range of bitrates.
H.263 is a hybrid codec: it combines interframe compression, which exploits temporal redundancy, with intraframe transform coding, which exploits spatial redundancy. This combination allows a high compression ratio to be achieved.
The H.263 standard describes a typical encoder and decoder, whose stages are outlined below.

H.263 Video Encode

Motion estimation and compensation
The first step in reducing the bandwidth is to subtract the previously transmitted frame from the current frame so that only the difference, or residue, needs to be encoded and transmitted. This means that areas of the frame that do not change (for example the background) are not encoded. Further reduction is achieved by attempting to estimate where areas of the previous frame have moved to in the current frame (motion estimation) and compensating for this movement (motion compensation). The motion estimation module compares each 16x16 pixel block (macroblock) in the current frame with its surrounding area in the previous frame and attempts to find a match. The matching area is moved into the current macroblock position by the motion compensator module. The motion-compensated macroblock is then subtracted from the current macroblock. If the motion estimation and compensation process is efficient, the remaining "residual" macroblock should contain only a small amount of information.
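
The block-matching search described above can be sketched as follows. This is a minimal illustration, assuming frames are plain lists of rows of luminance values. The function names, the sum-of-absolute-differences (SAD) matching cost and the default search window of 7 pixels are assumptions made for the example and are not taken from the H.263 specification; practical encoders usually replace the exhaustive search with a faster strategy.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def estimate_motion(cur, ref, cx, cy, n=16, search=7):
    """Exhaustive search around (cx, cy) in the reference frame.
    Returns ((dx, dy), cost) for the best-matching area (hypothetical API)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate area falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual_block(cur, ref, cx, cy, mv, n=16):
    """Subtract the motion-compensated reference area from the current block."""
    dx, dy = mv
    return [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(n)]
            for j in range(n)]
```

When the match is good, the values returned by `residual_block` are close to zero, which is what makes the later stages effective.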

Discrete Cosine Transform (DCT)
The DCT transforms a block of pixel values (or residual values) into a set of "spatial frequency" coefficients. This is analogous to transforming a time domain signal into a frequency domain signal using a Fast Fourier Transform. The DCT operates on a 2-dimensional block of pixels (rather than on a 1-dimensional signal) and is particularly good at "compacting" the energy in the block of values into a small number of coefficients. This means that only a few DCT coefficients are required to recreate a recognizable copy of the original block of pixels.
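
To make the energy compaction concrete, here is a direct (unoptimized) sketch of a 2-dimensional DCT on an 8x8 block, the block size H.263 uses for its transform. It assumes the orthonormal DCT-II definition and a block given as a list of rows; the name `dct_2d` is illustrative, and real codecs use fast factorized or integer implementations rather than this four-level loop.

```python
import math

N = 8  # transform block size


def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block, orthonormal scaling."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical spatial frequency
        for u in range(N):      # horizontal spatial frequency
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = c(u) * c(v) * s
    return out
```

For a perfectly flat block, all of the energy ends up in the single coefficient `out[0][0]` and every other coefficient is zero up to rounding error. Smooth image regions behave similarly, with a few significant low-frequency coefficients.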

Quantization
For a typical block of pixels, most of the coefficients produced by the DCT are close to zero. The quantizer module reduces the precision of each coefficient so that the near-zero coefficients are set to zero and only a few significant non-zero coefficients are left. This is done in practice by dividing each coefficient by an integer scale factor and truncating the result. It is important to realize that the quantizer "throws away" information.
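
The divide-and-truncate step can be sketched in a few lines. This is a simplified illustration assuming a single integer step size for every coefficient; the function name `quantize` is hypothetical, and the exact quantizer rules of the standard are not reproduced here.

```python
def quantize(coeffs, step):
    """Divide each coefficient by an integer step and truncate toward zero."""
    return [[int(c / step) for c in row] for row in coeffs]


# Small coefficients collapse to zero; the fractional part of the rest is lost.
levels = quantize([[800.0, 13.2, -7.9], [3.0, -0.4, 25.0]], 10)
# levels == [[80, 1, 0], [0, 0, 2]]
```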

Entropy encoding
An entropy encoder (such as a Huffman encoder) replaces frequently-occurring values with short binary codes and replaces infrequently-occurring values with longer binary codes. The entropy encoding in H.263 is based on this technique and is used to compress the quantized DCT coefficients. The result is a sequence of variable-length binary codes. These codes are combined with synchronization and control information (such as the motion "vectors" required to reconstruct the motion-compensated reference frame) to form the encoded H.263 bitstream.
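
The principle of giving short codes to frequent values can be illustrated with a small Huffman code builder. Note the assumption: H.263 itself uses predefined variable-length code tables, so this sketch, which derives a code from sample data, only demonstrates the underlying idea. The function names are illustrative.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in a.items()}
        merged.update({s: "1" + code for s, code in b.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(symbols, table):
    """Concatenate the variable-length codes for a sequence of symbols."""
    return "".join(table[s] for s in symbols)
```

Because quantized coefficient blocks are dominated by zeros and small values, a code of this kind spends very few bits on the most common symbols.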

Frame store
The current frame must be stored so that it can be used as a reference when the next frame is encoded. Instead of simply copying the current frame into a store, the quantized coefficients are rescaled, inverse transformed using an Inverse Discrete Cosine Transform and added to the motion-compensated reference block to create a reconstructed frame that is placed in a store (the frame store). This ensures that the contents of the frame store in the encoder are identical to the contents of the frame store in the decoder (see below). When the next frame is encoded, the motion estimator uses the contents of this frame store to determine the best matching area for motion compensation.
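
Why the encoder stores the reconstructed frame rather than the original can be shown with a deliberately tiny model. In this sketch each "frame" is a single number, there is no transform and no motion, and the names `encode_sequence` and `decode_sequence` are hypothetical; only the prediction loop is kept.

```python
def encode_sequence(samples, step):
    """Predictively encode scalar 'frames'.
    Returns the quantized levels and the encoder's frame-store contents."""
    store = 0                      # encoder frame store
    levels, recon = [], []
    for s in samples:
        level = int((s - store) / step)   # quantize the residual
        store = store + level * step      # rescale and add to the reference
        levels.append(level)
        recon.append(store)
    return levels, recon


def decode_sequence(levels, step):
    """Decode the levels using the same reconstruction rule as the encoder."""
    store = 0                      # decoder frame store
    out = []
    for level in levels:
        store = store + level * step
        out.append(store)
    return out
```

Because both sides update their store from the same quantized data, the decoder output matches the encoder's frame store exactly, and the quantization error does not accumulate from frame to frame.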

H.263 Video Decode

Entropy decode
The variable-length codes that make up the H.263 bitstream are decoded in order to extract the coefficient values and motion vector information.

Rescale
This is the "reverse" of quantization: the coefficients are multiplied by the same scaling factor that was used in the quantizer. However, because the quantizer discarded the fractional remainder, the rescaled coefficients are not identical to the original coefficients.
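
A short numeric sketch of the rescaling step, assuming the same simplified single-step quantizer (divide by an integer step and truncate); the function name `rescale` and the sample values are illustrative.

```python
def rescale(levels, step):
    """Multiply each quantized level by the step size used in the quantizer."""
    return [[level * step for level in row] for row in levels]


# Suppose the original coefficients were 800.0, 13.2 and -7.9 and the step
# was 10, so the quantizer produced the levels 80, 1 and 0.
restored = rescale([[80, 1, 0]], 10)
# restored == [[800, 10, 0]]: 13.2 comes back as 10 and -7.9 as 0.
```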

Inverse Discrete Cosine Transform
The IDCT reverses the DCT operation to create a block of samples: these (typically) correspond to the difference values that were produced by the motion compensator in the encoder.
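
The inverse transform can be sketched in the same direct form as a forward DCT. As before, this assumes 8x8 blocks, orthonormal scaling and coefficients stored as a list of rows indexed `[v][u]`; `idct_2d` is an illustrative name, and production decoders use fast implementations instead.

```python
import math

N = 8  # transform block size


def idct_2d(coeffs):
    """Inverse 2-D DCT (DCT-III) of an N x N coefficient block."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for v in range(N):
                for u in range(N):
                    s += (c(u) * c(v) * coeffs[v][u]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[y][x] = s
    return out
```

A block whose only non-zero coefficient is the one at `[0][0]` decodes to a flat block of samples, which is the typical result for a smooth, well-predicted region.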

Motion compensation
The difference values are added to a reconstructed area from the previous frame. The motion vector information is used to pick the correct area (the same reference area that was used in the encoder). The result is a reconstruction of the original frame: note that this will not be identical to the original because of the "lossy" quantization stage, i.e. the image quality will be poorer than the original. The reconstructed frame is placed in a frame store and it is used to motion-compensate the next received frame.
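
The final reconstruction step can be sketched as follows. The sketch assumes 8-bit samples (hence the clipping to 0..255), a motion vector given as whole-pixel `(dx, dy)` offsets, and frames stored as lists of rows; the function name `reconstruct_block` is hypothetical.

```python
def reconstruct_block(ref, residual, bx, by, mv):
    """Add decoded difference values to the reference area selected by the
    motion vector, clipping the result to the 8-bit sample range."""
    dx, dy = mv
    n = len(residual)
    return [[max(0, min(255, ref[by + dy + j][bx + dx + i] + residual[j][i]))
             for i in range(n)]
            for j in range(n)]


ref = [[0, 1, 2, 3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [250, 251, 252, 253]]
# 2x2 block at (1, 1), predicted from the area one pixel to the right.
block = reconstruct_block(ref, [[5, -20], [0, 1]], 1, 1, (1, 0))
# block == [[17, 0], [22, 24]]
```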