H.263 Video Codec
H.263 is an ITU-T standard designed for low-bitrate communication
such as video-conferencing and video-telephony applications.
It can nevertheless be used over a wide range of
bitrates, not only low ones.
H.263 is a hybrid codec: it combines interframe prediction,
which exploits temporal redundancy, with intraframe transform
coding, which exploits spatial redundancy.
Combining the two achieves a higher compression ratio
than either technique achieves on its own.
The sections below describe the stages of a typical H.263
encoder and decoder.
H.263 Video Encode

Motion estimation and compensation
The first step in reducing the bandwidth is to subtract
the previously transmitted frame from the current frame
so that only the difference or residue needs to be encoded
and transmitted. This means that areas of the frame
that do not change (for example the background) are
not encoded. Further reduction is achieved by attempting
to estimate where areas of the previous frame have moved
to in the current frame (motion estimation) and compensating
for this movement (motion compensation). The motion
estimation module compares each 16x16 pixel block (macroblock)
in the current frame with its surrounding area in the
previous frame and attempts to find a match. The matching
area is moved into the current macroblock position by
the motion compensator module. The motion compensated
macroblock is subtracted from the current macroblock.
If the motion estimation and compensation process is
efficient, the remaining "residual" macroblock
should contain only a small amount of information.
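
The block-matching search described above can be sketched as follows. This is only an illustration under stated assumptions: it uses an exhaustive integer-pixel search with a sum-of-absolute-differences cost, whereas a real encoder is free to choose its own search strategy and H.263 motion vectors have half-pixel precision. The names `sad`, `estimate_motion`, `residual_block` and the `search` range are hypothetical, and frames are assumed to be lists of rows of luminance samples.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of cur at (cx, cy)
    and the n x n block of ref at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def estimate_motion(cur, ref, cx, cy, n=16, search=7):
    """Exhaustive search: return (dx, dy, cost) of the best match in ref."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, cx, cy, cx, cy, n))  # start with "no motion"
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


def residual_block(cur, ref, cx, cy, dx, dy, n=16):
    """Subtract the motion-compensated reference block from the current block."""
    return [
        [cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i] for i in range(n)]
        for j in range(n)
    ]


# Synthetic 32 x 32 frames: the current frame shows the reference content
# displaced by 2 samples horizontally and 1 sample vertically.
ref = [[x + 7 * y for x in range(32)] for y in range(32)]
cur = [[ref[min(y + 1, 31)][min(x + 2, 31)] for x in range(32)] for y in range(32)]

# A smaller 8 x 8 block and +/-3 search range keep the example fast.
dx, dy, cost = estimate_motion(cur, ref, 8, 8, n=8, search=3)
residual = residual_block(cur, ref, 8, 8, dx, dy, n=8)
```

With a perfect match the residual block is all zeros, which is the ideal case the text describes: nothing but the motion vector needs to be transmitted for that block.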
Discrete Cosine Transform (DCT)
The DCT transforms a block of pixel values (or residual
values) into a set of "spatial frequency"
coefficients. This is analogous to transforming a time
domain signal into a frequency domain signal using a
Fast Fourier Transform. The DCT operates on a 2-dimensional
block of pixels (rather than on a 1-dimensional signal)
and is particularly good at "compacting" the
energy in the block of values into a small number of
coefficients. This means that only a few DCT coefficients
are required to recreate a recognizable copy of the
original block of pixels.
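
As a rough illustration of energy compaction, the sketch below applies a direct (unoptimized) orthonormal 2-D DCT to an 8x8 block, the block size H.263 uses for its transform. The function name `dct_2d` and the test blocks are assumptions made for this example; practical codecs use fast integer or fixed-point implementations rather than this quadruple loop.

```python
import math

N = 8  # transform block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block, computed directly."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical spatial frequency
        for u in range(N):      # horizontal spatial frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[v][u] = scale(u) * scale(v) * total
    return out


# A perfectly flat block: all of its energy lands in the single DC coefficient.
flat = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)
significant = sum(1 for row in flat_coeffs for c in row if abs(c) > 1e-6)

# A smooth horizontal ramp: the transform preserves the total energy, but
# only the first row of coefficients (zero vertical frequency) is non-zero.
ramp = [[100 + 10 * x for x in range(N)] for _ in range(N)]
ramp_coeffs = dct_2d(ramp)
pixel_energy = sum(p * p for row in ramp for p in row)
coeff_energy = sum(c * c for row in ramp_coeffs for c in row)
```

Real image blocks are not this regular, but smooth regions behave similarly: most coefficients come out small, which is what the quantizer in the next stage exploits.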
Quantization
For a typical block of pixels, most of the coefficients
produced by the DCT are close to zero. The quantizer
module reduces the precision of each coefficient so
that the near-zero coefficients are set to zero and
only a few significant non-zero coefficients are left.
This is done in practice by dividing each coefficient
by an integer scale factor and truncating the result.
It is important to realize that the quantizer "throws
away" information.
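
The divide-and-truncate step can be sketched in a few lines. The coefficient values and the step size of 16 below are made up for illustration, and `quantize` is a hypothetical helper; the actual H.263 quantizer is controlled by a quantizer parameter and treats some coefficients specially, which this sketch does not attempt to reproduce.

```python
def quantize(coeffs, step):
    """Divide each coefficient by an integer step and truncate toward zero."""
    return [[int(c / step) for c in row] for row in coeffs]


# Hypothetical 4 x 4 corner of a DCT coefficient block.
coeffs = [
    [812.0, -45.3, 12.1, 3.9],
    [30.2, -7.5, 2.2, -0.8],
    [9.6, 1.4, -0.5, 0.3],
    [-2.7, 0.9, 0.2, -0.1],
]

levels = quantize(coeffs, 16)
nonzero = sum(1 for row in levels for q in row if q != 0)
```

Of the sixteen input coefficients only three survive as non-zero levels; the rest, and the fractional parts of the survivors, are the information the quantizer throws away.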
Entropy encoding
An entropy encoder (such as a Huffman encoder) replaces
frequently-occurring values with short binary codes
and replaces infrequently-occurring values with longer
binary codes. The entropy encoding in H.263 is based
on this technique and is used to compress the quantized
DCT coefficients. The result is a sequence of variable-length
binary codes. These codes are combined with synchronization
and control information (such as the motion "vectors"
required to reconstruct the motion-compensated reference
frame) to form the encoded H.263 bitstream.
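
To make the idea of variable-length codes concrete, the sketch below builds a Huffman code from the symbol frequencies of a made-up list of quantized levels. This illustrates the general technique only: H.263 itself uses predefined variable-length code tables rather than codes built per sequence, and the helper name `huffman_code` and the sample data are assumptions of this example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}); the
    # unique tie_breaker ensures the dictionaries are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Made-up quantized levels: mostly zeros, as is typical after quantization.
levels = [0] * 50 + [1] * 8 + [-1] * 4 + [2] * 2
code = huffman_code(levels)
bitstream = "".join(code[s] for s in levels)
fixed_length_bits = 2 * len(levels)  # 4 distinct symbols need 2 bits each
```

Because zero dominates the input, it receives a one-bit code, and the encoded sequence is shorter than a fixed two-bits-per-symbol representation of the same data.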
Frame store
The current frame must be stored so that it can be used
as a reference when the next frame is encoded. Instead
of simply copying the current frame into a store, the
quantized coefficients are re-scaled, inverse transformed
using an Inverse Discrete Cosine Transform and added
to the motion-compensated reference block to create
a reconstructed frame that is placed in a store (the
frame store). This ensures that the contents of the
frame store in the encoder are identical to the contents
of the frame store in the decoder (see below). When
the next frame is encoded, the motion estimator uses
the contents of this frame store to determine the best
matching area for motion compensation.
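
The reason for storing the reconstructed frame rather than the original can be seen in a deliberately simplified sketch. It assumes a one-dimensional "frame" of four samples, zero motion, and no transform stage, so only the quantize and rescale steps remain; all function names and values are hypothetical.

```python
def quantize(values, step):
    """Divide by the step and truncate toward zero (lossy)."""
    return [int(v / step) for v in values]


def rescale(levels, step):
    """Multiply the levels back by the same step."""
    return [q * step for q in levels]


def encode_frame(frame, store, step):
    """Return the quantized residual and the updated encoder frame store."""
    residual = [c - p for c, p in zip(frame, store)]
    levels = quantize(residual, step)
    # Rebuild the frame from the quantized data, exactly as the decoder will.
    recon = [p + r for p, r in zip(store, rescale(levels, step))]
    return levels, recon


def decode_frame(levels, store, step):
    """Add the rescaled residual to the decoder's own frame store."""
    return [p + r for p, r in zip(store, rescale(levels, step))]


frames = [[10, 52, 97, 140], [13, 55, 99, 151], [19, 54, 104, 160]]
step = 8
enc_store = [0, 0, 0, 0]
dec_store = [0, 0, 0, 0]
in_sync = []
for frame in frames:
    levels, enc_store = encode_frame(frame, enc_store, step)
    dec_store = decode_frame(levels, dec_store, step)
    in_sync.append(enc_store == dec_store)
```

After every frame the two stores hold the same values, so the encoder always predicts from exactly what the decoder has. If the encoder stored the original frame instead, the quantization error would differ between the two sides and accumulate from frame to frame.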
H.263 Video Decode

Entropy decode
The variable-length codes that make up the H.263 bitstream
are decoded in order to extract the coefficient values
and motion vector information.
Rescale
This is the "reverse" of quantization: the
coefficients are multiplied by the same scaling factor
that was used in the quantizer. However, because the
quantizer discarded the fractional remainder, the rescaled
coefficients are not identical to the original coefficients.
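
A short numerical sketch shows the size of the error that rescaling leaves behind. The coefficient values and the step of 16 are invented for illustration, and `rescale` is a hypothetical helper, not a function defined by the standard.

```python
def rescale(levels, step):
    """Multiply each quantized level by the step used in the quantizer."""
    return [q * step for q in levels]


step = 16
original = [812.0, -45.3, 12.1, 3.9]          # coefficients before quantization
levels = [int(c / step) for c in original]    # what the encoder transmitted
rescaled = rescale(levels, step)              # what the decoder recovers
errors = [o - r for o, r in zip(original, rescaled)]
```

Each rescaled coefficient differs from the original by less than one step, and the two small coefficients are lost entirely: they come back as zero.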
Inverse Discrete Cosine Transform
The IDCT reverses the DCT operation to create a block
of samples: these (typically) correspond to the difference
values that were produced by the motion compensator
in the encoder.
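
The inverse transform can be sketched in the same direct style. The example assumes an orthonormal 8x8 transform and a coefficient block in which only the DC term is non-zero; `idct_2d` is a hypothetical name, and real decoders use fast implementations with defined accuracy requirements.

```python
import math

N = 8  # transform block size


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT of an N x N coefficient block."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):
                for u in range(N):
                    total += (
                        scale(u) * scale(v) * coeffs[v][u]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[y][x] = total
    return out


# Only the DC coefficient survived quantization in this made-up block.
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0] = 800.0
samples = idct_2d(coeffs)
```

A lone DC coefficient of 800 decodes to a flat block in which every sample is 100, which shows how a single transmitted value can stand in for 64 samples.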
Motion compensation
The difference values are added to a reconstructed area
from the previous frame. The motion vector information
is used to pick the correct area (the same reference
area that was used in the encoder). The result is a
reconstruction of the original frame: note that this
will not be identical to the original because of the
"lossy" quantization stage, i.e. the image
quality will be poorer than the original. The reconstructed
frame is placed in a frame store and it is used to motion-compensate
the next received frame.
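
The decoder-side step can be sketched as follows. The example assumes 8-bit samples clipped to the range 0 to 255, integer-pixel motion vectors, and a small 4x4 block for brevity; `reconstruct_block`, `store_block` and the sample data are hypothetical.

```python
def reconstruct_block(ref, residual, x, y, dx, dy):
    """Add the decoded residual to the reference area selected by (dx, dy)."""
    n = len(residual)
    return [
        [
            min(255, max(0, ref[y + dy + j][x + dx + i] + residual[j][i]))
            for i in range(n)
        ]
        for j in range(n)
    ]


def store_block(frame_store, block, x, y):
    """Write a reconstructed block into the frame being assembled."""
    for j, row in enumerate(block):
        frame_store[y + j][x:x + len(row)] = row


# Previous reconstructed frame (synthetic 16 x 16 content).
ref = [[x + 7 * y for x in range(16)] for y in range(16)]

# Decoded difference values for one 4 x 4 block and its motion vector.
residual = [
    [1, -1, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, -3],
]
block = reconstruct_block(ref, residual, 4, 4, dx=2, dy=1)

# The new frame is built separately so that the previous frame stays
# available as a reference for the remaining blocks.
new_frame = [[0] * 16 for _ in range(16)]
store_block(new_frame, block, 4, 4)
```

Once every block has been reconstructed this way, `new_frame` takes over as the reference for the next received frame.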