MPEG Video

The Moving Picture Experts Group (MPEG) has defined a series of standards for compressing motion video and audio signals using DCT (Discrete Cosine Transform) compression, providing a common world language for high-quality digital video. These standards compress individual frames with a JPEG-style DCT algorithm, then eliminate the data that stays the same in successive frames. The MPEG formats are asymmetrical, meaning that it takes longer to compress a frame of video than it does to decompress it, so encoding requires serious computational power to reduce the file size. The results, however, are impressive:

  • MPEG-1 was designed to deliver VHS-quality video at a fixed data rate of about 1.5 Mbit/s so it could play from a regular CD (for the more or less defunct VideoCD format, itself defined by the White Book standard). Published in 1993, the standard supports video coding at bit-rates up to about 1.5 Mbit/s and virtually transparent stereo audio quality at 192 kbit/s, providing 352×240 resolution at 30 fps. The 352×240 picture is typically scaled and interpolated for display. (Scaling causes a blocky appearance when one pixel – scaled up – becomes four pixels of the same colour value. Interpolation blends adjacent pixels by interposing pixels with best-guess colour values.) Most graphics chips can scale the picture for full-screen playback, although software-only half-screen playback is a useful trade-off. MPEG-1 enables more than 70 minutes of good-quality video and audio to be stored on a single CD-ROM disc. Prior to the introduction of Pentium-based computers, MPEG-1 playback required dedicated hardware support. It is optimised for non-interlaced video signals.
  • During 1990, MPEG recognised the need for a second, related standard for coding video at higher data rates and in an interlaced format. The resulting MPEG-2 standard is capable of coding standard definition television at bit-rates from about 1.5 Mbit/s to some 15 Mbit/s. MPEG-2 also adds the option of multi-channel surround sound coding and is backwards compatible with MPEG-1. It is interesting to note that, for video signals coded at bit-rates below about 3 Mbit/s, MPEG-1 may be more efficient than MPEG-2. MPEG-2 has a resolution of 704×480 at 30 fps – four times the pixel count of MPEG-1 – and is optimised for the higher demands of broadcast and entertainment applications, such as DSS satellite broadcast and DVD-Video. At a data rate of around 10 Mbit/s, the latter is capable of delivering near-broadcast-quality video with five-channel audio. Resolution is about twice that of a VHS videotape, and the standard supports additional features such as scalability and the ability to place pictures within pictures.
  • MPEG-3, intended for HDTV, was rolled into MPEG-2.
  • In 1993, work started on MPEG-4, a low-bandwidth multimedia format akin to QuickTime that can contain a mix of media, allowing recorded video images and sounds to co-exist with their computer-generated counterparts. Importantly, MPEG-4 provides standardised ways of representing units of aural, visual or audio-visual content as discrete media objects. These can be of natural or synthetic origin, meaning, for example, that they could be recorded with a camera or microphone, or generated with a computer. Possibly the greatest of the advances made by MPEG-4 is that it allows viewers and listeners to interact with objects within a scene.
  • MPEG-7, formally named Multimedia Content Description Interface, aims to create a standard for describing multimedia content in a way that supports some degree of interpretation of the information’s meaning, which can be passed on to, or accessed by, a device or computer code.
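
The resolutions and bit-rates quoted in the list above can be put in perspective with some back-of-envelope arithmetic. The sketch below is illustrative only: it assumes 12 bits per pixel for the uncompressed picture (8-bit samples with 4:2:0 chroma subsampling), which the text does not state, and it treats each quoted bit-rate as if it were spent entirely on video, whereas in practice part of it carries audio. The function names are made up for this example.

```python
# Rough comparison of raw and coded bit-rates for the figures quoted above.
# ASSUMPTION (not from the text): 12 bits per pixel, i.e. 8-bit samples
# with 4:2:0 chroma subsampling.
BITS_PER_PIXEL = 12


def raw_bitrate(width, height, fps, bits_per_pixel=BITS_PER_PIXEL):
    """Uncompressed video bit-rate in bit/s."""
    return width * height * fps * bits_per_pixel


def compression_ratio(width, height, fps, coded_bitrate):
    """How many times smaller the coded stream is than the raw video."""
    return raw_bitrate(width, height, fps) / coded_bitrate


# Pixel-count ratio between the MPEG-2 and MPEG-1 frame sizes.
pixel_ratio = (704 * 480) / (352 * 240)

# MPEG-1 at 1.5 Mbit/s and MPEG-2 (DVD-Video) at around 10 Mbit/s.
mpeg1_ratio = compression_ratio(352, 240, 30, 1.5e6)
mpeg2_ratio = compression_ratio(704, 480, 30, 10e6)

print(f"MPEG-2 frame has {pixel_ratio:.0f}x the pixels of an MPEG-1 frame")
print(f"MPEG-1: roughly {mpeg1_ratio:.0f}:1 compression")
print(f"MPEG-2: roughly {mpeg2_ratio:.0f}:1 compression")
```

Under these assumptions the MPEG-1 stream is about twenty times smaller than the raw picture data and the 10 Mbit/s MPEG-2 stream about twelve times smaller; different assumptions about sample depth or chroma format would shift both figures.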

MPEG video needs less bandwidth than M-JPEG because it combines two forms of compression. M-JPEG video files are essentially a series of compressed stills: using intraframe, or spatial, compression, M-JPEG disposes of redundancy within each frame of video. MPEG does this too, but also uses another process known as interframe, or temporal, compression, which removes redundancy between video frames. Take two sequential frames of video and you’ll notice that very little changes in a 25th of a second. So MPEG reduces the data rate by recording changes instead of complete frames.
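
The idea of recording changes instead of complete frames can be sketched with plain frame differencing. This is a deliberately simplified illustration, not how MPEG itself codes frames: real encoders predict blocks of pixels with motion vectors and compress what is left over, whereas this toy version just stores the pixels that differ from the previous frame. The frame layout (a flat list of pixel values, all frames the same length) and the function names are assumptions made for the example.

```python
def encode_deltas(frames):
    """Keep the first frame in full, then only the pixels that change.

    Each frame is a flat list of pixel values of equal length.
    Returns (key_frame, deltas), where each delta maps a pixel index
    to its new value relative to the previous frame.
    """
    if not frames:
        raise ValueError("need at least one frame")
    key_frame = list(frames[0])
    deltas = []
    previous = key_frame
    for frame in frames[1:]:
        delta = {
            index: new
            for index, (old, new) in enumerate(zip(previous, frame))
            if old != new
        }
        deltas.append(delta)
        previous = frame
    return key_frame, deltas


def decode_deltas(key_frame, deltas):
    """Rebuild every frame by applying each delta to the frame before it."""
    frames = [list(key_frame)]
    for delta in deltas:
        frame = list(frames[-1])
        for index, value in delta.items():
            frame[index] = value
        frames.append(frame)
    return frames


# A bright pixel moves one step to the right, then stays put.
frames = [
    [0, 0, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0, 0],
]
key_frame, deltas = encode_deltas(frames)
print(deltas)  # [{2: 0, 3: 9}, {}]
assert decode_deltas(key_frame, deltas) == frames
```

Here 24 pixel values shrink to one full frame of 8 values plus 2 changed pixels, and the third frame, being identical to the second, costs nothing at all.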

MPEG video streams consist of a sequence of sets of frames, each set known as a GOP (group of pictures). Each group, typically eight to 24 frames long, has only one complete frame represented in full, which is compressed using only intraframe compression. It’s just like a JPEG still and is known as an I frame. Around it are temporally-compressed frames, representing only change data. During encoding, powerful motion prediction techniques compare neighbouring frames and pinpoint areas of movement, defining vectors for how each will move from one frame to the next. By recording these vectors, along with any remaining differences, rather than complete frames, the data which needs to be recorded can be substantially reduced. P (predictive) frames refer only to the previous frame, while B (bi-directional) frames rely on previous and subsequent frames. This combination of compression techniques makes MPEG highly scalable. Not only can the spatial compression of each I frame be cranked up, but by using longer GOPs with more B and P frames, data rates are pushed even lower.
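
The motion prediction step described above can be illustrated with a minimal block-matching search. This is a sketch under stated assumptions rather than the method of any particular encoder: the MPEG standards leave the search strategy to the encoder, and real encoders work on 16×16 macroblocks with sub-pixel precision. The tiny 2×2 block, the exhaustive search window, the sum-of-absolute-differences cost and all function names are choices made for this example.

```python
def sad(reference, current, ref_x, ref_y, cur_x, cur_y, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(
                reference[ref_y + dy][ref_x + dx]
                - current[cur_y + dy][cur_x + dx]
            )
    return total


def find_motion_vector(reference, current, block_x, block_y, size=2, search=2):
    """Find where the block at (block_x, block_y) in `current` came from.

    Tries every offset within +/- `search` pixels in the reference frame
    and returns (dx, dy, cost) for the best match. The vector points from
    the block's position in the current frame to its match in the
    reference frame.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ref_x, ref_y = block_x + dx, block_y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if ref_x < 0 or ref_y < 0:
                continue
            if ref_x + size > width or ref_y + size > height:
                continue
            cost = sad(reference, current, ref_x, ref_y, block_x, block_y, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(obj_x, obj_y, width=6, height=6):
    """Black frame with a bright 2x2 object whose top-left is (obj_x, obj_y)."""
    frame = [[0] * width for _ in range(height)]
    for dy in range(2):
        for dx in range(2):
            frame[obj_y + dy][obj_x + dx] = 9
    return frame


reference = make_frame(1, 1)  # object at (1, 1) in the earlier frame
current = make_frame(3, 2)    # object has moved to (3, 2)
vector = find_motion_vector(reference, current, block_x=3, block_y=2)
print(vector)  # (-2, -1, 0): a perfect match 2 left and 1 up in the reference
```

A cost of zero means the block can be described by its vector alone; when the match is imperfect, an encoder would also have to store the leftover difference between the two blocks.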