Digital Video Format

In the late 1990s a new generation of entirely digital cameras and camcorders emerged, and with it a new video format, Digital Video (DV). The DV cassette is a small, metal-oxide tape, which is about three-quarters the size of a DAT, and confers the significant advantage of allowing the entire video processing cycle to remain within the digital domain. Instead of having to be funnelled through a process of analogue-to-digital conversion by a traditional video capture card, DV footage – already in a compressed digital format – can simply be downloaded to a PC in real-time with no loss of quality.

Panasonic and Sony were the first to use the DV standard on their camcorders and, though it wasn’t originally intended as a professional format, both companies subsequently announced their own extensions to the standard – Panasonic with DVCPRO in 1995, and Sony with DVCAM in 1996. However, in common with just about every other maker of digital camcorders, both manufacturers have stuck to the MiniDV format for their digital consumer equipment. The DV format uses 1/4in (6.35mm) metal-evaporated tapes, capable of recording up to three hours of video in SP (standard play) mode on cassettes which measure 125x78x14.6mm. A major advantage of the MiniDV format is that since the tapes are very small – 1/12th the size of a standard VHS tape at 66x48x12.2mm – the cameras that use it can be incredibly small too. MiniDV can record an hour in standard format or up to 90 minutes of lower quality output in LP (long play) mode at horizontal resolutions of up to 500 lines.

Technically speaking, DV is the summit of the industry’s research into video compression and, in particular, Discrete Cosine Transform (DCT) coding. It is an intraframe rather than interframe compression technique, using a three-stage process to compress data – each frame being compressed on an individual basis rather than being compared to adjacent frames. The first stage uses DCT compression, a lossy technique which strips away information that cannot be seen by the human eye. It then separates the information from each pixel into brightness and colour and then samples this, favouring brightness over colour, which gives a colour representation that’s acceptable to the human eye but cuts down the data by a third. This is achieved by converting the RGB colour information for each pixel into a YUV colour space – Y for brightness, and U and V for colour. The Y value is sampled four times, the U and V twice, this formula being described as YUV 4:2:2. The video then gets further reduced as the DV codec optimises the formula to YUV 4:2:0, bunching colour information from adjacent pixels in 2×2 blocks. Again, it’s a trade-off, but the human eye finds subtle variations in colour hard to detect, so in well-lit natural surroundings the difference is imperceptible. Finally, the hardware compression system on the camera compresses the video down further using an algorithm similar to M-JPEG.
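The colour-sampling arithmetic above can be checked with a short Python sketch. It is illustrative only, not the DV codec's implementation: the function names are hypothetical, and the RGB-to-YUV coefficients are the common BT.601 luma weights with the classic analogue YUV scale factors, which the text does not specify. Sample counts are for a block four pixels wide and two rows high, the reference region behind the 4:x:x notation.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to (Y, U, V).

    Assumption: BT.601 luma weights and analogue YUV scale factors;
    the article does not say which coefficients DV equipment uses.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # brightness
    u = 0.492 * (b - y)                      # blue colour difference
    v = 0.877 * (r - y)                      # red colour difference
    return y, u, v


def samples_per_block(scheme):
    """Samples stored for a 4-pixel-wide, 2-row block under J:a:b sampling."""
    j, a, b = scheme
    luma = j * 2            # Y is kept for every pixel in both rows
    chroma = 2 * (a + b)    # U and V: a samples in row one, b in row two
    return luma + chroma


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


full = samples_per_block((4, 4, 4))       # 24 samples, no subsampling
yuv422 = samples_per_block((4, 2, 2))     # 16 samples, one third fewer
yuv420 = samples_per_block((4, 2, 0))     # 12 samples, half of the original
```

Going from 24 to 16 samples is the "cut by a third" described for 4:2:2, and 4:2:0 halves the original total. In the 4:2:0 case the 8 brightness samples are accompanied by only 4 colour samples in all.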

DV differs from M-JPEG by being able to compress different parts of each frame to different ratios. So, the blue sky in an image backdrop can be brought down to, say, 25:1, while the complex forest in the foreground, which needs more detail, is reduced to only 7:1. In this way DV can optimise its video stream frame by frame. M-JPEG, by contrast, has to have a fixed compression rate for the whole video and can’t intelligently balance the compression of each image, resulting in more artefacts. DV also employs a technique known as adaptive interfield compression, in which the pair of interlaced fields that make up a frame (as used by PAL, for example) are compressed together if little difference between them is detected. In theory this means that scenes with less movement are handled better than fast action scenes, although in practice it’s difficult to observe any perceptible difference.
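The idea of spending fewer bits on flat areas than on detailed ones can be sketched as follows. This is a toy illustration under stated assumptions, not the DV codec's actual classification scheme: the function name, the use of pixel variance as the measure of detail, and the `busy_variance` threshold are all hypothetical. Only the 25:1 and 7:1 figures come from the example above.

```python
from statistics import pvariance


def pick_ratio(block, flat_ratio=25.0, busy_ratio=7.0, busy_variance=1000.0):
    """Choose a compression ratio for one block of brightness values.

    Flat blocks (variance near zero) get the high ratio; blocks whose
    variance reaches busy_variance (a hypothetical threshold) get the
    low ratio; anything in between is interpolated linearly.
    """
    pixels = [p for row in block for p in row]
    activity = min(pvariance(pixels) / busy_variance, 1.0)
    return flat_ratio + (busy_ratio - flat_ratio) * activity


# A uniform "sky" block and a high-contrast "forest" block (checkerboard).
sky = [[200] * 8 for _ in range(8)]
forest = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]

sky_ratio = pick_ratio(sky)        # 25.0, compressed hard
forest_ratio = pick_ratio(forest)  # 7.0, more detail retained
```

A fixed-rate scheme would in effect return the same ratio for both blocks, which is the limitation the paragraph attributes to M-JPEG.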

The DV standard also supports PCM (pulse code modulation) stereo, giving CD-quality 16-bit audio. Alternatively, 12-bit mode can be used to record two pairs of audio tracks – one for stereo sound recorded at the time of the video and one for music or narration added later. The net result is that DV video information is carried in a nominal 25 Mbit/s data stream – which increases to 36 Mbit/s when audio and the various control and error correction data are taken into account.
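A few lines of Python make these data rates concrete. The 25 and 36 Mbit/s figures come from the text. The audio sample rates (48 kHz for 16-bit mode, 32 kHz for 12-bit mode) are not stated above and are an assumption based on the usual DV audio modes. The helper names are hypothetical.

```python
def audio_mbit_per_s(sample_rate_hz, bits, channels):
    """Uncompressed PCM audio rate in Mbit/s."""
    return sample_rate_hz * bits * channels / 1_000_000


def gigabytes_per_hour(mbit_per_s):
    """Storage needed for one hour at a given rate (decimal gigabytes)."""
    bits_per_hour = mbit_per_s * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


# Assumed sample rates: 48 kHz in 16-bit mode, 32 kHz in 12-bit mode.
stereo_16bit = audio_mbit_per_s(48_000, 16, 2)   # one stereo pair
quad_12bit = audio_mbit_per_s(32_000, 12, 4)     # two stereo pairs

video_only = gigabytes_per_hour(25)   # nominal video stream
on_tape = gigabytes_per_hour(36)      # with audio, control and error correction
```

Under these assumptions both audio modes come to the same 1.536 Mbit/s, so the second stereo pair is paid for with reduced bit depth and sample rate rather than extra bandwidth. An hour of the 25 Mbit/s video stream works out to roughly 11 GB.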

DV’s principal problem is that, unlike MPEG-2, it isn’t scalable. It was designed for recording to tape with a fixed 25 Mbit/s data rate. This, and its limited colour capacity (4:2:0 or 4:1:1, meaning that there’s half as much colour information as brightness), mean many consider it unsuitable for professional post-production. For non-linear editing (NLE), the data rate is too high for off-line editing and too low for high-end effects and graphics-heavy work. The launch of Panasonic’s DVCPRO50 in 1998 – which doubled the data rate to 50 Mbit/s and expanded the colour sampling to a professional 4:2:2 – extended DV’s application to the higher end. JVC’s Digital-S (or D9) format records an identical 50 Mbit/s DV bitstream to VHS-sized cassettes. The quality of both formats has been compared to Digital Betacam, yet at the time they were less than half the price.

At the start of the new millennium a split had appeared in broadcasting, with the DV formats and MPEG-2 sitting on opposite sides. However, it appeared that a resolution was on the horizon. DV and ProMPEG are very similar DCT-based, I-frame-only schemes and it was only a matter of time before someone built silicon to support them all. C-Cube and Matrox were the first to oblige, launching a codec chip in Matrox’s DigiSuite DTV video card, which supports DV25 and DV50 as well as MPEG-2.

However, DCT isn’t the last word in compression, just a standard whose early development and suitability for real-time codec chips attracted attention at the right time. Other, currently under-developed technologies do promise better pictures at lower data rates. They include wavelet and fractal algorithms. The former has already been implemented in silicon and has the twin advantages of being moderately more efficient than M-JPEG and of degrading more naturally, with images appearing grainy rather than blocky at higher compression levels. These alternatives are unlikely to overthrow DCT in broadcasting and consumer electronics. But in the broadband network delivery systems which are likely to replace traditional broadcasting over the next decade, it’s a different story. With increasingly powerful CPUs becoming commonplace, appropriate real-time decoder software can be delivered with the content.

Many companies are beginning to offer IP-based streaming solutions for video-on-demand across company intranets. Although MPEG is still the dominant technology, some suppliers have demonstrated other techniques that can stream VHS-quality video in as little as 512 kbit/s – well within the scope of the cable and ADSL broadband solutions expected to emerge over the coming years.