DVD Player question [Archive] - Audio & Video Forums




Brian68
02-13-2004, 06:14 PM
What does Post DRC (dynamic range control) do?

plextor guy
02-13-2004, 07:40 PM
DYNAMIC RANGE CONTROL IN A MULTICHANNEL ENVIRONMENT
N.A.F. Cutmore
BBC, UK


The need for some way of controlling the Dynamic Range of high quality audio programme material when listening under less than perfect conditions is well understood, but are different methods of control advantageous or possible when more than two audio channels are present ? This paper discusses the problems and how some brief tests have helped to identify what control signals may be needed.

This report is one of a series of papers based on the work conducted in the audio group (PG II) of the Eureka project, Advanced Digital Television Technologies (ADTT). The audio group included the following partners: British Broadcasting Corporation (BBC), Philips, Institut für Rundfunktechnik (IRT), Nokia Research Center (NRC) and Bang and Olufsen A/S (B&O).

INTRODUCTION

Until fairly recently, the highest quality sound signals that were available to the domestic consumer came from FM radio. This has a maximum dynamic range of some 60 dB, which although quite large, can still be handled by a reasonably specified domestic sound reproduction system specifically intended for "armchair" listening. However, for the listener in a less than perfect environment, using a portable radio in a noisy kitchen perhaps, the variation in level between quiet and loud sounds can often be troublesome, even with this relatively "easy" material. Because of this, most radio broadcasts are actually quite heavily compressed -- for much popular music, the dynamic range may be reduced to only a few decibels*. The net result is that the true dynamics of the original programme material are lost and cannot be recovered by the serious listener. In effect, from the listener's point of view, the available dynamic range from most radio transmissions has actually decreased over the years.

1. DO WE NEED DYNAMIC RANGE CONTROL ?

In contrast to the situation of radio broadcasts, for almost two decades (that is well before the introduction of the Compact Disc system), the keen domestic listener has (at least in theory) been able to access programme material of extremely high quality, with a dynamic range capability perhaps exceeding 80 dB. The forerunner of the CD, the early laser disc player (using sound modulated onto two FM carriers) could comfortably outperform FM radio, but the high cost of the large discs needed meant that the market was small. Similarly, domestic ("HiFi") video recorders were available which could achieve a comparable performance, though these did not achieve widespread use as a sound only distribution medium, probably due to the relatively high cost of tape duplication. However, it is probably true to say that if lack of dynamic range had been seen by the market as a major problem with existing distribution systems, either one or both of these formats would have been more popular than they have become. There simply seems to be no "mass market" for programme material which has all the dynamics of the original; only the privileged few can afford the luxury of the environment and equipment such sound systems need. In fact, it is probably true that the Compact Disc system has been the success it has become because the discs are smaller and less prone to damage, not because the available dynamic range or quality is greater.

With the arrival of digital radio and digital TV broadcasts, the quality of sound programme material delivered to the home will improve markedly. Most systems will be capable of rivalling CD quality, though it is somewhat unlikely that such dynamic range will be used for "normal" programmes. However, in the case of TV, such items as film soundtracks originally produced for the cinema are likely to need some form of dynamic range reduction for the majority of domestic listeners who are listening under less than perfect conditions.

2. DYNAMIC RANGE CONTROL METHODS

As was noted above, for most existing analogue broadcast services, the dynamic range has already been severely limited before transmission and no further control is likely to be needed. However, this is unlikely to hold for the newer digital services; here the intention is to transmit the programme material in its original "uncompressed" form and rely on the listener's receiver to process it as required. In principle, it is quite straightforward to reduce the dynamic range of an audio signal by simply increasing the gain during quiet passages and decreasing it during loud parts. However, an important part of the programme content is often the "dynamics" of the sound, that is the balance between the loud and quiet passages as intended by the producer. A simple volume levelling "Automatic Gain Control" (AGC) will destroy this level relationship. What is really needed is a system which can adjust the gain in a more intelligent fashion. When a sound balancing engineer mixes a programme, he/she will tend to "look ahead" and make gain adjustments before they are required, to prevent over/under modulation. The gain changes are made relatively slowly so any sudden change in level is preserved. The amount of "look ahead" required for best results depends on the programme's contents and can be as high as several seconds for classical music, although acceptable results can usually still be achieved with much shorter times.

In terms of circuitry, a "lookahead" requires adding an undesirable audio delay in the receiver, as well as significantly adding to the complexity and cost (see Figure 1).



Fig. 1 - Receiver block diagram with Dynamic Range Control derived locally.
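
As a rough illustration of the "look ahead" idea, the sketch below works on a list of block levels in dB and starts reducing the gain before a loud passage arrives, moving by only a small step per block so that sudden level changes are preserved. It is a minimal sketch, not the method used in the paper: the function name, the target level, the window length and the step size are all assumptions made for the example, and it only ever reduces gain (it does not raise quiet passages).

```python
def lookahead_gains(levels_db, target_db=-12.0, lookahead=5, max_step_db=0.5):
    """Return one gain value (in dB) per level block.

    levels_db   : measured level of each block, in dB (assumed input format)
    target_db   : hypothetical level that loud passages are brought down to
    lookahead   : number of future blocks examined (this is the added delay)
    max_step_db : largest gain change allowed per block (slow gain movement)
    """
    gains = []
    current = 0.0
    for i in range(len(levels_db)):
        # Loudest level in the window: the current block plus `lookahead` future blocks.
        window_peak = max(levels_db[i:i + lookahead + 1])
        # Gain reduction needed so the upcoming peak does not exceed the target.
        wanted = min(0.0, target_db - window_peak)
        # Move towards the wanted gain slowly, limited to max_step_db per block.
        step = max(-max_step_db, min(max_step_db, wanted - current))
        current += step
        gains.append(current)
    return gains


if __name__ == "__main__":
    levels = [-30.0] * 10 + [-6.0] * 5 + [-30.0] * 10
    for level, gain in zip(levels, lookahead_gains(levels, max_step_db=1.0)):
        print(f"level {level:6.1f} dB  gain {gain:5.1f} dB")
```

With a window of five blocks and a step of 1 dB, the gain begins to fall five blocks before the loud passage and has reached its full reduction by the time that passage starts. The cost is the delay of `lookahead` blocks that the receiver has to add to the audio path.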

However, a simpler method of generating the gain control information exists. For most new digital broadcast systems, use is made of bit reduction techniques which involve significant coding delays. If the broadcaster analyses the dynamics of the audio before it is coded, it is possible to send an additional "helper" signal along with the digital audio which "predicts" the gain that will shortly be required. This allows the receiver to modify the dynamic range of the reproduced audio in a smooth manner (see Figure 2).



Fig. 2 - Receiver block diagram with Dynamic Range Control derived by broadcaster.

For the serious listener in a good environment, this "helper" signal is not used and the full dynamic range of the original material is reproduced, but for less fortunate souls sensible compression which can maintain the "dynamics" of the original programme is available at the flick of a switch. This system has been implemented for stereo signals in the BBC's Digital Audio Broadcasting system [1] and provision has been made in the most recent MPEG-2 standard (13818-3.2) [2] for such a gain control signal. Other digital delivery systems (such as Dolby AC3) [3] can also include such gain control information. This method has the additional advantage that any improvements which are made to the dynamic range control algorithm only require an update of the broadcaster's equipment, not a software update of every individual receiver.

It is also sometimes possible to derive a coarse indication of the level of an audio signal when it is in bitrate reduced digital form by analysis of "scale factors" embedded in the bitstream. These are gain values by which the samples within a coded subband will be multiplied, and indirectly provide an indication of the audio level within that subband. Since these gain values are available before the audio is decoded, they do provide a short "lookahead" of typically one coding frame duration (normally about 24 milliseconds for an MPEG bitstream). In terms of receiver complexity, there is some doubt as to whether processing of these values to provide dynamic range control of the final audio signal is likely to be more complicated than extraction of the "helper" signal described above. This system does however have the advantage that it could be used on all signal sources (including "local" ones such as Digital Versatile Disc).
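
The two numbers in this paragraph can be illustrated with a short sketch. It assumes a frame of 1152 samples at a 48 kHz sampling rate (which gives the 24 ms quoted above) and assumes the scale factors have already been extracted from the bitstream as linear gain values, one per subband. The function names are hypothetical, and no real decoder API is implied.

```python
import math


def frame_duration_ms(samples_per_frame=1152, sample_rate_hz=48000):
    """Duration of one coding frame in milliseconds (default values are assumptions)."""
    return 1000.0 * samples_per_frame / sample_rate_hz


def coarse_level_db(scale_factors):
    """Coarse level estimate for one frame from its linear scale factors.

    The largest scale factor bounds the sample magnitudes in that frame, so it
    serves as a rough peak indication before any audio has been decoded.
    """
    peak = max(scale_factors)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


if __name__ == "__main__":
    print(frame_duration_ms())                # look-ahead of one frame, in ms
    print(coarse_level_db([0.05, 0.5, 0.2]))  # rough level of one frame, in dB
```

Because the estimate is available one frame before the decoded audio, a gain computed from it has roughly one frame of "lookahead" for free.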

Although the gain changing operation in the diagrams above is shown as a simple multiplication operation, it may usefully be made more complicated. Rather than using the dynamic range control gain figure directly (as indicated in the "helper" signal or locally derived), it can be scaled under user control to provide a "compression amount" function. Also, for small portable applications, there may be some advantage in coupling automatically the amount of compression applied to the actual volume control; that is, as the volume is increased, the amount of compression applied increases. This allows the apparent loudness to increase without creating distortion by overloading the loudspeaker or amplifier.
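
The "compression amount" and volume-coupling ideas above can be sketched as follows. This is only an illustration under stated assumptions: the DRC gain is taken to be a value in dB, the amount is a user setting between 0 and 1, and the linear coupling to the volume control (with its `start` point) is a hypothetical choice, not something specified in the paper.

```python
def scaled_gain_db(drc_gain_db, amount):
    """Scale the DRC gain (dB) by a user 'compression amount' between 0 and 1."""
    amount = max(0.0, min(1.0, amount))
    return drc_gain_db * amount


def amount_from_volume(volume, start=0.5):
    """Hypothetical coupling of compression amount to the volume control.

    Below `start` no compression is applied; from there the amount rises
    linearly to full compression at maximum volume (volume = 1.0).
    """
    volume = max(0.0, min(1.0, volume))
    if volume <= start:
        return 0.0
    return (volume - start) / (1.0 - start)


def db_to_linear(gain_db):
    """Convert a gain in dB to the linear multiplier applied to the samples."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    helper_gain_db = -6.0  # example value, as might arrive in a "helper" signal
    for volume in (0.25, 0.75, 1.0):
        g = scaled_gain_db(helper_gain_db, amount_from_volume(volume))
        print(volume, g, db_to_linear(g))
```

Scaling in the dB domain keeps the shape of the broadcaster's gain curve and simply makes it shallower, so an amount of 0 reproduces the full dynamic range and an amount of 1 applies the gain exactly as signalled.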

3. MULTICHANNEL CONSIDERATIONS

At a first glance, dynamic range control of a multichannel system might seem unnecessary; any listener who is keen enough to install such a sound system is surely going to be able to listen under the best of conditions. However, there will always be a significant number of domestic listeners who cannot do this because, for example, of their close proximity to neighbours or because they are listening late at night. In addition, there will be those in cars and those who are constrained to using simpler systems which attempt to produce a multichannel presentation from just frontal loudspeakers (an area experiencing quite some growth at the moment). There will also be those who suffer hearing impairments and require a reduction in dynamic range simply for intelligibility or comfort. So the problem does exist.

The question is, do the rules for dynamic range control change between stereo and multichannel presentations ? Obviously, for stereo, any gain changes applied to the audio must be applied to both channels equally to avoid any phantom image shifts, so only one gain control signal will be required. In this case, simply choosing the louder of the two channels as the one which controls the gain is appropriate. But what is needed when there are more than two channels ? A typical multichannel system will have five main channels and possibly an additional Low Frequency Effects (LFE) channel to carry high amplitude low frequency information. The five main channels will be arranged as a stereo pair with a centre "fill-in" channel and two rear "surround" loudspeakers for ambience and effects (see Figure 3). Is there now a benefit in controlling the channels independently ? The presence of phantom images between the loudspeakers (whether produced by panning or direct microphone pickup) is still the determining factor when deciding whether to control the dynamic range of each channel individually, so the type of programme material has to be considered when making the decision. A number of different cases will be used to illustrate this.



Fig. 3 - Typical loudspeaker setup.

3.1. Case 1 - Full surround recording in an ambient atmosphere

In this type of recording, the centre channel has been used to increase the area of the "sweet spot"; that is where the listener must be situated to obtain a stable sound image. There will certainly be sound images which are positioned across the front of the soundstage between the main three loudspeakers, meaning that the front three channels must all be controlled in gain together or image shifts will occur. There will also be very significant correlation between the sound from the front and rear channels; this is how the listener detects the "ambience" in the programme. Changing the relative gain of the front and rear channels will cause the listener to "move" their perceived position in the soundstage and must be avoided. In this case, there can only be one dynamic range control signal.

Even with only one gain control, there are still possible problems. For instance, a rapid gain reduction will successfully control a transient in the front channels, but the resultant reverberation in the rear channels may be accentuated if the gain recovers too quickly. The gain control time constants must be carefully chosen to avoid this effect (the long "lookahead" system described in section 2 will tend to help here). This problem does, of course, also exist in stereo (and mono) programmes, but may be more obvious in a multichannel presentation because of the physical separation between the sound of the transient and that of the reverberation.

3.2. Case 2 - Film soundtrack with separate dialogue channel

In some film soundtracks, the centre channel is used solely for dialogue and no speech is mixed into the other front channels. If it is also the case that none of the music or effects on the left and right front channels appears in the centre channel, then there is no correlation between the channels and there is no reason why a separate dynamic range control signal cannot be derived for the centre channel. However, use of such a signal would change the balance between speech and music/effects that was intended by the producer of the programme and this may be undesirable.

Again, the ability to control the rear channels separately depends on whether there is correlation between the signals fed to the front and rear loudspeakers and this is likely to depend on the programme material. Consider an example -- a group of people are out in the open air, when they are attacked by a low-flying aeroplane. Loud explosions are heard in the front and rear channels, but the effects are relatively uncorrelated since the environment produces few echoes. The people run into a large cavern for shelter, where the sound of their voices is now heard to reverberate all around the listener. The programme is now of the same type as considered in case 1 above and the requirements have changed. Where the programme is delivered using a system which can signal separate dynamic range control information for each channel, it is possible to see how this problem could be solved during production, but the solution is complex. In other words, without some form of constantly updated indication of the programme material type, it is impossible to determine how many dynamic range control signals can be used.

3.3. Case 3 - Post-produced stereo recording with surround "effects" only

An example of this type of recording might be a radio play, where some of the voices are positioned to the rear of the soundstage for dramatic effect. There is little, if any, ambient acoustic associated with them. This probably represents the only case in which separate control of the rear channels is possible. But again, the "balance" between music, dialogue and effects will be altered by the use of more than one dynamic range control signal.

4. TEST STRATEGY

To see if the use of more than one dynamic range control signal does offer any useful advantages, some limited tests were carried out. Several short multichannel excerpts each about 30 seconds long (mostly already existing MPEG test items used for codec evaluation) were processed on a workstation to provide dynamic range control data. For the five main channels (the LFE was not considered in these tests), the audio in digital form was processed using a software simulation of a standard Peak Programme Meter (PPM) to give five data files which indicated the channel level every 10 milliseconds. (The software used was similar to that used in the BBC DRACULA short delay compressor [4, 5]). These five "level" files were then combined in various different ways to produce "control" signals which were used to adjust the audio gains of each channel. Depending on the dynamic range control algorithm under test, there could be more than one of these gain control signals; for experimental simplicity, no more than two were ever used. Three different algorithms were tested :

One overall gain control for all five channels ("all").
One gain control for the centre front (dialogue) channel and one (different) gain control for the other four channels ("centre").
One gain control for the front channels and a (different) gain control for the rear channels ("separate" front and rear).

The three different versions (called "a", "c" and "s") of the resulting five channel compressed audio were then transferred to a hard-disk based editing system. Ten different audio test items, chosen to have different types of audio "perspective", were processed in this way. A brief description of each of these follows to indicate the range of material which was examined :


Station Master
Male voice on the centre channel (commentary) with steam railway "atmosphere" on the other four channels,

Thalheim
Piano (front left), Clarinet (front right) and female singer (centre), with "atmosphere" in the rear channels,

Rock Fiddle
Live music in the front channels, with audience reaction and room "atmosphere" in the front and rear channels,

Party Talk
On stage voices in the front channels, live audience in the front and rear channels and "production talkback" in the rear left channel,

Tennis
Part of the Wimbledon Tennis finals, with the commentator (panned centre/left), the umpire (panned centre/right) and audience in the front and rear channels,

Indiana
Male voice in the centre channel, with orchestral music in the front and rear channels (film soundtrack),

House of Elliot
Drama with three voices (centre, panned centre/left and panned centre/right) with a restaurant "atmosphere" in the front and rear channels,

Train
Steam train passing under a bridge on which the listener is standing (discrete five channel recording),

Clarinet Theatre
Theatre foyer "atmosphere" with a clarinet in the centre channel and rain on windows in the rear channels.
The hard disk editing system was then used to transfer the test items to TASCAM DA88 digital audio tapes. Four tapes were made, each of which contained five of these ten items processed using the three algorithms described above. For each of the items, the three versions ("a", "c" and "s") were presented (in randomised order), followed by a 10 second pause for the listener to consider what they had heard. Then the three versions were repeated (in the same order as before) and a further 20 second pause allowed the listener to record their preferences. This sequence was repeated for another four items to produce a complete tape. The order of the test items and the order of the dynamic range control processing algorithms was randomised but, in total, each listener heard the same four tapes (though not necessarily in the same order) and each programme item was presented twice (with two different orderings of the three algorithms). Each tape lasted about 15 minutes and listeners had several hours between sessions.

5. DETAILS OF PROCESSING USED

As mentioned above, the programme items in this experiment were compressed according to a "PPM" value which was calculated by analysing the audio signals themselves. The amount of compression applied overall was relatively modest, with a maximum gain change of some 8 decibels (see Figure 4).



Fig. 4 - Compression law used in DRC method tests.
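
Since the figure is not reproduced here, the exact compression law is unknown. The sketch below only illustrates the general idea of mapping a measured level to a gain whose magnitude never exceeds about 8 dB. The straight-line law, the pivot level and the 2:1 ratio are assumptions chosen for the example, not values taken from the paper.

```python
def compression_gain_db(level_db, pivot_db=-20.0, ratio=2.0, max_change_db=8.0):
    """Gain (dB) for a given measured level under a hypothetical straight-line law.

    Levels above the pivot are attenuated and levels below it are raised;
    the result is clamped so the gain change never exceeds max_change_db.
    """
    # Output level the law asks for: a line of slope 1/ratio through the pivot.
    target_db = pivot_db + (level_db - pivot_db) / ratio
    gain_db = target_db - level_db
    return max(-max_change_db, min(max_change_db, gain_db))


if __name__ == "__main__":
    for level in (-60.0, -30.0, -20.0, -10.0, 0.0):
        print(f"{level:6.1f} dB in -> gain {compression_gain_db(level):5.1f} dB")
```

The clamp is what keeps the compression "relatively modest": however quiet or loud the programme gets, the gain moves by no more than the stated maximum.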

For a normal stereo signal, it is usual to compress both audio channels by the same amount (to avoid shifting "phantom" images), so the level of the louder channel is used to control the gain. This is shown pictorially in Figure 5.



Fig. 5 - Stereo DRC processing (as used in DAB).
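
The stereo rule described above (the louder channel controls a single gain applied to both channels) can be sketched in a few lines. The level lists are assumed to hold one value in dB per measurement block, and `simple_law` is a hypothetical stand-in for the real compression law.

```python
def simple_law(level_db, target_db=-12.0):
    """Hypothetical law: attenuate anything above target_db, leave the rest alone."""
    return min(0.0, target_db - level_db)


def stereo_linked_gains(left_db, right_db, law=simple_law):
    """One shared gain per block, controlled by the louder of the two channels."""
    return [law(max(l, r)) for l, r in zip(left_db, right_db)]


def apply_gain(samples, gain_db):
    """Apply a gain in dB to a block of samples (the same gain for both channels)."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


if __name__ == "__main__":
    left = [-30.0, -5.0, -20.0]
    right = [-10.0, -40.0, -20.0]
    print(stereo_linked_gains(left, right))
```

Because both channels always receive the same gain, the level ratio between them is unchanged and phantom images stay where they were placed.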

Similar processing was used for the three compressed five channel versions as mentioned previously. Figures 6, 7 and 8 show the compression method for versions "a" (one overall gain signal), "c" (separate control of centre channel) and "s" (separate control of front and rear channels). In all cases, the compression law used was the same.



Fig. 6 - Processing for DRC method "a".


Fig. 7 - Processing for DRC method "c".



Fig. 8 - Processing for DRC method "s".
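
The three methods differ only in how the five channels are grouped, which a short sketch makes concrete. It assumes, by analogy with the stereo case, that the loudest channel in each group controls that group's gain; the paper says only that the level files were "combined in various different ways", so this is a guess. The channel labels and the stand-in law are also hypothetical.

```python
# Channel labels are illustrative: left, centre, right, left surround, right surround.
GROUPINGS = {
    "a": [("L", "C", "R", "Ls", "Rs")],      # one overall gain
    "c": [("C",), ("L", "R", "Ls", "Rs")],   # centre controlled separately
    "s": [("L", "C", "R"), ("Ls", "Rs")],    # front and rear controlled separately
}


def simple_law(level_db, target_db=-12.0):
    """Hypothetical law: attenuate anything above target_db, leave the rest alone."""
    return min(0.0, target_db - level_db)


def group_gains(levels_db, method, law=simple_law):
    """Return a gain (dB) per channel for one measurement block.

    levels_db maps each channel label to its level in dB. Every channel in a
    group receives the gain derived from the loudest channel of that group.
    """
    gains = {}
    for group in GROUPINGS[method]:
        control_level = max(levels_db[ch] for ch in group)
        gain = law(control_level)
        for ch in group:
            gains[ch] = gain
    return gains


if __name__ == "__main__":
    block = {"L": -20.0, "C": -6.0, "R": -25.0, "Ls": -40.0, "Rs": -35.0}
    for method in ("a", "c", "s"):
        print(method, group_gains(block, method))
```

In this example a loud centre channel pulls all five channels down under method "a", while under "c" and "s" the quiet rear channels are left alone.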


6. STRUCTURE OF TESTS

As there is (as yet) no widely used proven "correct" way of carrying out multichannel dynamic range control, these tests were only intended to find out if different methods of processing would give widely differing results with different types of programme material. From the broadcaster's point of view, the ideal situation would be for one method of compression to suit all types of programme material; these tests were intended to see if this apparent assumption [6] could be proved one way or the other.

Since the object of the tests was to find out what could be heard when using differing methods of dynamic range control, the listeners were not informed of the processing methods used, but they were told that each of the three versions had "been processed using the same amount of compression". They were asked to place the three versions in order of preference by giving each a mark out of 10 (on an arbitrary scale where "1" meant "totally unacceptable" and "10" meant "completely acceptable") and "not to mark all three versions with exactly the same score unless they really had no preference". They were told that there were three different methods of compression under test and that the order of presentation of the three methods had been randomised, so they could not assume that the first version they heard in a block had always been compressed in the same way. They were also reminded that none of the versions was "a reference" and that this was "a purely comparative test".

A total of nine listeners took part in the tests, with backgrounds ranging from "interested amateur" to "skilled listener". The tests took place over a period of two days, in a controlled listening environment using 5 BBC LS5/9 loudspeakers arranged in the standard positions of 0°, ±30° and ±110° with respect to the listener.

7. DISCUSSION OF RESULTS

Since the scale the listeners had been asked to mark to was completely arbitrary, it was decided that the actual numeric values they had given in their responses were probably not a particularly useful measure to use to grade the results. As the request to the listeners had been "to place each of the three versions in order of preference", this was the first analysis done. Figure 9 shows the number of "first" choices; that is where the listener expressed a preference for one method of compression over the other two. "Equal firsts" are not included in this chart. It can be seen that, for most of the programme items, the listeners did not appear to prefer one compression method over another. However, in the cases of the "Station Master", "House of Elliot" and "Clarinet Theatre" the listeners appear to favour a different method in each case.



Fig. 9 - Number of expressed "first" choices.

Since part of the object of the experiment had also been to find out if different compression methods produced either "good" or "bad" results, a chart (Figure 10) was also made of expressed "last" choices; that is, where the listener marked one method of compression lower than the other two. Again, "equal lasts" where the listener gave the same mark to two compression methods were excluded. This similarly shows no particular method as being the worst overall (though "Station Master" and "Clarinet Theatre" do show one method in each case is particularly disliked).



Fig. 10 - Number of expressed "last" choices.



Fig. 11 - Difference between "first" and "last" choices.

As a final measure, a further chart (Figure 11) was produced which shows the difference between the expressed "first" and "last" choices; that is the difference between Figures 9 and 10. If the listeners had really had no overall preference for any one particular compression method, this last chart would be expected to show numbers around zero for each of the bars (as indeed it does for "Party Talk" and "Thalheim", for example). However, it is clear that there is an overall best method of compression for some of the items (method "a" for "Station Master", method "s" for "House of Elliot" and method "c" for "Clarinet Theatre"). It is also clear that none of the three methods of dynamic range control examined here works well for all of the programme items tested.
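
The counting behind the three charts can be sketched as below. Each response is assumed to be a set of three marks, one per method, for one presentation of one item. The scores in the example are invented purely to exercise the code and are not results from the tests.

```python
from collections import Counter

METHODS = ("a", "c", "s")


def first_last_tally(responses):
    """Count unique 'first' and 'last' choices and their difference per method.

    responses is a list of dicts mapping method name to the mark given.
    Ties for best ("equal firsts") or worst ("equal lasts") are not counted.
    """
    firsts, lasts = Counter(), Counter()
    for scores in responses:
        ranked = sorted(scores.items(), key=lambda item: item[1])
        # A "first" counts only if the top mark is strictly above the next one.
        if ranked[-1][1] > ranked[-2][1]:
            firsts[ranked[-1][0]] += 1
        # A "last" counts only if the bottom mark is strictly below the next one.
        if ranked[0][1] < ranked[1][1]:
            lasts[ranked[0][0]] += 1
    difference = {m: firsts[m] - lasts[m] for m in METHODS}
    return firsts, lasts, difference


if __name__ == "__main__":
    invented = [
        {"a": 8, "c": 5, "s": 5},   # clear first (a), equal lasts
        {"a": 7, "c": 7, "s": 3},   # equal firsts, clear last (s)
        {"a": 6, "c": 6, "s": 6},   # no preference at all
        {"a": 4, "c": 9, "s": 6},   # clear first (c) and clear last (a)
    ]
    firsts, lasts, difference = first_last_tally(invented)
    print(dict(firsts), dict(lasts), difference)
```

A difference near zero for every method suggests no overall preference for that item, while a strongly positive or negative value marks a method that was clearly liked or disliked.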

8. CONCLUSIONS

It is dangerous to draw too many conclusions from what has been a fairly limited set of trials. But, judging by the comments the listeners made after the tests, applying one dynamic range control signal derived from all five channels can easily result in unpleasant audible "gain pumping" artefacts. What works for stereo does not seem to be universally applicable for five channel signals. (Of course, it is possible to produce "pumping" by poor choice of the time constants used when deriving the gain control signal; what the listeners did not like was mainly the gain reduction caused in the rear channels by loud signals at the front). It can also be seen from the results that none of the three compression methods used in these tests can be considered "universal" enough to be recommended for use with all types of programme material. At the very least, it would seem prudent to ensure that any future systems for the delivery of multichannel audio programmes to listeners have provision for more than one dynamic range control signal.

9. ACKNOWLEDGEMENTS

The author would like to thank the BBC for permission to publish this paper (and his long suffering colleagues for patiently listening to the same test items so many times without complaint).

10. REFERENCES

[1] "Radio Broadcast Systems; Digital Audio Broadcasting (DAB) to Mobile, Portable and Fixed Receivers," ETS 300 401, European Telecommunications Standards Institute, Sophia Antipolis (1995 Feb.).
[2] ISO/IEC DIS 13818-3.2, "Generic Coding of Moving Pictures and Associated Audio: Audio Part" (1997 Feb.).
[3] C.C. Todd et al., "AC-3: Flexible Perceptual Coding for Audio Transmission and Storage," presented at the 96th Convention of the Audio Engineering Society, preprint 3796.
[4] N.H.C. Gilchrist, "DRACULA: Dynamic Range Control for Broadcasting and Other Applications," BBC Research and Development Report no. 1994/13 (1994).
[5] A.J. Mason, A.K. McParland, N.H.C. Gilchrist, "Unobtrusive Compression of Dynamic Range," presented at the 93rd Convention of the Audio Engineering Society, preprint 3433.
[6] W. Hoeg, "Dynamic Range Control (DRC) for Multichannel Audio Systems," presented at the 102nd Convention of the Audio Engineering Society, preprint 4434.


--------------------------------------------------------------------------------

* In many cases, this is done because it is felt that a listener "surfing the band" to pick out a station to listen to will choose "the one that sounds loudest". Hence the massive market in Audio Compressor/Limiters to increase the subjective volume of the programme without causing over-deviation.