Surround sound is a technique for enriching the fidelity and depth of sound reproduction by using multiple audio channels from speakers that surround the listener (surround channels). Its first application was in movie theaters. Prior to surround sound, theater sound systems commonly had three screen channels of sound that played from three loudspeakers (left, center, and right) located in front of the audience. Surround sound adds one or more channels from loudspeakers to the side of or behind the listener that are able to create the sensation of sound coming from any horizontal direction (at ground level) around the listener.
The technique enhances the psychoacoustics of sound spatialization by exploiting sound localization: a listener's ability to identify the location or origin of a detected sound in direction and distance. This is achieved by using multiple discrete audio channels routed to an array of loudspeakers. Surround sound typically has a listener location (sweet spot) where the audio effects work best and presents a fixed or forward perspective of the sound field to the listener at this location.
Surround sound formats vary in reproduction and recording methods, along with the number and positioning of additional channels. The most common surround sound specification, the ITU's 5.1 standard, calls for 6 speakers: center (C), in front of the listener; left (L) and right (R), at angles of 60°; left surround (LS) and right surround (RS) at angles of 100–120°; and a subwoofer, whose position is not critical.
Most surround sound recordings are created by film production companies or video game producers; however, some consumer devices have such capability either built-in or available separately. Surround sound technologies can also be used in music to enable new methods of artistic expression. After the failure of quadraphonic audio in the 1970s, multichannel music has slowly been reintroduced since 1999 with the help of SACD and DVD-Audio formats. Some AV receivers, stereophonic systems, and computer sound cards contain integral digital signal processors or digital audio processors to simulate surround sound from a stereophonic source (see fake stereo).
In 1967, the rock group Pink Floyd performed the first-ever surround sound concert at "Games for May", a lavish affair at London’s Queen Elizabeth Hall where the band debuted its custom-made quadraphonic speaker system. The control device they had made, the Azimuth Co-ordinator, is now displayed at London's Victoria and Albert Museum, as part of their Theatre Collections gallery.
In the 1950s, the German composer Karlheinz Stockhausen experimented with and produced ground-breaking electronic compositions such as Gesang der Jünglinge and Kontakte, the latter using fully discrete and rotating quadraphonic sounds generated with industrial electronic equipment in Herbert Eimert's studio at the Westdeutscher Rundfunk (WDR). Edgard Varèse's Poème électronique, created for the Iannis Xenakis-designed Philips Pavilion at the 1958 Brussels World's Fair, also used spatial audio, with 425 loudspeakers moving sound throughout the pavilion.
In 1957, working with artist Jordan Belson, Henry Jacobs produced Vortex: Experiments in Sound and Light - a series of concerts featuring new music, including some of Jacobs' own, and that of Karlheinz Stockhausen, and many others - taking place in the Morrison Planetarium in Golden Gate Park, San Francisco. Sound designers commonly regard this as the origin of the (now standard) concept of "surround sound." The program was popular, and Jacobs and Belson were invited to reproduce it at the 1958 World Expo in Brussels. Many other composers also created ground-breaking surround sound works in the same time period.
In 1978, a concept devised by Max Bell for Dolby Laboratories called "split surround" was tested with the movie Superman. This led to the 70mm stereo surround release of Apocalypse Now, which became one of the first formal releases in cinemas with three channels in the front and two in the rear. There were typically five speakers behind the screens of 70mm-capable cinemas, but only the left, center and right were used full-frequency, while center-left and center-right were only used for bass-frequencies (as it is currently common). The Apocalypse Now encoder/decoder was designed by Michael Karagosian, also for Dolby Laboratories. The surround mix was produced by an Oscar-winning crew led by Walter Murch for American Zoetrope. The format was also deployed in 1982 with the stereo surround release of Blade Runner.
The 5.1 version of surround sound originated in 1987 at the famous French cabaret Moulin Rouge. A French engineer, Dominique Bertrand, used a mixing board specially designed in cooperation with Solid State Logic, based on the 5000 series and including six channels: A (left), B (right), C (center), D (left rear), E (right rear) and F (bass). The same engineer had already achieved a 3.1 system in 1974, for the International Summit of Francophone States in Dakar, Senegal.
The Ambisonics form, like wave field synthesis (WFS), is based on Huygens' principle and gives an exact sound reconstruction at the central point; however, it is less accurate away from the central point. There are many free and commercial software programs available for Ambisonics, which dominates most of the consumer market, especially musicians using electronic and computer music. Moreover, Ambisonics products are the standard in surround sound hardware sold by Meridian Audio. In its simplest form, Ambisonics consumes few resources; however, this is not true for recent developments, such as Near Field Compensated Higher Order Ambisonics. Some years ago it was shown that, in the limit, WFS and Ambisonics converge.
Finally, surround sound can also be achieved at the mastering level from stereophonic sources, as with Penteo, which uses digital signal processing analysis of a stereo recording to parse out individual sounds to component panorama positions, then positions them accordingly in a five-channel field. There are, however, other ways to create surround sound out of stereo, for instance with the routines based on QS and SQ for encoding Quad sound, where instruments were divided over 4 speakers in the studio. This way of creating surround with software routines is normally referred to as upmixing, which was particularly successful on the Sansui Electric QSD-series decoders that had a mode where it mapped the L ↔ R stereo onto an ∩ arc.
The standard surround setup consists of three front speakers LCR (left, center and right), two surround speakers LS and RS (left and right surround respectively) and a subwoofer for the low-frequency effects (LFE) channel, that is low-pass filtered at 120 Hz. The angles between the speakers have been standardized by the ITU (International Telecommunication Union) recommendation 775 and AES (Audio Engineering Society) as follows: 60 degrees between the L and R channels (allows for two-channel stereo compatibility) with the center speaker directly in front of the listener. The Surround channels are placed 100–120 degrees from the center channel, with the subwoofer's positioning not being critical due to the low directional factor of frequencies below 120 Hz. The ITU standard also allows for additional surround speakers, that need to be distributed evenly between 60 and 150 degrees.
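The angles described above can be turned into speaker coordinates for a quick check of a room layout. The following is a minimal sketch, not part of the ITU recommendation: the function name `itu_5_1_layout`, the default listening radius of 2 m and the default surround angle of 110 degrees (the midpoint of the 100–120 degree range) are all assumptions chosen for illustration.

```python
import math

def itu_5_1_layout(surround_angle=110.0, radius=2.0):
    """Return nominal azimuths (degrees) and (x, y) positions for a 5.1 layout.

    Azimuths are measured from the center speaker: 0 is straight ahead and
    positive values are to the listener's right. The radius and the default
    surround angle are illustrative assumptions.
    """
    if not 100.0 <= surround_angle <= 120.0:
        raise ValueError("surround angle should lie within 100-120 degrees")
    # L and R sit 30 degrees either side of center, giving 60 degrees between them.
    azimuths = {
        "L": -30.0,
        "C": 0.0,
        "R": 30.0,
        "LS": -surround_angle,
        "RS": surround_angle,
    }
    positions = {}
    for name, azimuth in azimuths.items():
        rad = math.radians(azimuth)
        # x points to the listener's right, y points straight ahead.
        positions[name] = (radius * math.sin(rad), radius * math.cos(rad))
    return azimuths, positions
```

The subwoofer is left out on purpose, since its position is not critical.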
Surround mixes of more or fewer channels are acceptable if they are compatible with 5.1 surround, as described by ITU-R BS.775-1. The 3-1 channel setup (consisting of one monophonic surround channel) is such a case, where both LS and RS are fed by the monophonic signal at an attenuated level of -3 dB.
The function of the center channel is to anchor the signal so that any central panned images do not shift when a listener is moving or is sitting away from the sweet spot. The center channel also prevents any timbral modifications from occurring, which is typical for 2-channel stereo, due to phase differences at the two ears of a listener. The center channel is especially used in films and television, with dialogue primarily feeding the center channel. The function of the center channel can either be of a monophonic nature (as with dialogue) or it can be used in combination with the left and right channels for true three-channel stereo. Motion pictures tend to use the center channel for monophonic purposes, with stereo being reserved purely for the left and right channels. Surround microphone techniques have, however, been developed that fully use the potential of three-channel stereo.
In 5.1 surround, phantom images between the front speakers are quite accurate, with images towards the back and especially to the sides being unstable. The localization of a virtual source, based on level differences between two loudspeakers to the side of a listener, shows great inconsistency across the standardized 5.1 setup, also being largely affected by movement away from the reference position. 5.1 surround is therefore limited in its ability to convey 3D sound, making the surround channels more appropriate for ambiance or effects.
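As a worked illustration of the -3 dB feed, the sketch below converts the attenuation to a linear gain and copies a monophonic surround signal to both LS and RS. The function names and the representation of a signal as a plain list of samples are assumptions made for this example.

```python
def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def feed_mono_surround(mono_surround, attenuation_db=-3.0):
    """Feed one monophonic surround signal to both LS and RS.

    mono_surround is a list of samples (hypothetical representation);
    both outputs carry the same signal, attenuated by attenuation_db.
    """
    gain = db_to_gain(attenuation_db)
    ls = [gain * sample for sample in mono_surround]
    rs = list(ls)
    return ls, rs
```

A gain of -3 dB is an amplitude factor of about 0.708, so the two surround feeds together carry roughly the same power as the original monophonic signal.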
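Level-difference panning between a pair of loudspeakers can be sketched with a constant-power (sine/cosine) pan law. This is one common pan law chosen here as an assumption; the text above does not specify which law is used, and the function names are hypothetical.

```python
import math

def constant_power_pan(position):
    """Return gains (g_a, g_b) for a source panned between two speakers.

    position 0.0 places the source fully in speaker A, 1.0 fully in
    speaker B. The gains satisfy g_a**2 + g_b**2 == 1 (constant power).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

def level_difference_db(g_a, g_b):
    """Level of speaker B relative to speaker A, in dB (both gains non-zero)."""
    return 20.0 * math.log10(g_b / g_a)
```

The pan law only sets the level difference; how reliably a listener hears the resulting phantom image still depends on where the speaker pair sits relative to the head, which is the limitation described above.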
7.1 channel surround is another setup, most commonly used in large cinemas, that is compatible with 5.1 surround, though it is not stated in the ITU standards. 7.1 channel surround adds two additional channels, center-left (CL) and center-right (CR) to the 5.1 surround setup, with the speakers situated 15 degrees off center from the listener. This convention is used to cover an increased angle between the front loudspeakers as a product of a larger screen.
Surround recording techniques can be differentiated into those that use single arrays of microphones placed in close proximity, and those treating front and rear channels with separate arrays. Close arrays present more accurate phantom images, whereas separate treatment of rear channels is usually used for ambiance. For accurate depiction of an acoustic environment, such as a hall, side reflections are essential. Appropriate microphone techniques should therefore be used if room impression is important. Although the reproduction of side images is very unstable in the 5.1 surround setup, room impressions can still be accurately presented.
Some microphone techniques used for coverage of three front channels include double-stereo techniques, INA-3 (Ideal Cardioid Arrangement), the Decca Tree setup and the OCT (Optimum Cardioid Triangle). Surround techniques are largely based on 3-channel techniques with additional microphones used for the surround channels. A distinguishing factor for the pickup of the front channels in surround is that less reverberation should be picked up, as the surround microphones will be responsible for the pickup of reverberation. Cardioid, hypercardioid, or supercardioid polar patterns will therefore often replace omnidirectional polar patterns for surround recordings. To compensate for the lost low end of directional (pressure gradient) microphones, additional omnidirectional (pressure) microphones, exhibiting an extended low-end response, can be added. Their output is usually low-pass filtered. A simple surround microphone configuration involves the use of a front array in combination with two backward-facing omnidirectional room microphones placed about 10–15 meters away from the front array. If echoes are notable, the front array can be delayed appropriately. Alternatively, backward-facing cardioid microphones can be placed closer to the front array for a similar reverberation pickup.
The INA-5 (Ideal Cardioid Arrangement) is a surround microphone array that uses five cardioid microphones resembling the angles of the standardized surround loudspeaker configuration defined by the ITU Rec. 775. Dimensions between the front three microphones as well as the polar patterns of the microphones can be changed for different pickup angles and ambient response. This technique therefore allows for great flexibility.
A well-established microphone array is the Fukada Tree, which is a modified variant of the Decca Tree stereo technique. The array consists of five spaced cardioid microphones, three front microphones resembling a Decca Tree and two surround microphones. Two additional omnidirectional outriggers can be added to enlarge the perceived size of the orchestra or to better integrate the front and surround channels. The L, R, LS and RS microphones should be placed in a square formation, with L/R and LS/RS angled at 45 degrees and 135 degrees from the center microphone respectively. Spacing between these microphones should be about 1.8 meters. This square formation is responsible for the room impressions. The center channel is placed a meter in front of the L and R channels, producing a strong center image. The surround microphones are usually placed at the critical distance (where the direct and reverberant fields are equal), with the full array usually situated several meters above and behind the conductor.
The NHK (Japanese broadcasting company) developed an alternative technique also involving five cardioid microphones. Here a baffle is used for separation between the front left and right channels, which are 30 cm apart. Outrigger omnidirectional microphones, low-pass filtered at 250 Hz, are spaced 3 meters apart in line with the L and R cardioids. These compensate for the bass roll-off of the cardioid microphones and also add expansiveness. A 3-meter spaced microphone pair, situated 2–3 meters behind the front array, is used for the surround channels. The center channel is again placed slightly forward, with the L/R and LS/RS again angled at 45 and 135 degrees respectively.
The OCT-Surround (Optimum Cardioid Triangle-Surround) microphone array is an augmented technique of the stereo OCT technique using the same front array with added surround microphones. The front array is designed for minimum crosstalk, with the front left and right microphones having supercardioid polar patterns and angled at 90 degrees relative to the center microphone. It is important that high-quality small diaphragm microphones are used for the L and R channels to reduce off-axis coloration. Equalization can also be used to flatten the response of the supercardioid microphones to signals coming in at up to about 30 degrees from the front of the array. The center channel is placed slightly forward. The surround microphones are backward-facing cardioid microphones, that are placed 40 cm back from the L and R microphones. The L, R, LS and RS microphones pick up early reflections from both the sides and the back of an acoustic venue, therefore giving significant room impressions. Spacing between the L and R microphones can be varied to obtain the required stereo width.
Specialized microphone arrays have been developed to record a space's ambiance. These arrays are used in combination with suitable front arrays or can be added to above-mentioned surround techniques. The Hamasaki square (also proposed by NHK) is a well-established microphone array used for the pickup of hall ambience. Four figure-eight microphones are arranged in a square, ideally placed far away and high up in the hall. Spacing between the microphones should be between 1 and 3 meters. The microphones' nulls (zero pickup points) are set to face the main sound source with positive polarities facing outward, therefore very effectively minimizing the direct sound pickup as well as echoes from the back of the hall. The back two microphones are mixed to the surround channels, with the front two channels being mixed in combination with the front array into L and R.
Another ambient technique is the IRT (Institut für Rundfunktechnik) cross. Here, four cardioid microphones, 90 degrees relative to one another, are placed in square formation, separated by 21–25 cm. The front two microphones should be positioned 45 degrees off axis from the sound source. This technique therefore resembles back-to-back near-coincident stereo pairs. The microphones' outputs are fed to the L, R and LS, RS channels. The disadvantage of this approach is that direct sound pickup is quite significant.
Many recordings do not require pickup of side reflections. For live pop music concerts, a more appropriate array for the pickup of ambiance is the cardioid trapezium. All four cardioid microphones are backward facing and angled at 60 degrees from one another, therefore resembling a semi-circle. This is effective for the pickup of audience and ambiance.
All the above-mentioned microphone arrays take up considerable space, making them quite ineffective for field recordings. In this respect, the double mid-side (MS) technique is quite advantageous. This array uses back-to-back cardioid microphones, one facing forward, the other backward, combined with either one or two figure-eight microphones. Different channels are obtained by the sum and difference of the figure-eight and cardioid patterns. When using only one figure-eight microphone, the double MS technique is extremely compact and therefore also perfectly compatible with monophonic playback. This technique also allows for postproduction changes of the pickup angle.
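The sum-and-difference step can be sketched for the compact variant with a single figure-eight microphone. This is a simplified illustration, not a production decoder: the function name, the list-of-samples representation and the `side_gain` parameter (one way to change the pickup angle in postproduction) are assumptions.

```python
def decode_double_ms(front_mid, rear_mid, side, side_gain=1.0):
    """Derive L, R, LS and RS from a double mid-side recording.

    front_mid: samples from the forward-facing cardioid
    rear_mid:  samples from the backward-facing cardioid
    side:      samples from the shared figure-eight microphone
    side_gain: hypothetical width control; larger values widen the image
    """
    left = [m + side_gain * s for m, s in zip(front_mid, side)]
    right = [m - side_gain * s for m, s in zip(front_mid, side)]
    left_surround = [m + side_gain * s for m, s in zip(rear_mid, side)]
    right_surround = [m - side_gain * s for m, s in zip(rear_mid, side)]
    return {"L": left, "R": right, "LS": left_surround, "RS": right_surround}
```

Summing L and R cancels the figure-eight contribution and leaves only the front cardioid signal, which is why the arrangement folds down cleanly to mono.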
There is a notation difference before and after the bass management system. Before the bass management system, there is an LFE channel. After the bass management system, there is a subwoofer signal. A common misunderstanding is the belief that the LFE channel is the subwoofer channel. The bass management system may direct bass to one or more subwoofers (if present) from any channel, not just from the LFE channel. Also, if there is no subwoofer present, the bass management system can direct the LFE channel to one or more of the main speakers.
The LFE channel is a source of some confusion in surround sound. It was originally developed to carry extremely low sub-bass cinematic sound effects (e.g., the loud rumble of thunder or explosions) on their own channel. This allowed theaters to control the volume of these effects to suit the particular cinema's acoustic environment and sound reproduction system. Independent control of the sub-bass effects also reduced the problem of intermodulation distortion in analog movie sound reproduction.
In the original movie theater implementation, the LFE was a separate channel fed to one or more subwoofers. Home replay systems, however, may not have a separate subwoofer, so modern home surround decoders and systems often include a bass management system that allows bass on any channel (main or LFE) to be fed only to the loudspeakers that can handle low-frequency signals. The salient point here is that the LFE channel is not the subwoofer channel; there may be no subwoofer and, if there is, it may be handling a good deal more than effects.
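The routing decision made by a bass management system can be sketched as a simple plan of where each channel's low-frequency content is sent. This is an illustrative model only: the function name, the `"SUB"` label and the idea of describing speakers as a set of bass-capable channel names are assumptions, and no actual crossover filtering is performed.

```python
def plan_bass_routing(channels, bass_capable, has_subwoofer):
    """Return {source channel: [destinations for its low-frequency content]}.

    channels:      names of the main channels, e.g. ["L", "C", "R", "LS", "RS"]
    bass_capable:  set of main speakers that can handle low frequencies
    has_subwoofer: whether a subwoofer (labelled "SUB" here) is present
    """
    if has_subwoofer:
        fallback = ["SUB"]
    else:
        # No subwoofer: redirect to the main speakers that can handle bass.
        fallback = [ch for ch in channels if ch in bass_capable]
    routing = {}
    for channel in list(channels) + ["LFE"]:
        if channel != "LFE" and channel in bass_capable:
            # A bass-capable main speaker keeps its own low frequencies.
            routing[channel] = [channel]
        else:
            routing[channel] = list(fallback)
    return routing
```

With a subwoofer and small satellite speakers, every channel's bass ends up at `"SUB"`, which shows why the subwoofer signal can carry much more than the LFE channel. If there is neither a subwoofer nor any bass-capable speaker, the fallback list is empty and the bass has nowhere to go.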
Some record labels such as Telarc and Chesky Records have argued that LFE channels are not needed in a modern digital multichannel entertainment system. They argue that, given loudspeakers that have low-frequency response to 30 Hz, all available channels have a full-frequency range and, as such, there is no need for an LFE in surround music production, because all the frequencies are available in all the main channels. These labels sometimes use the LFE channel to carry a height channel. The label BIS Records generally uses a 5.0 channel mix.
The first digit in "5.1" is the number of full-range channels. The ".1" reflects the limited frequency range of the LFE channel.
For example, two stereo speakers with no LFE channel = 2.0
5 full-range channels + 1 LFE channel = 5.1
An alternative notation shows the number of full-range channels in front of the listener, separated by a slash from the number of full-range channels beside or behind the listener, with a decimal point marking the number of limited-range LFE channels.
E.g. 3 front channels + 2 side channels + an LFE channel = 3/2.1
The notation can be expanded to include matrix decoders. Dolby Digital EX, for example, has a sixth full-range channel incorporated into the two rear channels with a matrix decoder. This is expressed:
3 front channels + 2 rear channels + 3 channels reproduced in the rear in total + 1 LFE channel = 3/2:3.1
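The three notations described above are regular enough to parse mechanically. The sketch below is a hypothetical helper (the function name and the returned dictionary keys are assumptions) that handles the plain form such as 5.1, the front/rear form such as 3/2.1, and the matrix-extended form such as 3/2:3.1. It does not handle height-channel forms such as 7.1.4.

```python
import re

# front[/rear[:rear_reproduced]].lfe, e.g. "5.1", "3/2.1", "3/2:3.1"
_NOTATION = re.compile(r"(\d+)(?:/(\d+)(?::(\d+))?)?\.(\d+)")

def parse_channel_notation(text):
    """Parse a channel-count string into its component counts."""
    match = _NOTATION.fullmatch(text.strip())
    if match is None:
        raise ValueError("unsupported channel notation: " + repr(text))
    first, rear, rear_reproduced, lfe = match.groups()
    if rear is None:
        # Plain form: the first number is the total of full-range channels.
        return {"front": None, "rear": None, "rear_reproduced": None,
                "full_range": int(first), "lfe": int(lfe)}
    return {
        "front": int(first),
        "rear": int(rear),
        # Present only when a matrix decoder derives extra rear channels.
        "rear_reproduced": int(rear_reproduced) if rear_reproduced else None,
        "full_range": int(first) + int(rear),
        "lfe": int(lfe),
    }
```

For 3/2:3.1, `full_range` counts the five discrete full-range channels, while `rear_reproduced` records that three channels are reproduced in the rear after matrix decoding.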
The term stereo, although popularised in reference to two-channel audio, historically also referred to surround sound, as it strictly means "solid" (three-dimensional) sound. However, this is no longer common usage and "stereo sound" almost exclusively means two channels, left and right.
ANSI/CEA-863-A identification for surround sound channels
| Zero-based channel index (three alternative orderings) | | | Channel name | Color-coding on commercial receiver and cabling |
| 0 | 1 | 0 | Front left | White |
| 1 | 2 | 2 | Front right | Red |
| 2 | 0 | 1 | Center | Green |
| 3 | 5 | 7 | Subwoofer | Purple |
| 4 | 3 | 3 | Side left | Blue |
| 5 | 4 | 4 | Side right | Grey |
| 6 | 6 | 5 | Rear left | Brown |
| 7 | 7 | 6 | Rear right | Khaki |
Diagram
| Front left | Center | Front right |
| Side left | | Side right |
| Rear left | | Rear right |
| | Subwoofer | |
Height channels
| Index | Channel name | Color-coding on commercial receiver and cabling |
| 8 | Left height 1 | Yellow |
| 9 | Right height 1 | Orange |
| 10 | Left height 2 | Pink |
| 11 | Right height 2 | Magenta |
| + Standard speaker channels in Microsoft Windows KMixer |
While it is possible to build any speaker configuration, there is little commercial movie or music content for alternative speaker configurations. However, source channels can be remixed for the speaker channels using a matrix table specifying how much of each content channel is played through each speaker channel.
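A matrix table of this kind can be sketched as a nested mapping from each speaker channel to the gains applied to each content channel. The function name, the per-sample dictionary representation and the example coefficients (a commonly used stereo fold-down with 0.7071 for center and surrounds) are assumptions for illustration, not values taken from the text.

```python
def remix(frame, matrix):
    """Mix one frame of content channels into speaker channels.

    frame:  {content channel: sample value}
    matrix: {speaker channel: {content channel: gain}}
    Content channels missing from the frame are treated as silence.
    """
    return {
        speaker: sum(gain * frame.get(source, 0.0) for source, gain in gains.items())
        for speaker, gains in matrix.items()
    }

# Illustrative fold-down of five full-range channels to two speakers.
STEREO_FROM_5_0 = {
    "L": {"L": 1.0, "C": 0.7071, "LS": 0.7071},
    "R": {"R": 1.0, "C": 0.7071, "RS": 0.7071},
}
```

For example, `remix({"C": 1.0}, STEREO_FROM_5_0)` sends a center-only signal equally to both speakers, so it is heard as a phantom center image.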
Most channel configurations may include an LFE channel (typically reproduced through the subwoofer). This makes the configuration ".1" instead of ".0". Most modern multichannel mixes contain one LFE channel; some use two.
Dolby Atmos (and other Microsoft Spatial Sound engines) additionally support a virtual "8.1.4.4" configuration, to be rendered by an HRTF. The configuration extends 7.1.4 with a center speaker behind the listener and 4 speakers below.