From Stereo to Wavefield Synthesis: A Brief Overview of Current Multi-Channel Recording and Reproduction Technologies / by Xiao Quan

            Since our hunter-gatherer days, we have been looking for cost-effective ways to enter alternative narrative realities through aural manipulation, whether through ritual performance in the Paleolithic caves of France (Reznikoff, 2008) or in the classical amphitheaters of Ancient Greece, whose design reduced reverberation for speech clarity (Chourmouziadou & Kang, 2008). With the advent and advancement of recording and reproduction technologies in the 19th and 20th centuries, our capacity to be ‘taken away’ by sound has improved dramatically. This article briefly summarizes the development of these technologies, from stereo to wave field synthesis.

           In 1933, Alan Blumlein invented and patented stereo recording technology for EMI (Blumlein, 1933). This was an early attempt at recording and reproducing sound with more detailed spatial information than mono recordings offer. The patent outlines that, with two directional microphones, two loudspeakers, and proper technique, we can capture and reconstruct a “near-replica of the original directional sound image” (Geluso, 2017, p. 63). From a consumer’s perspective, this means that just one extra speaker yields a markedly more spatial listening experience. It is no wonder that, with the invention of the world’s first stereo headphones in 1958, stereo became and remains the mainstay of sound reproduction in the consumer market (Geluso, 2017).

           Since the commercial adoption of the stereo format, the audio industry has pursued more immersive recording and reproduction technologies, though it has struggled to match stereo’s commercial success. The first attempt was the ‘quadraphonic’ format, proposed by Peter Scheiber in 1968, which required four speakers placed at the four corners of the listening environment for playback (Torick, 1998). Though much content was made for the new format, it was doomed to failure by its lack of cost-effectiveness compared with stereo. However, this failed attempt inspired many other multi-channel formats aimed at the consumer market.

           Throughout the late 1970s, Dolby Laboratories introduced a series of surround sound innovations in cinema sound systems, from a four-channel matrix system in 1976’s A Star Is Born to the world’s first 5.1 surround sound system in 1978’s Superman (Davis, 2003). This setup provides a much more realistic spatialized sound image than stereo and a wider sweet spot than ‘quad’, as a result of its dedicated center channel (Davis, 2003). Because of its success in cinemas, Dolby Surround became a major player in multi-channel sound systems. Subsequently, more discrete channels were added on the horizontal plane and along the height axis to form 7.1, 10.2, and even 22.2 channel surround sound systems (Davis, 2003; Rumsey, 2012).

           Another method of reproducing spatialized sound is the sound field approach. Unlike channel-based systems, which are speaker- and listener-oriented, “the sound field approach is based on a non-speaker-centric physical representation of the sound waves” (Nicol, 2017, p. 290). In other words, the sounds recorded by individual microphones do not correspond to discrete playback channels. Instead, multiple microphone capsules are placed in a spherical arrangement, such as a tetrahedron, to capture sound arriving from all directions in a particular sound environment, hence the term ‘Ambisonic’. The recorded signals are then algorithmically processed into spatial sound components, W, X, Y, and Z, which can be converted for use in various playback configurations, from stereo and surround to binaural (Nicol, 2017). With more capsules in an Ambisonic microphone, we can encode higher-order sound components. However, mapping those signals to loudspeakers requires an extensive encoding and decoding process.
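
           To make the step from capsule signals to W, X, Y, and Z more concrete, here is a minimal sketch of the commonly described sum-and-difference conversion for a tetrahedral microphone. It is not taken from Nicol (2017). The function name, the capsule labels (front-left-up, front-right-down, back-left-down, back-right-up), and the 0.5 scaling are assumptions for illustration, and the sketch leaves out the capsule-spacing correction filters and gain normalization that a real converter would need.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert four tetrahedral capsule signals to first-order W, X, Y, Z.

    Each argument is a sequence of samples from one capsule (labels are
    assumed: front-left-up, front-right-down, back-left-down, back-right-up).
    The 0.5 scaling is one common choice; conventions differ between tools.
    """
    w, x, y, z = [], [], [], []
    for a, b, c, d in zip(flu, frd, bld, bru):
        w.append(0.5 * (a + b + c + d))  # omnidirectional pressure component
        x.append(0.5 * (a + b - c - d))  # front minus back
        y.append(0.5 * (a - b + c - d))  # left minus right
        z.append(0.5 * (a - b - c + d))  # up minus down
    return w, x, y, z


if __name__ == "__main__":
    # Two sample frames: equal level on all capsules, then front capsules only.
    w, x, y, z = a_to_b_format([1.0, 1.0], [1.0, 1.0], [1.0, 0.0], [1.0, 0.0])
    print(w, x, y, z)
```

           A sound that reaches all four capsules equally ends up only in W, while a sound picked up only by the two front capsules also appears in X, which is how direction becomes encoded in the components rather than in playback channels.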

           One such technique for accurately reproducing spatialized sound components is wave field synthesis (Daniel, Moreau, & Nicol, 2003). In essence, wave field synthesis is based on the assumption of using “an infinitely large number of infinitely small loudspeakers… to generate sound fields that maintain the temporal and spatial properties” of a virtual sound source (Sporer, Brandenburg, Brix, & Sladeczek, 2017, p. 320). Furthermore, it is capable of placing virtual sound sources both in front of and behind the speaker array, giving it a unique advantage over channel-based formats (Sporer, Brandenburg, Brix, & Sladeczek, 2017). The disadvantage of wave field synthesis is the high cost of the large number of loudspeakers the system needs to work effectively.
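
           As a rough illustration of how a loudspeaker array can imitate a virtual source behind it, the sketch below computes a delay and a gain for each loudspeaker from its distance to the virtual source. This is only a simplified delay-and-gain approximation, not the full wave field synthesis driving function described by Sporer et al. (2017): it omits the spectral pre-filter, the angle weighting, and the tapering at the array edges. The function name, the 343 m/s speed of sound, and the 1/sqrt(distance) gain law are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def delay_and_gain_per_speaker(source_xy, speaker_positions):
    """Return a (delay_seconds, gain) pair for each loudspeaker.

    source_xy is the (x, y) position of a virtual point source placed
    behind the array; speaker_positions is a list of (x, y) positions.
    Farther loudspeakers fire later and more quietly, so the combined
    wavefront approximates one spreading out from the virtual source.
    """
    results = []
    for sx, sy in speaker_positions:
        distance = math.hypot(sx - source_xy[0], sy - source_xy[1])
        delay = distance / SPEED_OF_SOUND
        # Clamp the distance to avoid dividing by zero for a source
        # placed exactly on a loudspeaker (assumed 1 mm minimum).
        gain = 1.0 / math.sqrt(max(distance, 1e-3))
        results.append((delay, gain))
    return results


if __name__ == "__main__":
    # Three loudspeakers on a line, virtual source one metre behind the centre.
    speakers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
    for delay, gain in delay_and_gain_per_speaker((0.0, -1.0), speakers):
        print(f"delay = {delay * 1000:.3f} ms, gain = {gain:.3f}")
```

           Even this reduced model hints at the cost issue: every additional loudspeaker needs its own delayed and scaled copy of every source signal.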

           In conclusion, the forefront of research in sound recording and reproduction is centered on immersive formats, with various methods aiming for the optimal balance between experience, cost, and ease of use. As I wrote at the beginning of this article, human beings have been looking for ways to enter alternative narrative realities for millennia. Technology is almost always a means to an end. At this stage, it seems to me that the format that best strikes this balance for the average consumer is still stereo headphones or earbuds. Thus, I envision a future where spatialized recording and synthesis, decoded into a binaural format, will be the dominant method of sound reproduction.

Citations

Blumlein, A. (1933). British Patent Specification 394,325. Reprinted. Journal of the Audio Engineering Society, 6(2), 91.

Boren, B. (2017). History of 3D Sound. In Immersive Sound (pp. 40-62). Routledge.

Chourmouziadou, K., & Kang, J. (2008). Acoustic evolution of ancient Greek and Roman theatres. Applied Acoustics, 69(6), 514–529.

Daniel, J., Moreau, S., & Nicol, R. (2003, March). Further investigations of high-order ambisonics and wavefield synthesis for holophonic sound imaging. In Audio Engineering Society Convention 114. Audio Engineering Society.

Davis, M. F. (2003). History of spatial coding. Journal of the Audio Engineering Society, 51(6), 554-569.

Nicol, R. (2017). Sound Field. In Immersive Sound (pp. 290-324). Focal Press.

Reznikoff, I. (2008). Sound resonance in prehistoric times: A study of Paleolithic painted caves and rocks. Journal of the Acoustical Society of America, 123(5), 3603.

Roginska, A., & Geluso, P. (Eds.). (2017). Immersive sound: The art and science of binaural and multi-channel audio. Taylor & Francis.

Rumsey, F. (2012). Spatial audio. Routledge.

Sporer, T., Brandenburg, K., Brix, S., & Sladeczek, C. (2017). Wave Field Synthesis. In Immersive Sound (pp. 311-332). Routledge.

Torick, E. (1998). Highlights in the history of multichannel sound. Journal of the Audio Engineering Society, 46(1/2), 27-31.