From Stereo to Wavefield Synthesis: A Brief Overview of Current Multi-Channel Recording and Reproduction Technologies by Xiao Quan

Since our hunter-gatherer days, we have looked for cost-effective ways to enter alternative narrative realities through aural manipulation, whether through ritual performance in the Paleolithic caves of France (Reznikoff, 2008) or in the classical amphitheaters of Ancient Greece, which reduce reverberation for speech clarity (Chourmouziadou & Kang, 2008). With the advent and advancement of recording and reproduction technologies in the 19th and 20th centuries, our capacity to be ‘taken away’ by sound has improved drastically. This article briefly summarizes the development of current recording and reproduction technologies.

In 1933, stereo recording technology was invented and patented for EMI by Alan Blumlein (Blumlein, 1933). This was an early attempt at recording and reproducing sound with more detailed spatial information than mono recordings. The patent outlines that, using two directional microphones and two loudspeakers with proper technique, we can capture and reconstruct sound faithfully as a “near-replica of the original directional sound image” (Geluso, 2017, p. 63). From a consumer’s perspective, this means that just one extra speaker dramatically improves the reproduced sound experience. It is no wonder that, with the invention of the world’s first stereo headphones in 1958, the stereo format became and remains the mainstay of sound reproduction in the consumer market (Geluso, 2017).

Since the commercial implementation of the stereo format, the audio industry has been obsessed with developing more immersive recording and reproduction technologies, though it has struggled to find commercial success comparable to that of stereo. The first attempt was the ‘quadraphonic’ format, proposed by Peter Scheiber in 1968. It required four speakers placed at the four corners of the listening environment for playback (Torick, 1998). Though much content was made for this new format, it was doomed to fail because it was less cost-effective than stereo. However, this failed attempt inspired many other multi-channel formats aimed at the consumer market.

Throughout the late 70s, Dolby Laboratories made a series of surround sound innovations in cinema sound systems, from a four-channel matrix system in 1976’s A Star Is Born to the world’s first 5.1 surround sound system in 1978’s Superman (Davis, 2003). This setup provides a much more realistic spatialized sound image than stereo, with a wider sweet spot than ‘quad’, as a result of its emphasis on a center channel (Davis, 2003). Because of its success in cinemas, Dolby Surround became a major player in multi-channel sound systems. Subsequently, more discrete channels were added on the horizontal plane and the height axis to form 7.1, 10.2, or even 22.2 channel surround sound systems (Davis, 2003; Rumsey, 2012).

Another method of reproducing spatialized sound is the sound field approach. Unlike channel-based systems, which are speaker and listener oriented, “the sound field approach is based on a non-speaker-centric physical representation of the sound waves” (Nicol, 2017, p. 290). In other words, the sounds recorded by individual microphones do not correspond to discrete channels. Instead, multiple microphone capsules are placed in a spherical arrangement, such as a tetrahedron, to capture all sound information in a particular sound environment, hence the term ‘Ambisonic’. The recorded signals are then algorithmically processed to form the spatial sound components W, X, Y, and Z, which can then be converted for use in various playback configurations, from stereo and surround to binaural (Nicol, 2017). With more capsules in an Ambisonic microphone, we can encode higher-order sound components. However, mapping these signals to loudspeakers requires an extensive encoding and decoding process.
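To make the conversion into W, X, Y, and Z more concrete, here is a minimal Python sketch of first-order encoding from a tetrahedral microphone. It is only an illustration under stated assumptions: the capsule names (FLU, FRD, BLD, BRU), their orientations, the ideal cardioid response, and the 0.5 scaling factor are all choices made for this example. Normalization conventions differ between Ambisonic formats, and real converters also apply filters to compensate for the spacing between capsules.

```python
import math

# Assumed capsule orientations for a tetrahedral array
# (x = front, y = left, z = up). Names are illustrative.
CAPSULES = {
    "FLU": (1, 1, 1),     # front-left-up
    "FRD": (1, -1, -1),   # front-right-down
    "BLD": (-1, 1, -1),   # back-left-down
    "BRU": (-1, -1, 1),   # back-right-up
}


def cardioid_capsule_gains(azimuth_deg, elevation_deg=0.0):
    """Gain of each ideal cardioid capsule for a plane wave arriving
    from the given direction (azimuth measured counterclockwise from front)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    u = (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))
    gains = {}
    for name, d in CAPSULES.items():
        # Cosine of the angle between the source direction and the capsule axis.
        cos_angle = sum(ui * di for ui, di in zip(u, d)) / math.sqrt(3)
        gains[name] = 0.5 * (1.0 + cos_angle)
    return gains


def a_to_b(flu, frd, bld, bru):
    """Combine four capsule samples into first-order components.
    W is the omnidirectional sum; X, Y, Z are front-back, left-right,
    and up-down differences. The 0.5 scaling is an assumed convention."""
    w = 0.5 * (flu + frd + bld + bru)
    x = 0.5 * (flu + frd - bld - bru)
    y = 0.5 * (flu - frd + bld - bru)
    z = 0.5 * (flu - frd - bld + bru)
    return w, x, y, z


# A source directly to the left should appear mostly in W and Y.
g = cardioid_capsule_gains(azimuth_deg=90.0)
w, x, y, z = a_to_b(g["FLU"], g["FRD"], g["BLD"], g["BRU"])
print(f"W={w:.3f} X={x:.3f} Y={y:.3f} Z={z:.3f}")
```

The point of the sketch is that no output corresponds to a loudspeaker: W, X, Y, and Z describe the sound field itself, and a separate decoding step is needed for any given playback setup.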

One such technique for accurately reproducing spatialized sound components is wave field synthesis (Daniel, Moreau, & Nicol, 2003). In essence, wave field synthesis reproduction is based on the assumption of using “an infinitely large number of infinitely small loudspeakers… to generate sound fields that maintain the temporal and spatial properties” of a virtual sound source (Sporer, Brandenburg, Brix, & Sladeczek, 2017, p. 320). Furthermore, it is capable of placing virtual sound sources both in front of and behind the speaker array, giving it a unique advantage over channel-based formats (Sporer, Brandenburg, Brix, & Sladeczek, 2017). The disadvantage of wave field synthesis is the high cost of the large number of loudspeakers the system needs to work effectively.
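The basic idea of a virtual source behind the array can be sketched in a few lines of Python. This is a simplified delay-and-attenuate illustration, not a complete wave field synthesis driving function: the function name, the speaker layout, the speed of sound, and the square-root distance weighting are assumptions made for this example, and practical systems add filtering and tapering at the edges of the array.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)


def wfs_delays_and_gains(source, speakers):
    """For a virtual point source behind a speaker array, return a
    (delay_seconds, gain) pair per speaker, relative to the nearest speaker.

    source   -- (x, y) position of the virtual source in metres
    speakers -- list of (x, y) speaker positions in metres
    """
    distances = [math.hypot(x - source[0], y - source[1]) for x, y in speakers]
    nearest = min(distances)
    if nearest == 0.0:
        raise ValueError("virtual source must not coincide with a speaker")
    result = []
    for r in distances:
        # Farther speakers fire later, as the real wavefront would reach them later.
        delay = (r - nearest) / SPEED_OF_SOUND
        # Assumed simplification: amplitude falls off with the square root of distance.
        gain = math.sqrt(nearest / r)
        result.append((delay, gain))
    return result


# Five speakers spaced 0.5 m apart on the line y = 0,
# with a virtual source 1 m behind the centre of the array.
speakers = [(0.5 * i, 0.0) for i in range(-2, 3)]
for (x, _), (delay, gain) in zip(speakers, wfs_delays_and_gains((0.0, -1.0), speakers)):
    print(f"speaker at x={x:+.1f} m: delay={delay * 1000:.3f} ms, gain={gain:.3f}")
```

With only five speakers the approximation is coarse, which illustrates why practical installations need so many loudspeakers.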

In conclusion, the forefront of research in sound recording and reproduction is centered on immersive formats, with various methods aiming for the optimal balance between experience, cost, and ease of use. As I wrote at the beginning of this article, human beings have been looking for ways to enter alternative narrative realities for millennia. Technology is almost always a means to an end. At this stage, it seems to me that the most successful way for the average consumer to strike this balance is still stereo headphones or earbuds. Thus, I envision a future where spatialized recording and synthesis, decoded into a binaural format, will be the dominant method of sound reproduction in the years ahead.

           

  

Citations

Blumlein, A. (1933). British Patent Specification 394,325. Reprinted. Journal of the Audio Engineering Society, 6(2), 91.

Boren, B. (2017). History of 3D Sound. In Immersive Sound (pp. 40-62). Routledge.

Chourmouziadou, K., & Kang, J. (2008). Acoustic evolution of ancient Greek and Roman theatres. Applied Acoustics, 69(6), 514–529.

Daniel, J., Moreau, S., & Nicol, R. (2003, March). Further investigations of high-order ambisonics and wavefield synthesis for holophonic sound imaging. In Audio Engineering Society Convention 114. Audio Engineering Society.

Davis, M. F. (2003). History of spatial coding. Journal of the Audio Engineering Society, 51(6), 554-569.

Nicol, R. (2017). Sound Field. In Immersive Sound (pp. 290-324). Focal Press.

Reznikoff, I. (2008). Sound resonance in prehistoric times: A study of Paleolithic painted caves and rocks. Journal of the Acoustical Society of America, 123(5), 3603.

Roginska, A., & Geluso, P. (Eds.). (2017). Immersive sound: The art and science of binaural and multi-channel audio. Taylor & Francis.

Rumsey, F. (2012). Spatial audio. Routledge.

Sporer, T., Brandenburg, K., Brix, S., & Sladeczek, C. (2017). Wave Field Synthesis. In Immersive Sound (pp. 311-332). Routledge.

Torick, E. (1998). Highlights in the history of multichannel sound. Journal of the Audio Engineering Society, 46(1/2), 27-31.

New York Start-of-Year Reflections by Xiao Quan

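We often put off something we want to do until a suitable pretext appears, and then begin with unusual resolve. Sometimes it is tidying our room, sometimes getting up early to run, sometimes a spur-of-the-moment trip. Yet whatever motivates these changes, their effects rarely last. After a while, our lives tend to drift steadily back to where they started. This essay briefly explores how mobile internet technology reinforces this phenomenon, and how it affects the way we understand the meaning of life.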

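B.J. Fogg, founder of Stanford University's Persuasive Technology Lab (now the Stanford Behavior Design Lab), has said that most human behavior is triggered by three things: motivation, ability, and a prompt. None of the three can be missing. Put simply, if we are strongly motivated to do something, if it is easy to do, and if something prompts us to start, we will almost certainly do it. Many mobile apps are addictive largely because their designers have, on the "ability" dimension, reduced the difficulty of the target behavior as far as possible. Meeting new people takes only a swipe left or right, shopping has one-click purchase, TV series autoplay the next episode, and so on.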

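On one hand, these mobile technologies save us time. On the other hand, because they are designed to be as simple as possible, they often inadvertently become a "lifeline" for escaping the difficulties in our lives. Compared with the homework that has us tearing our hair out, the conflict with a partner that needs talking through, the exercise that would boost our metabolism, or the quiet sitting that would sort out tangled emotions, watching one more video, reading one more joke, liking one more photo, or scrolling through one more article on a social network is often the simpler, more tempting choice. One minute turns into an hour, and occupying the toilet without actually using it has become the norm in the modern bathroom.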

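The existence of these choices, however, is nothing to condemn. Snatching a moment of leisure from a busy day is only human. Working efficiently requires a balance of effort and rest, of tension and release. Besides, there has never been a shortage of ways to pass time alone: novels, music, magazines, films, radio, television, and now mobile apps. But as the saying goes, "Actions accumulate into habits, habits into character, and character into destiny." If we pay no attention to our motives for using these tools, and keep turning to mobile internet platforms that give us short-term gratification whenever life challenges us, that is extremely dangerous. Put simply, there are three dangers: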

1.     Life easily loses its meaning:

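The meaning of life is mostly determined by the responsibilities we take on in order to achieve some goal. For people in primitive tribes, the goal was to survive in nature. To survive, they hunted, gathered fruit, and raised offspring, and that was the meaning. Mencius said that when Heaven is about to confer a great responsibility on a person, it first makes that person's heart and will suffer, emphasizing that the pursuit of a goal is often full of hardship and pain. Only by bravely facing these unavoidable trials can we eventually shoulder great responsibility. When we retreat in the face of difficulty and turn to the internet for short-term comfort, we are in effect deserting under fire and breaking faith: what we flee is responsibility, and what we give up is the chance to temper ourselves. Many people entangle the meaning of life with "happiness". Even with things they once resolved to accomplish, when they stop enjoying them partway through, they start to wonder what the point is. Yet happiness and meaning often have no causal relationship. The Mahabharata, an Indian epic handed down for more than three thousand years, says as much: strict practice can bring happiness, but practice undertaken for the sake of happiness will lead people to hell.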

2.     Losing the ability to find meaning:

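For a young person who has not yet settled into a family and a career, life more or less lacks stability. At this stage we are full of unlimited possibility, and at the same time we constantly face an unsettling unknown. Willingly or not, we gradually move toward independence from the family and school we relied on as teenagers. The people around us change from the parents, teachers, and playmates who gave us emotional support and material stability into strangers of every kind, who come and go with unclear motives. In this environment, the internet is like a servant who reads the master's mind and is good at pleasing, pushing content we are interested in at every moment. Compared with the many unknowns of life, the internet easily becomes the place we know best. Social networks and video sites display neatly categorized, eye-catching content of every variety, waiting for us to explore and click. The unknown that we cannot control in life suddenly becomes full of order, logic, and coherence. Yet the sense of confidence, control, and even inspiration that we get from the internet is often an illusion.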

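Confucius said, "Learning without thinking leads to confusion; thinking without learning leads to peril." The internet is like a rich tropical rainforest with no predators. Driven by curiosity, we instinctively "gather ripe fruit", but we rarely organize the information we collect in any meaningful way, and what we lose in the end is our precious time. Organizing new information, however, is not as simple as folding clothes or rolling socks. As the saying goes, every stumble teaches a lesson. True wisdom is built on failures in real life. If the information we collect cannot be connected with our own real-life setbacks, it can rarely be digested and used in a meaningful way. At the same time, our time is limited. As the line between the internet and real life grows ever blurrier, if we are not clearly aware of our motives for using the internet, we easily fall for its many temptations, start spending our time unconsciously, and lose the time and energy needed to pursue longer-term goals.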

3.     Easily losing sight of longer-term life values:

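Life is like a long novel in which every character is pursuing something. At every moment we choose our actions for the sake of some goal. What guides us in setting those goals is our values. As real circumstances change, our values are constantly challenged. If nothing in life arises to challenge our values, we keep the values we have, and they guide us to make the same choices in similar situations in the future. If our values cannot guide our behavior in a new situation, we easily fall into confusion, unease, or even fear. At such times we become more vulnerable and more easily swayed by outside forces. This state is both full of hope and extremely dangerous. On one hand, we can take the chance to break our habitual thinking, reassess our strengths and weaknesses, and adjust our long-term goals. On the other hand, we easily accept ideas about politics, religion, and morality that are more extreme or more simplistic than the ones we held before.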

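During the 2016 presidential election, a word that became especially popular in the United States was "bubble". It describes how each person's values, shaped by the "tailor-made" content pushed by social networks, are like bubbles floating in the air, repelling one another and unable to accommodate or understand the ideas and beliefs of other groups. As a result, the national political environment split sharply between left and right, with no effective communication possible. At the individual level, two things about the "customization" of online information deserve attention: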

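1.     Most "customized" content consists of short videos or illustrated posts under five minutes long. This content may be jokes, shopping or travel guides, or informational articles. Most of it aims for clicks, and is eye-catching but insubstantial. What we get from it is more often momentary pleasure than a holistic way of seeing the world. With this kind of information we easily fall into the mental trap of "one more minute won't hurt". But drifting through this varied content for long stretches can easily degrade working memory and the prefrontal cortex, which is responsible for decision-making. This is the exact opposite of the effect of seated meditation, and it leads to reduced self-control and a loss of awareness of our actual surroundings. When we return quickly to real life from this passive browsing mode, we often feel restless and out of control. This feeling can actually be dispelled by sitting quietly for a moment and paying attention to the breath. But we often choose to go back and keep browsing in the online environment that is "tailor-made" for us and full of a sense of control. Over time this becomes a habit, and it gets even harder to calm the mind.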

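2.     The other part is "long-form content" that takes more time to digest, such as films, TV series, interviews, lectures, and political commentary. Because internet content delivery is customized, mass entertainment has become a private matter. When we open our YouTube, Spotify, or Netflix home page in front of friends, we often feel uneasy because our personal tastes and values are exposed all at once to others. The TV series we follow, the life questions we ponder, and the music we listen to closely accompany us like isolated islands, rarely shared with other people in our lives. On one hand, this kind of "customized content" helps us more easily find entertainment we like and ideas we accept, and gain deep knowledge of a field. On the other hand (especially in big cities), it easily makes our real lives more cut off from the world and deepens our loneliness. The prevalence of loneliness may be a product of this era, but at its root it comes from lacking people in our lives who share similar values. Once values become detached from reality, they are hard to sustain. The online world is colorful and full of conflicting opinions, and finding a lasting, reliable life goal in it is as hard as climbing to the sky.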

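In short, the internet is the biggest double-edged sword of our time. The line between the internet and real life is gradually blurring. This is an inevitable result of the times, and it will only become more so rather than reverse. Whether we can see clearly the potential influence of the online environment on our values is crucial to us as individuals and to the development of society. Water can carry a boat and can also capsize it. We cannot live a day without water, but that does not mean we need to soak in the bathtub at every moment. The internet of the future will probably be more or less like that too.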

 

SUPER PISTA! by Xiao Quan

Bianchi Super Pista 2015 Red!

Super Pista 51cm frame

Mavic Ellipse Wheelset

FSA Carbon Track Crankset 49t / 16t cog / FSA Platinum BB

Izumi Standard Chain

MKS Urban Platform pedals w/ steel toe clips and leather straps

FSA Stem and Handlebar

Tange Seiki Headset

Brooks C15 Saddle

Vittoria and Thickslick 25c tires.