Mark Waldrep, Ph.D. is the founder, president, and chief engineer at AIX Records, an LA-based specialty record company founded in 2000. He has written extensively on audio production and new media, been a regular columnist for eMedia magazine, been featured several times in the recording industry’s standard periodical MIX magazine, and given the keynote at the AES (Audio Engineering Society) conference in Bogotá, Colombia.
Mark is an active member of The Recording Academy (NARAS) and the Audio Engineering Society (AES). He has been on the faculty of CSU Dominguez Hills in Southern California for over 20 years, where he heads the university’s Audio Recording program.
Have you noticed that consumer electronics companies – not necessarily just audio equipment vendors – are suddenly pushing 3D, aka "immersive," audio? Sennheiser, Smyth Research, Sony, Dolby, Amazon, and Apple are just a few of the companies moving aggressively into the world of spatial audio. Apple AirPods spatial audio is following the lead of Dolby and others by applying specially designed filters to approximate listening to music in an actual space. For those familiar with how humans experience immersive sound, the term binaural immediately comes to mind, but here the techniques are used to emulate a cinematic immersive sound experience.
Apple AirPods Pro were the debut platform for AirPods spatial sound, but now you can enjoy the same effect on the AirPods Max and AirPods Gen 3 as well. That makes AirPods by far the most widespread platform through which listeners can experience this type of immersive audio.
So, what exactly is binaural audio, and how can a fancy set of earbuds, headphones, speakers, or even a beamforming sound bar deliver it? And is state-of-the-art Dolby Atmos cinematic immersive surround sound desirable when it comes to music listening? Read on to discover the exciting new world of spatial audio. It might just be the next big thing.
A Binaural Past
In 1986, I was a doctoral student studying music composition at the University of California, Los Angeles. Composition dissertations are typically written under the guidance of your faculty panel and involve large instrumental resources – a chamber orchestra or full symphony orchestra. A visit to the section of the music library that houses past dissertations reveals an entire shelf of oversized bright red scores with gold text on the spine – compositions that sadly were never performed. My dissertation is there too. But unlike the others, during my final dissertation defense, the entire faculty panel donned sets of headphones and listened intently for 18 minutes to a binaurally recorded composition titled Morphism IV for tape. I recorded, mixed, and presented the whole piece in 3D binaural sound. The panel was suitably impressed, and I was granted my Ph.D.
At the time, I was already an active recording engineer. I had a small studio at my home, owned a Nagra IV-S portable reel-to-reel machine, and made countless recordings of recitals, concerts, and performances meant for release on compact disc. This was before the era of inexpensive, portable digital recording. I brought in a couple of studio condenser microphones, mounted them on a stereo bar, hoisted them 12 feet in the air just in front of the ensemble, and captured the performances on my stereo Nagra.
In 1994, Newport Classics, a record company based on the East Coast, hired me to record the Pasadena Symphony using a Neumann KU-81 binaural microphone. It was the same stereo microphone I had used at UCLA. Called "Fritz," the Neumann KU-81 microphone is a rubber human head with two accurately formed "pinnae," or outer ears, one on each side. Behind those ears are two high-quality condenser microphones. When it is used to capture audio or music, listeners using headphones experience the world as Fritz hears it – including all of the dimensionality. Sounds seem to come from the left, right, up, down, and even behind you. Historically, binaural sound has been used quite effectively to immerse you in a realistic sound field – something stereo and even 5.1 surround systems simply cannot accomplish.
If you want to hear immersive audio, there are lots of binaural recordings available on YouTube, and sites like HeadFi.org discuss them regularly. Put on your headphones and take a listen. It's really quite remarkable.
How We Hear 3D Sound
I've viewed a number of YouTube videos and read more than a few explanations on how we hear in 360 degrees. Some get it right and others don't have a clue. Humans have only two ears, but somehow our brains manage to create a fully immersive 3D model of our environment. Wouldn't it be great if technology could deliver a completely convincing sonic model of a live concert or allow music to flow all around us? It turns out that a variety of current technologies can pretty much do it.
There are three key parameters that our ears and brain use to pinpoint the location of a sound in 3D space. And it's the small differences in these parameters, as experienced by our two ears, that our brains use to locate a sound. The three parameters are: distance, time, and timbre or filtering.
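To put a rough number on the time parameter, a common textbook simplification, the Woodworth spherical-head model, estimates how much later a sound reaches the far ear than the near ear, based on the angle of the source. The sketch below does not come from any of the products discussed here. The head radius (8.75 cm), the speed of sound (343 m/s), and the function name are all assumptions chosen for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed, air at about 20 degrees C


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate arrival-time difference between the ears, in seconds.

    Woodworth model for a distant source and a rigid spherical head.
    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        itd_us = interaural_time_difference(azimuth) * 1e6
        print(f"{azimuth:>2} degrees: about {itd_us:.0f} microseconds")
```

Under these assumptions, a sound directly to one side reaches the far ear a bit over 0.6 milliseconds after the near ear, and a sound straight ahead reaches both ears at the same time. Differences that small are what the brain works with.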
A few years ago, I worked with a close friend on a crowdfunding campaign for a sound bar that was capable of delivering spatial audio without requiring headphones. It was called YARRA 3DX. The San Diego-based company raised over $1,100,000 for this amazing beamforming sound bar. I was largely responsible for the campaign. I came up with the name, built the website, created the logo, wrote the copy, and produced a YouTube animation called "How 3D Audio Works." While I don't endorse the product anymore for non-technical reasons, the video is pretty good at explaining how we hear in 3D.
HRTF
HRTF stands for Head-Related Transfer Function. It describes how a sound wave is modified by the head and outer ear on its way to the eardrum. Those modifications are unique to each individual because no two heads are identical, and the shapes of our pinnae are as unique as fingerprints. HRTF measurements have been conducted on thousands of individuals and supply the raw data for research into spatial location.
To optimize 3D audio effects through signal processing, equipment manufacturers should ideally use the coefficients of our own measured HRTFs. There have been efforts made to do personalized measurements using smartphone applications. A user takes a series of photographs or a video, and a clever algorithm produces an HRTF. I've seen this used in pitch videos and marketing for a variety of high-end in-ear monitors and headphones. The focus is on personalizing each listener's experience.
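In the time domain, an HRTF becomes a pair of short filters, one per ear, usually called head-related impulse responses (HRIRs), and applying them to a sound is a convolution. The sketch below shows only that basic operation. The two impulse responses are toy values made up for illustration, not measured data, and the function names are hypothetical. Real HRIRs run to hundreds of coefficients per direction, and real processors use much faster convolution methods.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source with a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses, NOT measured data: a source on the listener's left
# reaches the left ear first and louder, the right ear later and quieter.
TOY_HRIR_LEFT = [1.0, 0.0, 0.0]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.5]

if __name__ == "__main__":
    click = [1.0, 0.0, 0.0, 0.0]
    left, right = render_binaural(click, TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
    print("left ear: ", left)
    print("right ear:", right)
```

Swapping in a listener's own measured coefficients for the toy values is, in principle, what personalization amounts to: the processing stays the same and only the filters change.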
Smyth Research "Room Realiser"
Smyth Research is a small audio company, based in Ireland, founded and operated by two brothers. These guys have accomplished something truly remarkable when it comes to replicating the immersive experience of listening in an actual "room" through headphones, combined with their own 3D audio headphone processor. They manage this astounding feat because they measure their customers' HRTFs in the spaces that they recreate. I know this because the AIX Studio main room was among the best places to have yourself measured. Before I moved my five B&W 801 Matrix III speakers and TMH "Profunder" subwoofer out of my 30' x 25' x 14' mixing room, Smyth Realiser customers would fly across the country to be measured in the studio. A gentleman flew in from Boston in the morning, got measured, and flew home in the evening of the same day. Word had gotten around that owners of the Smyth "Room Realiser" could walk away with my $250,000 studio on a small SD card.
They've designed and manufactured two versions of their "Room Realiser," the A8 and the more recent A16, which was successfully funded on Kickstarter a few years ago. What makes the Smyth boxes unique in my experience is the custom HRTF they measure and the active motion tracking that they accomplish with an IR transmitter placed on top of the headphones. When you move your head to either side, the location of the sound sources remains fixed. The sounds don't move with the movement of your head.
This emulates the way we hear the real world, and until Apple announced that their new AirPods Pro would adopt a similar strategy, few others had incorporated motion tracking in their designs. Apparently, the accelerometers and gyroscopes in the latest AirPods (Pro, Max, Gen 3) make this possible, by allowing them to track the movement of your head. They'll also track the position of your phone or tablet to keep the origin of the sound perceptually locked to the screen you're holding.
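The geometry behind keeping a sound fixed in the room is simple: whatever angle the head turns through, the processor renders the source at the opposite offset. The sketch below illustrates that idea for left/right rotation only. It is not Apple's or Smyth's implementation, and the function names and the sign convention (positive angles to the listener's right) are assumptions.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap an angle to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def source_azimuth_relative_to_head(source_room_deg: float, head_yaw_deg: float) -> float:
    """Angle at which to render a source, given where the head is pointing.

    Both inputs are measured in the same room-fixed frame, with positive
    angles to the listener's right. Only yaw (turning left/right) is handled.
    """
    return wrap_degrees(source_room_deg - head_yaw_deg)


if __name__ == "__main__":
    # A singer fixed straight ahead in the room (0 degrees).
    for head_yaw in (0.0, 30.0, -45.0):
        rendered = source_azimuth_relative_to_head(0.0, head_yaw)
        print(f"head turned {head_yaw:+.0f} degrees -> render singer at {rendered:+.0f} degrees")
```

Turn your head 30 degrees to the right and the singer is rendered 30 degrees to your left, so she appears to stay put. Locking the sound to a phone or tablet would be the same calculation with the screen's direction standing in for the fixed source.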
Granted, none of this technology is arising from a vacuum. The 3D audio technology being added to the AirPods Pro and other consumer devices like the Audeze Mobius follows a lot of previous experiments in spatial audio – some successful, some less so – but it seems we're reaching a moment in time when it finally works and is attainable by the average audio enthusiast. The question is, are you excited about the potential, or are you skeptical based on past experience with antecedents of this new technology?