2015-04-17

With the recent rise of virtual reality (VR), there is a growing interest in fully spatialized 3D audio. Several plugins are available for implementing 3D audio, and choosing between them can be difficult, especially if you’re tackling this technology for the first time.

While it may seem that all 3D audio plugins do the same thing, there are several factors to consider when choosing the right tool for your project, such as ease of use, performance, sound quality, and level of customization.

The goal of this article is to perform an objective and thorough review of five leading 3D audio plugins: 3Dception from Two Big Ears, AstoundSound RTI from GenAudio, Phonon 3D from Impulsonic, RealSpace 3D from VisiSonics, and Oculus Audio SDK. I’ll cover their features, compatibility, and pricing, as well as any unique aspects of each plugin. I’ll also report on my personal experience of integrating them into a Unity project, and provide a downloadable interactive demo app that will allow you to audition the plugins, along with video walkthroughs, and performance test results.

This resource is targeted towards sound designers, audio implementation specialists, developers, and anyone interested in using 3D audio in their project, and I hope that people find it helpful!

Why 3D audio?

Unless you’ve been living under a rock, you probably know that VR is on fire right now. There is a massive groundswell of enthusiasm, and several big companies have entered the arena with hardware announcements. However, there is an understanding that the primary catalyst for bringing VR into our living rooms will be high-quality, compelling content.

Arguably, the whole point of VR is immersion – that feeling of being there. In real life we get input from five traditional senses, but in VR we are (so far) limited to two: sight and sound. It seems rather obvious that it would be near impossible to achieve that coveted immersion with visuals alone, while neglecting the other 50% of available sensory input. Yet currently this is exactly the case in most VR content (surprise!).

The good news is that developers are slowly starting to realize the importance of sound. So there is a need and desire to make it happen. The question is how.

If I started from the basics of 3D audio, this article would be about five times as long. So I’m going to skip right over that, and instead point you to some helpful information to get you started, including this excellent talk by Brian Hook, Audio Engineering Lead at Oculus, and this wiki page.

Skeletons vs 3D Audio

This overview would not be fair or complete if I didn’t personally put each plugin to the test and compare the results. While most developers do provide their own demos, those are not exactly “apples to apples”, since they all use different scenes and sounds. This is how the idea of a demo app offering a common experience for comparison was born.

I decided to use Unity 4 as my engine for several reasons:

all the plugins in question have Unity integration

it would be the most straightforward implementation and a very likely use case

the Unity Asset Store has a wealth of free assets that I could use to put my scene together

Side note: in Unity 4 all of these plugins require a Unity Pro license. But in Unity 5 (which is now officially out), everything in the demo can be done with the free license. Now is also a good time to thank Unity Technologies for supplying a temporary Pro license for the purpose of this article.

The idea was to make the app accessible to as many people as possible, so they could audition different plugins in a VR environment and draw their own conclusions about the sound. I decided to use the Cardboard SDK, since Google Cardboard is currently the most accessible and affordable way to experience VR.

The result of this little experiment is Skeletons vs 3D audio VR – an interactive VR experience available on the Google Play store. So if you have an Android phone, you can go ahead and download the app here. Even if you don’t have the actual Cardboard to experience VR, you can still hear 3D audio as you move the phone around.

And if don’t have an Android device, I made playthrough videos of each plugin in action.

We’re not so different, You and I

Before I get into details about each plugin, and in order to avoid repeating myself, let me briefly cover what they all have in common.

3D audio spatialization
This is the basic premise of spatializing the sound, or placing it in space with azimuth (left / right / front / back) and elevation (up / down) cues. All the reviewed plugins provide this functionality. (For a rough idea of what these cues boil down to on the engine side, see the short sketch at the end of this list.)

Adequate documentation

Some documentation resources are better organized than others, but in all cases they provide the information needed to get the job done.

Great support and passion to improve the product
I have met several of these developers in person, and I’ve communicated with all of them online. Everyone has been very honest and forthcoming with information about their plugin (including limitations), and very open to critique and suggestions. A few bugs that I encountered were fixed in a matter of hours. Also, in cases where a full-featured plugin wasn’t available as a free download (GenAudio, Two Big Ears, and VisiSonics), the developer provided me with a temporary full evaluation license for the purpose of writing this article.
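Before moving on, here is the short sketch promised above, to make the spatialization point a bit more concrete. Roughly speaking, azimuth and elevation cues boil down to the direction from the listener to the source, expressed in the listener’s local frame, which then drives the HRTF processing. This is a minimal illustration using only standard Unity API – none of the reviewed plugins expose their internals like this, and all the names below are mine:

    using UnityEngine;

    // Hypothetical helper: computes the azimuth and elevation of a source relative
    // to the listener. This is just the geometry behind the cues the plugins use.
    public class DirectionCues : MonoBehaviour
    {
        public Transform listener; // usually the main camera / AudioListener
        public Transform source;   // the sound emitter

        void Update()
        {
            // Direction to the source in the listener's local space.
            Vector3 local = listener.InverseTransformPoint(source.position);

            // Azimuth: angle around the vertical axis (0 = straight ahead, +90 = right).
            float azimuth = Mathf.Atan2(local.x, local.z) * Mathf.Rad2Deg;

            // Elevation: angle above or below the horizontal plane.
            float horizontal = new Vector2(local.x, local.z).magnitude;
            float elevation = Mathf.Atan2(local.y, horizontal) * Mathf.Rad2Deg;

            // Logging every frame is only for illustration.
            Debug.Log(string.Format("Azimuth: {0:F1}, Elevation: {1:F1}", azimuth, elevation));
        }
    }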

Now let’s review each plugin in detail.

3Dception (Two Big Ears)

Version tested: 1.0.0

Two Big Ears had their plugin 3Dception available in beta for some time, and they recently announced the first stable release.

From my experience speaking and working with the team, I get the impression that they take 3D audio very seriously, and they are determined to keep pushing their product forward.

3Dception offers a wide range of features and customizations, and the team has put a lot of emphasis on performance and workflow. They have also been actively writing and speaking about 3D audio, and they have a blog with some interesting resources.

Unique Features

Room modeling with spatialized reflections. You can create a room object, match it in position and size to your actual in-game room, and define its reflective properties. A couple of other plugins in this list provide room modeling, but spatialized reflections are a feature unique to 3Dception at the moment, meaning that the listener can move freely inside the room, and the reflections reaching the listener will change accordingly.

Delayed time of flight. Direct sound is delayed based on distance, which enhances distance cues (see the quick arithmetic sketch after this list).

Many features and adjustments exposed in Unity Editor

Support for ambisonics. I will not get into ambisonics, since it’s a bit outside the scope of this article, but I will say that I’ve played around with ambisonic recordings, and if done right, they can produce stunning results. So support for ambisonic playback is a welcome feature.

Environment simulation (limited). Currently there are options to change the world scale and speed of sound.

Extensive API. Anything that can be tweaked in Unity Editor is also available through the API.
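And here is the quick arithmetic sketch behind the delayed time of flight feature mentioned above. This is not the 3Dception API, just the physics the cue is based on; the speed-of-sound value is the environment setting that 3Dception exposes:

    // Not the 3Dception API – just the physics behind the cue.
    public static class TimeOfFlight
    {
        public static float Seconds(float distanceMeters)
        {
            const float speedOfSound = 343f; // m/s in air at ~20°C; adjustable in 3Dception
            return distanceMeters / speedOfSound;
        }
    }

    // TimeOfFlight.Seconds(34f) is roughly 0.1, i.e. a source 34 m away is heard ~100 ms late.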

Upcoming Features

Geometry-based reflection modeling (the plugin will generate reflections based on actual in-engine objects)

Occlusion and obstruction

More advanced environmental simulation through parameters such as temperature and humidity

There are some playable demos available, and here is a video demo of some upcoming features in action.

Implementation

I highly recommend that you read the documentation before implementing 3Dception into your project. There is a required “setup” step in order to get everything working properly. It’s handled by a drop-down menu option in Unity, which creates the necessary objects in the scene, and makes some additions to the script execution order.

After the initial setup you will mostly be dealing with two components: TBE_Room and TBE_Source.

TBE_Room takes care of the room modeling feature. You drop the provided prefab in your scene and match it in size and position to your in-game room. Until we get efficient room modeling based on in-game mesh geometry, I feel like this is currently the most intuitive solution.

As you can see in the screenshot, there are several options to modify reflective properties of the environment. There are a few Reflection Presets, or you can tweak the individual reflection values manually. Show Room Guides will make your TBE_Room visible in the scene at all times, and Show Help will add some helpful instructions to each field.

As a side note, TBE_Room provides only some reflections in order to help spatialize the sound. The documentation recommends combining TBE_Room with a standard Unity Reverb Zone for better results. I did not add a Unity Reverb Zone in this specific test.

Once your room is ready to go, you add the TBE_Source component to your sound emitter, or if you already have a Unity AudioSource component in place, you can use TBE_Filter, which will add spatialization to your existing 3D sound. This option is handy if you’re converting an existing project with lots of sounds.
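If you’re doing that kind of conversion, a short editor script can save a lot of clicking. The sketch below uses only standard Unity API plus the TBE_Filter component mentioned above; I’m assuming the component’s script class is literally named TBE_Filter, so treat this as a starting point rather than a drop-in tool (it needs to live in an Editor folder):

    using UnityEngine;
    using UnityEditor;

    // Hypothetical bulk-conversion helper: adds a TBE_Filter next to every existing
    // AudioSource in the open scene. Assumes the component class is named TBE_Filter.
    public static class AddTBEFilters
    {
        [MenuItem("Tools/Add TBE_Filter to all AudioSources")]
        static void AddFilters()
        {
            AudioSource[] sources = Object.FindObjectsOfType(typeof(AudioSource)) as AudioSource[];
            foreach (AudioSource source in sources)
            {
                // Skip emitters that already have the filter (or add your own filtering
                // here for 2D sources like music that you don't want spatialized).
                if (source.GetComponent<TBE_Filter>() == null)
                {
                    source.gameObject.AddComponent<TBE_Filter>();
                }
            }
        }
    }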

The TBE_Source component provides most of the standard Unity AudioSource parameters, as well as some additional settings, such as Rolloff Factor (how fast the sound will drop off) and Max Distance Mute.

In the Room Properties and Advanced sections, you have some options to disable more advanced spatialization features – useful for performance optimization.

Overall the implementation went smoothly, with no issues or crashes. The documentation is well-organized and clear, and aside from the first setup step, workflow is intuitive. I used a few API features, and they worked as expected.

Results

(Embedded video: walkthrough of the 3Dception demo scene.)

I feel that the results are very convincing. There is a certain clarity of sound, and reflections sound natural, even without the addition of a Unity Reverb Zone.

You can find performance test results, compatibility and pricing information at the end of the article.

AstoundSound (GenAudio)

Version tested: 1.1

AstoundSound has been around for some time now – according to this video, the technology patent was filed back in 2004. But with the recent interest in VR, it looks like the company has again become more active. For instance, they recently made the technology available as FMOD and Wwise plugins.

In addition to a real-time 3D audio plugin, GenAudio provides several other solutions for working with 3D sound.

There is the Expander (enhances pre-mixed tracks), Fold-down (maintains surround sound mixes over stereo), and 3D RTI (the real-time 3D audio solution which I’m going to focus on).

Unique Features

Spatialization over multiple output configurations (headphones, stereo speakers, and surround setups). From my admittedly limited understanding, this is achieved by using a different approach to spatialization. According to GenAudio, they use a psychological sound localization model in addition to the physics-based one. The ability to achieve sound spatialization over stereo speakers may not be relevant to VR specifically, but it’s a great plus if you’re supporting a game or experience across multiple platforms.

Ability to crossfade between spatialized and non-spatialized sound. This feature requires a bit of scripting, but it could be very useful in certain situations.

Upcoming Features

Addition of 3D elevation cues to surround-sound setups

Implementation

The plugin consists of two components: AstoundSound RTI Listener (add it to your main listener in the scene), and AstoundSound RTI Filter (add it to the sound emitter with an existing Unity AudioSource component).

The AstoundSound RTI Listener has a few handy settings, which let you trade some spatialization quality for performance, such as Enable Reverb (can be used for distance cues in lieu of a Unity Reverb Zone) and Enhance Distance Effect (uses a better-quality reverb algorithm).

Spread Threshold is a nice option – if a sound’s Spread parameter is higher than the threshold, it will bypass the plugin spatialization altogether. This will also respond to changes in Spread in real-time, which allows for some creative solutions, especially with ambient sound sources.
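Since Spread is a standard Unity AudioSource property, sweeping it across the threshold is also one way to script the crossfade between spatialized and non-spatialized sound listed under Unique Features. Here is a minimal sketch using only standard Unity API; the threshold value is an assumption and has to match whatever is set on the AstoundSound RTI Listener:

    using UnityEngine;
    using System.Collections;

    // Hypothetical example: gradually raises a source's Spread above the plugin's
    // Spread Threshold, at which point AstoundSound hands the sound back to regular
    // panning. Start it with StartCoroutine(FadeToAmbient()).
    public class SpreadCrossfade : MonoBehaviour
    {
        public AudioSource source;
        public float spreadThreshold = 90f; // assumed; must match the RTI Listener setting
        public float fadeTime = 2f;

        public IEnumerator FadeToAmbient()
        {
            float start = source.spread;
            float target = spreadThreshold + 10f; // end just past the threshold
            for (float t = 0f; t < fadeTime; t += Time.deltaTime)
            {
                source.spread = Mathf.Lerp(start, target, t / fadeTime);
                yield return null;
            }
            source.spread = target;
        }
    }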

Culling settings are there to limit the number of voices that will be fully spatialized, and to decide how those voices are selected (options are Volume and Distance). Any sound sources beyond that maximum will fall back to the regular stereo panning method.

In the AstoundSound RTI Filter component you can override global settings on a per-sound basis, as well as set priority for culling. Other 3D sound settings (such as min and max distance) are set on the Unity AudioSource object.

Be sure to read the documentation regarding the Filter Audio Stop checkbox, especially if you are triggering sounds via script. Basically, if you’re playing any sounds via the PlayOneShot method, this should be unchecked (this relates to a limitation in the Unity API).
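For readers less familiar with Unity scripting, this is the kind of call that warning applies to (standard Unity API; the names are placeholders):

    using UnityEngine;

    // A typical scripted one-shot. If the emitter also carries an AstoundSound RTI
    // Filter, the Filter Audio Stop checkbox should be unchecked for this pattern,
    // per the documentation.
    public class FootstepTrigger : MonoBehaviour
    {
        public AudioSource source;  // the emitter with the RTI Filter attached
        public AudioClip footstep;  // placeholder clip

        public void PlayFootstep()
        {
            source.PlayOneShot(footstep);
        }
    }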

Overall, the implementation was smooth and fast, and documentation was clear and to the point. One suggestion for improvement would be exposing some APIs for controlling settings in real-time.

Results

(Embedded video: walkthrough of the AstoundSound demo scene.)

I think that the results are quite convincing. The spatialization is very good, and overall sound has a clarity to it. Reverb sounds notably different to that in 3Dception, but I still find it convincing. Also, I did get the effect over stereo speakers, albeit not as clearly as over headphones.

You can find performance test results, compatibility and pricing information at the end of the article.

Oculus Audio SDK

Version tested: 0.9.2 (alpha)

Oculus recently took note of the current lack of proper positional audio in VR content, and has been making a big push to change the situation. The first step towards that was licensing the RealSpace 3D technology for their Crescent Bay prototype.

And just recently at GDC 2015 they’ve taken it a step further and released their own 3D audio plugin, Oculus Audio SDK (currently in alpha), based on the same licensed technology.

What sets this plugin apart is that it’s absolutely free to use, which is a pretty cool move from Oculus.

Since it is a free offering, there isn’t a fancy marketing website or a lot of demo videos. But there is a new Audio section in the Oculus Forum, and it’s a great place to ask questions and receive support. So far the forum has been active, and support has been very good.

Implementation

The implementation was mostly smooth, with only one hiccup. My main dev machine still runs Mountain Lion (OS X 10.8.5), and the Audio SDK refused to run on it, so I had to dust off my old laptop with a Yosemite install.

There are two main components in play: OSP Manager, which gets placed in the scene, and the OSP AudioSource, which gets added to a sound emitter with the Unity AudioSource component attached.

Let’s take a look at the OSP Manager. Oculus’ plugin offers two types of spatialization algorithms: HQ (high quality) and Fast, which doesn’t provide Early Reflections. The documentation strongly advises against using the HQ algorithm on mobile platforms, in order to minimize CPU usage. As you can see, I heeded their advice and checked Use Fast.

With the Gain setting, you can adjust the overall volume of the mix. As I noticed with all the reviewed plugins, spatialization tends to bring down the overall volume, so the ability to do a blanket gain bump is a nice touch, and something that I would like to see in other plugins.

One more thing to keep in mind is that the virtual room is currently tied to the main Listener, and will move and rotate with the Listener, so even when you move around, you are always in the center of the virtual room. My understanding is that an independent virtual room object is in the works.

OSP AudioSource. The Bass Boost is supposed to compensate for bass attenuation, but according to the documentation this feature is still unpolished (as is the priority feature). Also, you’re supposed to use Play on Awake in this component, instead of the Unity AudioSource one, to prevent an audio hiccup at the start of a scene.

The current limitation is 32 sound sources using the Fast algorithm, and 8 sound sources using the HQ one. Here you have an option to mix and match: enable the global Use Fast setting, then override it on the 8 most important sound emitters by checking Use Fast Override.

It’s not quite clear what the Frequency Hint dropdown does. I set my music and speech sounds to the Wide option, as suggested in the documentation, but didn’t notice a significant difference.

Overall, the implementation was pretty straightforward, and the documentation was quite clear, especially considering that this is an alpha release. It definitely felt like an alpha though, with quite a few things still “in progress”. I felt that the Unity Editor interface could be more intuitive.

Results

(Embedded video: walkthrough of the Oculus Audio SDK demo scene.)

I am not quite sold on the spatialization with this plugin. I do get the azimuth cues (left / right / front / back), but not so much with elevation (up / down) or distance. Also the sound seems to shift unnaturally between left and right ears with a slight turn of head. Lastly, when compared to 3Dception and AstoundSound, the overall sound quality is noticeably degraded and has an odd mid-rangey character to it. It is especially evident with the piano sound.

You can find performance test results, compatibility and pricing information at the end of the article.

Phonon 3D (Impulsonic)

Version tested: 1.0.2

Impulsonic is a relative newcomer to the field, and they are taking a unique approach to the 3D audio problem. They decided to break it into several smaller problems – HRTF-based spatialization, reverb, and occlusion/obstruction – and tackle each one independently. Hence, the three separate product offerings.

Phonon 3D is used to position sounds in space using HRTF algorithms, and it does no more and no less than that.

Phonon Reverb allows sound designers to create physically modeled reverb based on game geometry. The process involves tagging in-game geometry with acoustic materials and then “baking” the reverb – that is the process of calculating reverb tails at multiple locations in the scene. The baking process is done to optimize performance, by offloading real-time calculations to an offline process. Then in real-time, pre-calculated reverb will be applied to sounds based on listener position in the room.

Phonon SoundFlow is still in development, but it will eventually provide geometry-based occlusion/obstruction. The sound paths will also be “baked” during the design phase, and then applied to the listener’s position in real time.

There is also Phonon Walkthrough, which performs real-time sound propagation (including all the above features). It is in beta at the moment, and from my understanding it is quite resource-intensive. So unless you are working on a very specific type of experience and have the machine to run it, it’s currently not a viable solution for the majority of projects.

You can find both interactive and video demos of all Impulsonic products on their demo page.

Unique Features

Phonon Reverb takes a unique and interesting approach to setting up reverb in the scene. It can certainly be very useful even outside of VR, since it replaces the need for multiple Reverb Zones. Just tag your mesh geometry with materials, and “bake” the reverb. However, it is still not applicable to the subject at hand (3D audio), since at the moment it’s not compatible with Phonon 3D.

Upcoming Features

Compatibility between Phonon 3D and Phonon Reverb

The release of Phonon SoundFlow beta in Spring 2015

Support for more platforms and integrations

Implementation

In my tech demo I would have liked to use both Phonon 3D and Phonon Reverb in tandem, but unfortunately they’re still not compatible, so I just focused on Phonon 3D by itself.

This plugin follows the already familiar implementation method. You add the Phonon 3D Listener component to your main listener in the scene, and Phonon 3D Source component to a sound emitter with an existing Unity AudioSource.

As you can see, the customization options are quite limited. You can tweak the maximum number of spatialized sources, maximum distance past which the sources will not be spatialized, as well as set the priority of any given source.

The documentation was clear and easy to follow, and the implementation would have been straightforward had I not encountered a few bugs. So even though I was working with the first official release, it still felt a bit like a beta, and left me feeling like the plugin is not quite ready for prime time.

On the positive side, the developer has been very responsive and quick to fix any issues, so I’m hopeful that they will tighten it up in no time.

Results

(Embedded video: walkthrough of the Phonon 3D demo scene.)

The spatialization is convincing as far as azimuth and elevation cues. However, similar to Oculus’ plugin, there’s an unnatural shift between left and right ears, and overall sound quality also seems to be slightly degraded.

You can find performance test results, compatibility and pricing information at the end of the article.

RealSpace 3D Audio (VisiSonics)

Version tested: 0.9.9 (beta)

VisiSonics, a tech company founded by the University of Maryland Computer Science faculty, has more than just a 3D audio plugin in the works. At the moment they also have offerings for 3D sound capturing and authoring tools, and future products include tools for analysis of listening spaces, surveillance, diagnostics, and more. The company has the benefit of an exclusive license to the extensive research and patents developed by the founding members.

For the purpose of this article, I will focus on their real-time 3D audio plugin, RealSpace 3D Audio.

Unique Features

HRTF settings: in Unity Editor you are able to select from a few HRTF presets, as well as tweak some measurements, such as head and torso size.

Extensive API: Most options in the Unity Editor are available via API calls.

Upcoming Features

Occlusion/obstruction

Exposing HRTF settings via scripting, so that those selections can be made by the player

Plans to develop their own HRTF database, since current HRTF databases don’t have any information about sound coming from below the listener’s waist level

You can find some videos and interactive demos over at the RealSpace 3D Audio demo page.

Implementation

Integrating this plugin was a bit of a chore. While it still follows a similar implementation structure as the others, it breaks a few paradigms in unexpected ways, which hindered my already established workflow for this project.

As with the other plugins, there are two main components in play: RealSpace 3D Audio Listener, which gets added to the main listener in the scene, and RealSpace 3D AudioSource, which replaces the Unity AudioSource component.

RealSpace 3D Audio Listener provides options to tweak HRTF settings, and also set the world unit measure type for correct environment sizing.

RealSpace 3D AudioSource replaces the Unity AudioSource component; however, it doesn’t provide all the options of the Unity AudioSource, such as Pitch and Doppler.

Also, it changes some of the established concepts of the AudioSource component, which ended up hindering my specific workflow. I brought up my concerns with the developer, and I won’t go into much detail (it would not be relevant to anyone else), but here are a couple of small changes I’d like to see with this component:

make the Audio Clip not a required field

add the ability to dynamically pass in a loaded AudioClip object to the RealSpace 3D AudioSource

These two small changes would have made my life a lot easier.
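For context, this is the kind of pattern those changes would enable – loading a clip at runtime and handing it to the source, which works fine with a standard Unity AudioSource (sketch below, standard Unity API only; the clip path is a placeholder). With the current RealSpace 3D AudioSource, the clip has to be assigned up front instead.

    using UnityEngine;

    // Standard Unity pattern: load a clip at runtime and assign it to the source.
    // This is what I'd like to be able to do with the RealSpace 3D AudioSource as well.
    public class DynamicClipLoader : MonoBehaviour
    {
        public AudioSource source;

        void Start()
        {
            // "Sounds/SkeletonGroan" is a placeholder path under a Resources folder.
            AudioClip clip = Resources.Load("Sounds/SkeletonGroan") as AudioClip;
            source.clip = clip;
            source.Play();
        }
    }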

Another feature that I found a bit counter-intuitive and time-consuming is setting the environment properties for each RealSpace 3D AudioSource. Not only do you need to define the environment properties for each source individually, you have to repeat this for each source any time you’d like to make a change to the environment. There are a couple of other issues with the current handling of the environment, but from my understanding it is getting a complete overhaul in the next iteration, so I will not go into details here.

Lastly, the documentation could use some work. There is a lot of information (the documentation is 44 pages long), but a lot of it feels unnecessary (how to download and install Unity), and the essential stuff is hard to find. I would suggest breaking up the documentation into two or three files: the actual plugin documentation (how to set it up and what each field does), API reference, and perhaps a separate document discussing the demo scene.

Results

Unfortunately, I cannot comment on the sonic results at the moment. I’ve encountered some performance issues, so I wasn’t able to run the full scene without stuttering. The good news is that the developer has been alerted to the performance issues, and they have since been able to find and resolve a few culprits. I’m hoping to update my demo app with the working RealSpace 3D plugin in the near future.

You can find performance test results, compatibility and pricing information at the end of the article.

Performance Testing

Let me preface the performance test results by saying that I did not attempt any kind of performance optimization on any of the plugins. As you have seen from the screenshots, some of these plugins provide options to trade some quality for performance. I believe you can gain significant performance improvements with few audible differences if those options are used in a smart way.

But for me, the point of this exercise was to get a rough idea on how far I could push things and how these plugins perform “out of the box”, so I left the default settings on most everything.

There are a total of 15 Audio Sources in the scene, and up to 14 voices may be playing simultaneously (the scene is mostly scripted with some random elements). I tested the performance of each scene on a mobile device (Samsung Note 4), as well as my laptop (MacBook Pro 2013) using the Unity Profiler.

Unfortunately, Unity Profiler doesn’t provide a way to get the average or median numbers over time, short of writing a custom profiling script. So I opted to use the “eyeball” method – just testing the app, and watching the Audio CPU numbers to get the rough range. So I’ll be the first to admit that my approach wasn’t exactly scientific.

But it was helpful nonetheless, since I was able to identify performance issues with two plugins and alert the developers.
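For what it’s worth, a bare-bones version of such a custom script is only a few lines. The sketch below uses only standard Unity API and tracks overall frame time; as far as I know, the Audio CPU figure shown in the Profiler isn’t exposed to script, which is part of why I stuck with the eyeball method.

    using UnityEngine;

    // A bare-bones sampler: tracks the average and worst frame time over a run.
    // It doesn't isolate audio DSP cost (that figure lives in the Profiler window),
    // but it's enough to catch the kind of stuttering I ran into.
    public class FrameTimeSampler : MonoBehaviour
    {
        float total;
        float worst;
        int frames;

        void Update()
        {
            float dt = Time.deltaTime;
            total += dt;
            worst = Mathf.Max(worst, dt);
            frames++;
        }

        void OnApplicationQuit()
        {
            if (frames == 0) return;
            Debug.Log(string.Format("Avg frame: {0:F1} ms, worst: {1:F1} ms over {2} frames",
                1000f * total / frames, 1000f * worst, frames));
        }
    }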

So without further ado, here is how the plugins performed (click image to go to spreadsheet with additional information).

NOTE: Checkmarks symbolize the features of each plugin in the test setting. Checkmarks highlighted in green are default plugin features (always on; they cannot be toggled). Checkmarks highlighted in yellow are features that can optionally be disabled, but were enabled in the test app. N/A means that the feature is not available in a plugin.

As you can see, Phonon 3D and RealSpace 3D had some issues with the mobile performance, while Oculus Audio SDK had the best performance numbers, followed by 3Dception and AstoundSound.

Even More Data

As if this still wasn’t enough information, I also thought it would be helpful to put together some charts of the current OS/platform compatibility and supported audio engine integrations of the plugins. This information is also available as a spreadsheet online, so click the images below to get to the latest information.

Integrations

Compatibility

Verdict

Hear it to believe it (or was it the other way around)?

Congratulations, you’ve made it through the boring parts! (Or more likely you just skipped to the end)

I tried to be as objective as possible, and provide my honest feedback and opinion on the plugins. Still, at this point you might be asking “So which is the best plugin?”

Well, if there is one thing we all learned from #TheDress (and also this interesting bit), it’s that perception is a subjective and mysterious thing. Our brains use different cues when trying to interpret information, and in different people some cues may be more dominant than others.

My main goal was to provide tools and as much information as possible, in order to help you make the decision.

So here are the tools again, all in one place:

Skeletons vs 3D audio – free VR app on Google Play. Download it to your Android device and use it with a VR viewer, such as Google Cardboard

Playlist with walkthrough videos of each plugin, plus the “Stereo Only” video for comparison

Online spreadsheet with performance test results and other useful data

And since I am likely one of very few people out there who tried working with all of these plugins, I suppose my opinion might be worth something.

At the time of writing, 3Dception and AstoundSound would be my top picks in terms of sound quality, performance, stability, and ease of use. 3Dception feels like the most mature plugin with lots of advanced features, and AstoundSound has the advantage of working over additional hardware setups. Plus I think that both of these plugins sound great.

The other plugins in the list all have a lot of potential, so I would recommend keeping an eye on them.

Oculus Audio SDK is free to use, but has some work to do in terms of sound quality, user interface, and stability.

Phonon 3D needs to address performance, and then continue integrating its three products so that they can be used in tandem.

RealSpace 3D has the foundation of great research, but it has work to do in terms of performance and usability.

In Conclusion

This is an exciting time, because we now have several viable options for implementing 3D audio in VR projects. Each developer has their own unique approach to the problem, and each one is passionate about providing the best solution. In any case, competition is great, and it will only push the technology further, which will in turn allow us to create better content, and that’s a win for everyone.

I would like to thank all the developers for their support and for providing these incredible tools. Also, I’d like to wrap up with a wish-list that would be my personal Holy Grail of 3D audio technology, and I’d love to see these features available someday.

Better room modeling: I really like 3Dception’s current approach to room modeling. Until we get fast geometry-based environment modeling, this is a solid approach which produces convincing results – I feel like the other developers could take a cue from it.

Even better room modeling: in-engine geometry-based real time reflections. I would love to be able to tag my geometry in engine with acoustic materials, and have the plugin create proper reflection and absorption in real time. Understandably, this is a tough nut to crack, but there is more than one way to crack it, so I’m hopeful that we’ll have a workable solution soon.

Occlusion / obstruction: real time, and based on world geometry.

Directionality of sound (sound cone). This is especially relevant for human speech. I know that there is already a way to achieve this with middleware and coding, but I would love to see a robust plug-and-play solution.
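For reference, the coding approach usually boils down to attenuating the source based on the angle between its facing direction and the direction to the listener. Here is a minimal sketch with standard Unity API (all the names are mine, and a real solution would also want some frequency shaping rather than just a volume change):

    using UnityEngine;

    // Crude directivity ("sound cone"): full volume on-axis, quieter behind the source.
    // This is the DIY route – a plug-and-play version inside the plugins would be nicer.
    public class SoundCone : MonoBehaviour
    {
        public AudioSource source;
        public Transform listener;
        public float rearAttenuation = 0.3f; // volume multiplier directly behind the speaker

        void Update()
        {
            Vector3 toListener = (listener.position - transform.position).normalized;
            // 1 when the listener is straight ahead, -1 when directly behind.
            float facing = Vector3.Dot(transform.forward, toListener);
            // Remap [-1, 1] to [rearAttenuation, 1].
            source.volume = Mathf.Lerp(rearAttenuation, 1f, (facing + 1f) * 0.5f);
        }
    }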

Gain bump: This is a really useful feature in Oculus Audio SDK, and I would like to see it implemented in other plugins. Taking this a step further would be to somehow incorporate it with loudness metering.

Support for microphone input: taking the microphone input, applying the acoustic properties of the environment to it in real time, and playing it back to the user. Again, this is likely already possible, but not out of the box. Would love to see a plug-and-play solution.
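Unity can at least capture the microphone into an AudioSource out of the box, as in the sketch below (standard Unity API, with some unavoidable latency); whether a given plugin then handles that live, looping source gracefully – and applies the environment acoustics to it – is the part I’d love to see supported properly.

    using UnityEngine;

    // Routes the default microphone into an AudioSource so it can be positioned and
    // processed like any other emitter. Latency depends on platform and buffer sizes.
    public class MicToSource : MonoBehaviour
    {
        public AudioSource source;

        void Start()
        {
            // null = default microphone device; loop over a 1-second recording buffer.
            source.clip = Microphone.Start(null, true, 1, 44100);
            source.loop = true;

            // Wait until the mic actually starts delivering samples before playing.
            while (Microphone.GetPosition(null) <= 0) { }
            source.Play();
        }
    }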

Virtual speaker setup: I’d love to be able to take a multi-channel (say, stereo) sound, still treat it as one sound (Play, Stop, etc), but be able to position “left” and “right” speakers in the world independently.
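You can approximate this today by splitting the stereo clip into two mono clips and playing them in sync on two independently positioned AudioSources – but then every Play, Stop, and volume change has to be mirrored across both sources, which is exactly the bookkeeping I’d like the plugins to hide. A rough sketch of the DIY route (standard Unity API only, written against the Unity 4-era AudioClip.Create signature):

    using UnityEngine;

    // DIY "virtual speakers": splits a stereo clip into left/right mono clips and plays
    // them in sync on two independently positioned AudioSources. Note that GetData only
    // works if the clip's import settings keep uncompressed sample data in memory.
    public class VirtualSpeakers : MonoBehaviour
    {
        public AudioClip stereoClip;     // must be a 2-channel clip
        public AudioSource leftSpeaker;  // placed wherever the "left" speaker should be
        public AudioSource rightSpeaker;

        void Start()
        {
            int samples = stereoClip.samples;
            float[] interleaved = new float[samples * 2];
            stereoClip.GetData(interleaved, 0);

            float[] left = new float[samples];
            float[] right = new float[samples];
            for (int i = 0; i < samples; i++)
            {
                left[i] = interleaved[2 * i];
                right[i] = interleaved[2 * i + 1];
            }

            // Unity 4 signature: name, lengthSamples, channels, frequency, 3D, stream.
            // (The 3D flag is deprecated in Unity 5.)
            AudioClip leftClip = AudioClip.Create("Left", samples, 1, stereoClip.frequency, true, false);
            AudioClip rightClip = AudioClip.Create("Right", samples, 1, stereoClip.frequency, true, false);
            leftClip.SetData(left, 0);
            rightClip.SetData(right, 0);

            leftSpeaker.clip = leftClip;
            rightSpeaker.clip = rightClip;

            // Start both on the same DSP timestamp so they stay sample-locked.
            double startTime = AudioSettings.dspTime + 0.1;
            leftSpeaker.PlayScheduled(startTime);
            rightSpeaker.PlayScheduled(startTime);
        }
    }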

I hope this resource was helpful, and I welcome any questions, comments, and corrections. Also I would love to hear what is on your wish-list of the perfect 3D audio solution.

Anastasia Devana is a freelance composer and programmer with a special interest in combining the two disciplines. Get in touch at www.anastasiadevana.com or on Twitter @AnastasiaDevana.
