Game Audio Level 1: The Basics of Sound for the Interactive Age

You’ve probably been hearing a lot about game audio these days. There is no doubt that games keep on growing in popularity as gamification continues to permeate our everyday lives.

From amazing new virtual reality (VR) and augmented reality (AR) experiences to massively multiplayer online games (MMOGs), the gaming revolution is still in full force.

Maybe you’ve read articles about “middleware” or “adaptive audio” or watched various videos of sessions large and small with behind the scenes reports from all types of games.

Learning sound for games means covering a lot of ground, and much of it may be confusing or hard to contextualize at first without a good primer like this one.

Fortunately, at the Game Audio Institute (GAI), we’ve been teaching the art and craft of how to create and implement sounds for games and interactive experiences for nearly a decade, and are excited to share this series of articles covering the exciting, dynamic, and ever-changing world of game audio.

In this hands-on series, we’ll be covering topics like audio middleware, sound for VR/AR games, and game engines, using our classroom tested materials and methods. We hope to be able to draw the curtain back on the inner workings of how sound is handled, let you experience a bit of the overall process yourselves, and whet your appetite for going further.

The Fundamentals

So what is game audio? Simply put, it is the art, science and technology of producing and configuring sounds for any kind of game or interactive media.

Unless you’ve been living in a concrete bunker in an undisclosed location for the last 20 years or so, you probably know that games are incredibly popular worldwide—a roughly $123–135 billion market, according to recent surveys.

A chart illustrating gaming revenue throughout the years.

The gaming industry is a huge mega-business, bigger than movies, books or music, and it’s been steadily growing over the years. Along the way, the industry has experienced a number of ups and downs, shifts in focus and other dynamic changes.

Change is ongoing and will always be a part of what it means to work in games, and this means that professionals can take pride in being part of a field that often pushes the envelope of what is possible with technology, and is in many ways still in the process of defining itself as an art form.

That’s all well and good, you might say, but how does audio for games even work? First, let’s consider how games differ from linear media, like film or television.

Photo of the word "linear".

The key word here is predictability.

In a film, if a character goes into the dark spooky castle and opens the door at 13 minutes and 22 seconds, you can easily create or obtain a sound of a creaking door and place it on the timeline in your favorite DAW at that exact point.

Once you synchronize it correctly with the film itself, it will always play at the same time. In other words, it’s predictable—you know exactly when it will happen. Because of that predictability, we can simply mix all of our sounds and music down to the output format we need: stereo, 5.1 surround, or even Dolby Atmos. The result will be one or more files that play with 100% reliability in sync with the visuals.

That’s usually not the case in games. A game is often like watching a film acted out several times and from different angles, with the characters making decisions at a different point in time each time a scene is played. In a game, a character could conceivably wander around forever on the grounds of a spooky castle and never go into the castle itself! So we have to be prepared for nearly any outcome.

Photo of the words "Non Linear".

How could you successfully create sounds for a medium in which you don’t know when in time (or even if) a particular action is going to happen? How can you trigger these sounds reliably? The answer is that you need to throw away the idea of using only time as a basis for organizing sound and simply concentrate on the actions themselves.

The key is to focus on how and not when.

Let’s think of our spooky house situation from the angle of the action involved. Let’s assume that at some point, the character is going to come up to the house and open the door. It doesn’t matter when. So let’s list it as an action like this:

Action #001: Spooky Door Opening → Play ‘spookydoor.wav’

Now we’ve defined our action. But how do we trigger it? In all games, there’s something that’s going to cause the door to move. Most likely this will be some kind of animation code, or physical interaction where you can push open the door, or maybe you just click on it with your mouse or joystick button.

Whatever the trigger for the animation of the door is, you simply link that action with the sound file of a creaky door opening and voila! Instant synchronization. Whenever the character opens the door, the sound will play.
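This action-to-sound linkage can be sketched in a few lines of code. The following is a minimal illustration, not taken from any real engine; the action names, file names, and `play_sound` stand-in are all hypothetical:

```python
# A minimal sketch of action-based (rather than timeline-based) sound triggering.
# Action names and file names here are illustrative only.

SOUND_EVENTS = {
    "spooky_door_open": "spookydoor.wav",
    "footstep_gravel": "footstep_gravel.wav",
}

played = []  # stands in for the engine's audio output

def play_sound(filename):
    # In a real engine, this would hand the file to the audio system.
    played.append(filename)

def on_action(action_name):
    # Whenever the game fires an action, look up its sound and play it.
    sound = SOUND_EVENTS.get(action_name)
    if sound is not None:
        play_sound(sound)

# The door can open at any time -- the sound stays in sync regardless.
on_action("spooky_door_open")
```

Notice that nothing in this sketch mentions time at all: the sound plays whenever the action fires, which is exactly the shift from “when” to “how.”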

However, this seemingly simple shift in thinking requires that each and every sound in the game exist as a separate item. We can’t use a full mixdown of the sound effects and music into a single file.

Full mixes in interactive experiences are the exception, not the rule. Everything has to be mixed and mastered independently—and in many cases in real time. Furthermore, we have to be really organized with all of these audio files. (For immersive AAA adventure games, there can be hundreds of thousands of files!) This way, the programmer or integrator can track the function of each sound asset.

This also means that how the audio is triggered in a game is intimately tied up with how the game is designed, and each game is a complete universe unto itself. It’s got its own set of rules and regulations for behavior and interactivity, and any change in terms of game design can significantly affect how the sounds are triggered.

Let’s have a general simplified look at a process by which a sound gets into a game. Note that this is a generalized view of how things work and may not apply precisely to every game development situation.

In the case of playable games, you’ll usually be starting with a document or description of the game project that answers many of the most important questions, including what type of game it is, what the audience is, what hardware platform(s) it’s supposed to run on, and so on.

This document is referred to as a GDD (Game Design Document). You’ll also usually have a separate TDD (Technical Design Document), as well as an audio asset list that tells you what sounds are required for the project. You may even be able to get a very early copy of the game (usually called a “developer build”) that you can run on the intended platform, whether that’s an iPad, PC, mobile device or gaming console.

Remember that this is a nonlinear process, so you will often be bouncing out individual sound effects, voice-over, and music stems by category. Then, you’ll have to organize and name each sound based on a file naming convention that you and the game developer have discussed and decided on. You will then want to further master, edit, and trim each file so that there’s no delay in triggering, the base audio levels are appropriate, and, if the sound file is supposed to loop, it does so with no pops or clicks. Seamless looping is a must in game audio!
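To make the naming step concrete, here is one possible convention sketched in code. The `category_description_take.wav` scheme is purely an example; the real convention is whatever you and the developer agree on:

```python
# Sketch of one hypothetical file naming convention:
# <category>_<description>_<take>.wav (e.g. sfx_door_creak_01.wav).
# The exact scheme should come from the agreement with the developer.

def asset_filename(category, description, take=1):
    """Build a consistent, sortable asset file name."""
    return f"{category}_{description}_{take:02d}.wav"

print(asset_filename("sfx", "door_creak"))       # sfx_door_creak_01.wav
print(asset_filename("mus", "castle_theme", 3))  # mus_castle_theme_03.wav
```

A scheme like this keeps related assets sorted together and makes each file’s function obvious to the programmer or integrator at a glance.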

After the designing, bouncing, naming, and editing, there are typically two approaches to getting a sound into a game engine.

Chart illustrating steps to success.

The first method is often called “throwing it over the fence”. In this scenario, the sound designer or composer hands in the asset list and the assets to the programmer (also called an integrator) who will then put them into the game by writing code or setting up configurations that trigger the audio at the right moment and in the right manner inside the game engine itself.

In this situation, good communication is a must because the composer or designer has very little control over how the sounds will be played, and must trust that the programmer will do the right thing. Usually, test builds are made so that the sound design can be refined and revised as well. In this case, the overall impact of the sound design depends significantly on the programmer’s understanding of how to make all of those assets work correctly.

Sound Implementation with Middleware

In the second method, we use tools called audio middleware, which give the audio designer expanded control over the audio behavior in the game. Some industry-standard and well-known middleware tools are FMOD Studio, Wwise, Fabric, and Elias. Some companies will even go ahead and create their own audio middleware in-house.

By using middleware, the composer or sound designer has magically become an audio implementer as well. Communication is still key however, because many sounds in the game will still require the audio designer to work with the integrator or programmer who sets up the triggers or parameters in the game.

In this scenario, the person responsible for the audio works with the asset list and names the files like before, but instead of all of the individual audio files needing to be sent to a third party, you can use the middleware to export all of the sounds in a bank file, which contains all of the raw audio data needed, as well as a configuration file indicating where each audio asset is in the bank.
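The bank-plus-configuration idea can be illustrated with a toy example. This is not how any particular middleware actually packs its data; it is just a sketch of the concept, with made-up asset names and byte strings standing in for real audio:

```python
# Toy sketch of a sound bank: raw audio data packed into one blob,
# plus a configuration manifest recording where each asset lives.
# Asset names and byte contents are invented for illustration.

def build_bank(assets):
    """assets: dict of name -> raw audio bytes.
    Returns (bank_bytes, manifest)."""
    bank = b""
    manifest = {}
    for name, data in assets.items():
        manifest[name] = {"offset": len(bank), "length": len(data)}
        bank += data
    return bank, manifest

bank, manifest = build_bank({
    "spookydoor": b"\x00\x01\x02\x03",
    "footstep":   b"\x04\x05",
})

# The manifest tells the engine where "footstep" sits inside the bank.
print(manifest["footstep"])  # {'offset': 4, 'length': 2}
```

The game engine then needs only the bank and its manifest, rather than hundreds of loose files, and can pull any asset out by name at runtime.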

Because you can test the sound balancing and interaction directly in the middleware—and often in the game engine itself—the result can be an even more seamless, dynamic and exciting audio experience that immerses the game player more fully. Of course it’s also a lot more work for you, the audio designer.

Summing it Up

OK, that’s a rough overview of the overall process of the basic actions that go into putting sounds into a game environment. Keep in mind that the methods described above are just a couple of common approaches, and they do not apply to every game development situation.

Sound for games is not a single discipline, but a combination of disciplines that all come together to create a complete sound design. You may just be interested in one area such as music, but even then, a complete understanding of the overall system is essential to your sounds actually working in the game.

Game audio experts have to look at issues of sound design and integration critically in order to create satisfying aural landscapes. We encourage you to think of yourself as an audio explorer. Who knows—you just might find that the technology of games is far more fascinating than you ever imagined!

In our next article we’ll talk about how game engines think about sound, and walk you through a complete step-by-step process, so you can trigger your own sounds in a real game!


Find out more about GAI below and register for the hands-on summer intensive workshop at San Francisco State University June 25-28th, 2019. Hurry as seats are limited and fill up fast!

Meet Steve and Scott

We started the Game Audio Institute based on our experience in the industry and the classroom, where we have been developing curriculum and teaching both graduate and undergraduate courses. To really get involved with the game audio industry these days, it is essential that you understand what’s under the hood of a game engine, and be able to speak the language of game design and implementation. You don’t have to be a programmer (although that can’t hurt), but a general understanding of how games work is a must.

Unfortunately, all too often these days, people are missing out on developing a solid foundation in their understanding of interactive sound and music design. The GAI approach is fundamentally different. We want composers, sound designers, producers and audio professionals of all kinds to be familiar and comfortable with the very unique workflow associated with sound for games and interactive environments. We have many resources available for budding game audio adventurers to learn the craft, some of which we’ll be using in this series. To find out more about our Summer Hands-On Game Audio Intensive Workshop, Online Course, Unity Game Lessons or our book, just click over to our website.

Logo of the Game Audio Institute.

Steve Horowitz is a creator of odd but highly accessible sounds and a diverse and prolific musician. Perhaps best known as composer and producer of the original soundtrack to the Academy Award-nominated film “Super Size Me”, Steve is also a noted expert in the field of sound for games. Since 1991, he has worked on literally hundreds of titles, including an eighteen-year tenure as audio director for Nickelodeon Digital, where he had the privilege of working on projects that garnered both Webby and Broadcast Design awards. Horowitz also holds a Grammy Award in recognition of his engineering work on the multi-artist release True Life Blues: The Songs of Bill Monroe [Sugar Hill], ‘Best Bluegrass Album’ (1996), and teaches at San Francisco State University.

Scott Looney is a passionate artist, soundsmith, educator, and curriculum developer who has been helping students understand the basic concepts and practices behind the creation of content for interactive media and games for over ten years. He pioneered interactive online audio courses for the Academy of Art University, and has also taught at Ex’pression College, Cogswell College, Pyramind Training, San Francisco City College, and San Francisco State University. He has created compelling sounds for audiences, game developers and ad agencies alike across a broad spectrum of genres and styles, from contemporary music to experimental noise. In addition to his work in game audio and education, he is currently researching procedural and generative sound applications in games, and mastering the art of code.

Article originally posted by Sonic Scoop March 18th, 2019