Turning Sights into Sounds: The Art of Audio Descriptions

Image Description: Björk performing onstage at Háskólabio, April 12, 2018. She wears a mask that resembles an orchid flower, and a blooming red dress that looks as if it contains blood vessels. Photo from Emma Birkett (OLI Management).

What’s an Audio Description?

Audio description is an optional narration track intended for blind and low-vision consumers of visual media. A narrator talks through a presentation, describing what is happening on the screen or stage during natural breaks in the soundtrack, such as a musical sequence:

Or during pauses in dialogue:

In the United States, entertainment companies are legally obligated to provide audio descriptions like these. Major cable networks and large local affiliates must provide at least 87.5 hours of audio-described programming per quarter.

Movie theaters must provide audio descriptions via headphones. If moviegoers prefer to use their own equipment, then they can download apps (such as Actiview) that use their smartphone’s mic to sync with a movie and provide real-time audio descriptions, if available.

How Do People Write Good Audio Descriptions?

Translation is a tricky art, and translating visual things into verbal things is no exception. How much audio description is too much, or too little? If many things are happening at once — too many to describe — then which things should be emphasized? How do we decide what is significant and what is not?

As a digital accessibility consultant who is also a fiction writer, I love considering these questions and listening to audio descriptions on Netflix. The creativity and economy of language involved are fascinating, and I enjoy thinking about what makes an audio description good, or not so good.

To explore the challenges of the medium more fully, I decided to write and record some audio descriptions of my own. I then asked Sofia Gallo, a colleague who is blind, for some honest feedback on my work.

Audio Description for a Music Video?

In terms of visuals, music videos are some of the most heavily stylized media in existence. And yet, I have never seen an audio-described music video, nor have I ever heard of one. Perhaps this is because, for any given video, the song itself is ostensibly what matters most, and thus it isn’t necessary to provide audio description for what is more or less just decoration. However, given the limited amount of accessibility everywhere else, it’s more likely that there are so few audio-described music videos because, as is so often the case, it hasn’t occurred to sighted people that such an accommodation might be necessary.

Whatever the case may be, the shortage of audio-described music videos motivated me to attempt one.

Here is the first minute of Björk’s “The Gate” — with audio description:

What did my colleague Sofia think of it? When I sent the clip to her, this was her response:

The description is great – it created a very vivid picture in my head.

What did Björk look like (her hair or eyes)?

Did the magical creatures have a specific shape to them?

Good questions.

As with a lot of creative work, most issues with audio descriptions fall into one of two categories: the easily anticipated problems, and the easily overlooked problems.

For example, even though my description of the creatures is the longest one in the clip, I sensed that “magical creatures” was too vague even before I’d spoken with Sofia. “Magical cephalopods” might have sounded strange, but it would have been much more evocative, and it would have given Sofia a sense of their shape and movements.

On the other hand, because I had been so focused on Björk’s circumstances — her environment, her spectators, her costume — I neglected to fully describe Björk herself. I left out her physical features, facial expressions, warm demeanor, and everything else that a sighted person would take for granted.

Another misstep, which Sofia was too polite to point out, was my use of the term “we see.” I imagine that for people who rely on audio descriptions, it might be a frustrating thing to hear. And it isn’t relevant. The point of audio descriptions is to describe what’s happening onscreen, not what the dominant audience is experiencing.

Writing audio descriptions is deceptively simple, much like writing fiction. Many people confidently assume they can do it until they actually attempt it, at which point they realize how difficult it really is.

Lessons Learned

While I still have very limited experience, some of the lessons I have learned from studying and writing audio descriptions can be applied broadly, and I want to share them here.

1. People first, everything else second

The two demos I shared at the beginning of this article make a clear point: Music and silence are interruptible, but people (and animals) are not. They are the most important part of the show. Even if set pieces are very important to the story being told, and occasionally engulf the characters, it’s usually more helpful to focus on the people within the engulfing environment, so we know who is being affected by it.

2. Neutrality is helpful, but too much neutrality is unhelpful

Audio descriptions should probably avoid value judgments about what’s happening onscreen. As a describer, your goal should be to translate the visual to the verbal as neatly as possible and allow the listener to make their own value judgments.

But human voices don’t exist in a vacuum, of course. If someone is talking, we pay attention to their tone, volume, emotions, and dozens of other social cues. Your voice transmits certain values, and you should be mindful of this as you record narration or choose someone to record it for you. The describer’s voice should match the media and become a supporting cast member in the performance.

Ultimately, you don’t want a voice that negates or clashes with the media being described. For the Björk video, I chose to speak in a low, quiet voice so as not to obstruct the low, quiet music. I could have chosen to imitate Rob Halford from Judas Priest, but it might have been distracting.

Now What?

Going forward, I have three questions lingering in my mind.

First, what about visually overloaded media with fast-cut editing? Is it even possible to provide good, real-time audio descriptions for something like Scott Pilgrim vs. the World?

If audio descriptions are meant to provide an equivalent experience, then can that clip truly be considered a success? I’m skeptical.

Second, do the same principles of good audio description apply to live entertainment? In many theaters, people watching a play can receive audio descriptions via a wireless device. The description is provided live by a person in a booth that is acoustically insulated from the audience but offers a good view of the performance. If the principles for live audio description are different, then how so?

Third, has the abundance of podcasts affected how people think about audio descriptions? One would have to imagine the answer is yes, but how?

To learn more about what makes audio descriptions helpful or unhelpful, I’m going to volunteer as a describer for the Be My Eyes app. It enables people who are blind or have low vision to use their smartphone cameras and ask remote sighted volunteers for help distinguishing colors, navigating a new place, and anything else they might want assistance with.

I will report back in a few months. In the meantime, leave a comment and let me know what makes an audio description work well for you. Equal Entry is proud to do audio description work, and we strive to do it as well as possible.

James Herndon
Accessibility Consultant | Atlanta, GA


  1. I am most impressed with your ability to verbalize music. You credit your fictitious writing, I read imagination. Not only is an extensive vocabulary important but you also need your talent as a musician. It was inspiring—something I would never be able to do, something I would never have thought to do. I’m truly awed. Proud to know you!! Much love, AMA
