Equal Entry Accessibility Consultant James Herndon is presenting Live Audio Descriptions for 360-Degree Video: Best Practices at CSUN 2020. As a preview to his presentation, he asked me to review one of the illustrative examples he will be using in the presentation.
Watching videos is one of the most challenging tasks for blind people. Without help, chances are we won’t be able to visualize and fully grasp what the content is about. To address this, audio descriptions are added to the video.
An audio description is a modified version of the video’s original soundtrack. It adds information that helps blind people understand the unspoken parts of a video, such as movements that are not explained audibly. In a nutshell, it adds context to visual information, allowing blind viewers to imagine and visualize the scene.
However, technology keeps evolving. Before, there were only 2-dimensional videos, which can easily be given audio descriptions. With the arrival of 360-degree videos, which are often experienced through Virtual Reality / Augmented Reality (VR/AR), it is a completely different story.
VR is a technology that produces a simulated environment experienced through sensory stimuli. It provides a true-to-life, real-time experience that allows the user to become immersed in and interact with a 3D world. Given its real-time nature, the spontaneity of the user’s movement makes it more challenging to add audio descriptions.
VR Video of Van Gogh’s “The Starry Night”
To get a better feel for this 360-degree video, I walked through the actual experience. The default listening experience was only background music, so the content was totally inaccessible to me.
Next, I listened to a video clip of Vincent Van Gogh’s “The Starry Night” painting prepared by James Herndon with two separate audio descriptions for two different views.
One view was the front-facing, or the “optimal” view. It allows the viewer to experience most of the sights available in the video. Here is a brief clip:
The other view is facing backwards:
The optimal view is one I consider to be the straightforward perspective. The audio description experience is very similar to what I get when watching a typical video that is audio described. It was easy for me to visualize what was happening in the video. I was able to imagine the scenes and was able to put myself in the experience. The details are perfectly outlined, and it painted a clear picture of the landscape of the painting.
The experience with the backward view is distinctly different from the optimal one. Had I not heard the description for the optimal view, I could not have easily visualized the events in the video. Since it faces backward, the more detailed scenes portrayed in the previous view are absent, such as the view of the village and the outside façade of the house. I needed to confirm this with a sighted person, and true enough, what I imagined matched what the sighted person described.
If I were to choose, the optimal view is definitely preferable. The way it was described makes more sense for a totally blind viewer like me. I admit that having once been sighted was an advantage. The use of more complex colors such as ultramarine was not an issue for me, and I could vividly paint the picture in my head. This is not the case for people who are born blind.
To add more value to my observations, I asked some friends who were born blind to also take a look at the video and the descriptions. Most of them didn’t immediately get the concept of 360-degree videos. Even for me, it only made sense because I was briefed beforehand that it is a VR video. For them, listening to the audio description as is, without prior instructions that this is a VR video, it was just a typical video with a typical audio description. As a common denominator, all of us agreed that the optimal view is better and easier to understand.
If we talk about the video and the audio description per se, the experience I described above works. However, if we judge this as a VR video with audio description, here are major observations that completely change the perspective of blind viewers like me:
- The first question that came to mind was how a built-in audio description can simultaneously describe the video if I move and change views or paths. For example, if I am initially moving forward then I suddenly decide to face backwards, how will the recorded audio description relay to me the real-time changes?
- I realized that, for blind people, a 360-degree video with audio description is no different from a typical 2D video. If an audio description is built in, it will be limited to just one view.
- It is important to inform the blind user beforehand that the video is a 360-degree video. It is also a must to explain how VR works, because people who are born blind are most likely unfamiliar with what a VR video looks and feels like.
The real-time and spontaneous nature of 360-degree videos makes it very challenging to add built-in audio descriptions. At this point in time, considering all the technologies we currently have, I personally think that the only way for a totally blind person to perceive a 360-degree video is to have a sighted person describe it live as it plays. Another option is to pre-record different audio descriptions for all the different views.
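To make the pre-recorded option more concrete, here is a minimal sketch of how a player might pick a description based on the direction the viewer is facing. It is only an illustration under stated assumptions: the two views (front and back), the 180-degree ranges, the file names, and the names `DescriptionTrack` and `track_for_yaw` are all hypothetical and do not come from James Herndon’s example or from any real player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionTrack:
    """One pre-recorded audio description covering a range of viewing directions."""
    label: str
    start_deg: float  # inclusive start of the yaw range, in degrees
    end_deg: float    # exclusive end of the yaw range, in degrees
    audio_file: str   # hypothetical file name, for illustration only


# Assumed setup: yaw 0 is the front-facing ("optimal") view, 180 is facing backwards.
TRACKS = [
    DescriptionTrack("front", 270.0, 90.0, "front_description.mp3"),
    DescriptionTrack("back", 90.0, 270.0, "back_description.mp3"),
]


def track_for_yaw(yaw_deg: float, tracks=TRACKS) -> DescriptionTrack:
    """Return the description track whose range contains the viewer's yaw."""
    yaw = yaw_deg % 360.0  # normalize, so -10 becomes 350 and 450 becomes 90
    for track in tracks:
        if track.start_deg <= track.end_deg:
            inside = track.start_deg <= yaw < track.end_deg
        else:
            # The range wraps past 360 degrees (for example 270 -> 90).
            inside = yaw >= track.start_deg or yaw < track.end_deg
        if inside:
            return track
    raise ValueError(f"No description track covers yaw {yaw_deg}")


if __name__ == "__main__":
    for yaw in (0, 180, -10):
        print(yaw, track_for_yaw(yaw).audio_file)
```

A sketch like this only chooses which recording applies to a view. It does not solve the harder problem raised above: what should happen to a description that is already playing when the viewer suddenly turns around.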
There may still be challenges at this point for VR technologies, but on a brighter note, it is comforting to know that there are at least ways available for blind people to experience these VR videos. It may not be exactly the same as it is for sighted people, but what matters most is that it is already possible.