With many employers switching to remote work and schools moving online, it’s no surprise there’s an explosion of video and conference calls. With that in mind, I’m thankful for two things: One, I’m not currently attending classes. Two, I’ve been a freelance remote worker for more than a decade.
Before I reveal why, here’s a challenge for you. The next time you attend a webinar or a conference call (video or audio), turn off the sound. Or go to YouTube. Pick a video. Any video. Mute the video and watch it. No cheating with captions! What’s the experience like for you? All you can do is rely on visuals. Occasionally, you can figure it out. Most likely, you’ll have more questions than answers. This is often the case in my experience. It’s the same with a webinar or video conference.
Webinars and Video Conferences without Sound
Not every presenter provides slides with text and visual cues to help you follow along. Lipreading video? Fahgettaboudit. It’s impossible with a jerky video and the sound falling out of sync with the mouth movements. I’ve seen webinars with nothing but a title page (no slides) and all audio. If you’re lucky, maybe there’s a text chat on the side. Often, the chat leads to more confusion as people comment on the presentation.
Now imagine your volume control is busted. No sound comes through the speakers. Or you’re in a public place and you don’t have headphones with you. That’s how it is for me every day as a person born profoundly deaf. Yes, I have a cochlear implant. Listening still depends on lipreading. I can’t close my eyes and understand what someone says. Occasionally, I can make out a word here and there. Like my name. It’s hardly enough to follow a conversation.
If I were a student or part of a company that switched to remote work for the first time, I fear I’d be left behind. Sure, I could use the relay service for conference calls. However, it’s imprecise, which leads to more confusion. The relay operator often can’t tell who is speaking. Imagine my surprise when an invitation from Thomas Logan arrives in my inbox. He asks if I would attend a virtual reality (VR) webinar. And here’s the kicker. He says it’ll be captioned. Yes! Yes! Yes!
Exploring the Mozilla Hubs World
The VR webinar takes place in Mozilla Hubs. Hubs is an experimental web-based chatroom in a mixed reality environment. It’s platform-agnostic. You can watch videos, chat with others, play with 3D objects, and attend webinars.
Open your web browser on any device, go to the URL for the specific hangout, and you’re in. No logging in or plug-in needed. How cool is that? The only time Hubs requires creating an account is to pin items and gain access to other features. I go to the VR webinar and enter my name. This helps people find me, especially Mirabai Knight. She’s the person with speedy fingers captioning the webinar.
The chatroom opens. Right away, I notice the ground looks like four-paned windows lined up side by side. A blue matrix. Robot-style avatars pop up around the scene with their human’s name above. I spend a good amount of time getting familiar with the environment and the keyboard controls.
Remember that Hubs is an experimental environment. It’s a work in progress. The Hubs team welcomes feedback. Initially, I feel overwhelmed because I’ve never been in a VR setting like this before. I clumsily move around working my way toward the presentation.
Connecting with the Live Captions
In the text chat, Thomas and Mirabai greet me. They provide the URL to the Streamtext service where I can view the full captions. The captions appear briefly in the VR scene and then unexpectedly disappear. The team behind the scenes works to restore the captions. Meanwhile, I’m reading the captions in a second tab of my web browser as the next image shows.
One problem. If I scroll back to read the captions I missed, the captions jump back to the most current. It’s like when you’re trying to scroll through a text conversation on your phone. If the other person types something, it takes you to that and you lose your place. You ask the person not to type for a minute while you go back and read something. That’s what happens with the captions. Except I can’t tell Mirabai to stop typing, of course.
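One common way to avoid that jump is to follow new captions only while the reader is already at the bottom, and to leave the view alone once they scroll back. The following is a minimal sketch of that logic, not how Streamtext or Hubs actually works. The `CaptionPane` class, its pixel-based fields, and the `tolerance` value are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaptionPane:
    """Hypothetical model of a scrolling caption view (all sizes in pixels)."""

    viewport_height: int
    line_height: int = 20
    tolerance: int = 4  # how close to the bottom still counts as "at the bottom"
    lines: list[str] = field(default_factory=list)
    scroll_top: int = 0

    def max_scroll(self) -> int:
        """Largest valid scroll offset for the current content."""
        content_height = len(self.lines) * self.line_height
        return max(0, content_height - self.viewport_height)

    def is_pinned_to_bottom(self) -> bool:
        """True when the reader is viewing the newest captions."""
        return self.max_scroll() - self.scroll_top <= self.tolerance

    def scroll_to(self, position: int) -> None:
        """Simulate the reader scrolling; clamp to the valid range."""
        self.scroll_top = min(max(0, position), self.max_scroll())

    def add_line(self, text: str) -> None:
        """Append a caption line; follow it only if the reader was at the bottom."""
        was_pinned = self.is_pinned_to_bottom()
        self.lines.append(text)
        if was_pinned:
            self.scroll_top = self.max_scroll()
        # Otherwise keep scroll_top unchanged so the reader keeps their place.
```

With this approach, a reader who scrolls up to catch a missed line stays there while the captioner keeps typing, and scrolling back down to the bottom resumes live following.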
After a bit of fiddling, I finally make my way to the presentation screen. A few minutes later, the captions show up. As we all wait for the presentation to start, I’m trying to take screenshots through my app. But it doesn’t work. Neither does the web browser’s screenshot tool. Unfortunately, that means switching to my phone for screen captures. The phone’s images won’t be as good as those taken with a screenshot tool. Something is better than nothing, right?
Experiencing a VR Webinar with Live Captions
Right away, I’m frustrated. Things constantly block parts of the presentation screen and captions. The big one is the Mozilla menu at the top with the mute button. The “You are muted” message won’t go away. It keeps covering parts of the presentation as shown in the next image.
Then, everything goes kablooey! Emojis appear, multiple windows and menus surface, and the whole screen is covered by a thousand objects. Well, maybe not a thousand. But a lot. If you think the following screenshot is messy, know that right before the freeze, objects cover almost the entire screen.
So, I switch to the tab with just the captions to see what’s going on. Shortly after, everything stops moving. D’oh! The web browser froze. I shut it down and reboot it.
So. Many. Things. On the Screen
It doesn’t take long before I resort to telling strangers what to do. Talk about awkward. Occasionally, I have to ask people to move their avatars because they’re blocking the presentation or captions. Just call me Ms. Bossy! The following image shows an avatar hiding part of the captions.
Another problem pops up. Literally and figuratively. At the bottom of the screen is a box for entering text for the chat. As soon as you submit your text, it appears in a box at the bottom. The next person’s message follows behind, pushing the previous text upward. Repeat. Soon, the entire chat overlaps the middle of the presentation and captions.
It makes for a wearying experience in trying to read the boxes and the captions while looking at the presentation. The chat boxes are transparent. It’s like reading in a fog. A better option would be to create a white chat box on the side of the screen with the text showing up in black. Or vice versa: black box with white text.
Remember the menu with the mute button? I’m constantly adjusting my location and view to bring the presentation below it, all the while fearing the screen will freak out again. The avatars constantly move around. It’s distracting. People with certain conditions struggle with a lot of motion, so this will be a problem for them. Hubs’ menu contains preferences, but I cannot find any that address this.
For example, one option provided is, “Only show nametags while frozen.” I’d prefer, “Don’t show nametags.” Then I can turn it off and on at will. Better yet, I’d like to, “Hide all avatars and nametags.” Another option would be to select objects you want to view and hide the rest. Then, I could select the presentation, captions, and chat box. In response to this, the Hubs team asks me to select the captions, press the spacebar, and select the magnifying glass. As soon as I let go of the spacebar to select the magnifying glass, it disappears. It doesn’t stay long enough for me to click it.
Follow Those Captions
Notice in all the images that the caption box and presentation are crooked. I couldn’t get in a position to get them to appear straight. I would like to block out the entire matrix and avatars. It’d be great to view just the presentation screen, caption box, and chat box. That’s it.
After the main presentation ends, we break out into groups. One goal for this is to see how the captions would work. The Hubs team moves the caption box as far away from the original presentation as possible. Certified Realtime Captioner Mirabai and I both follow those captions. The challenging part about this setup is that Mirabai can hear distant conversations from the other groups. That makes it harder for her to hear our group’s conversation and tune out the rest. There’s less noise with the presentation gone and fewer avatars around. Still, the caption box is crooked. It drives my need for symmetry bonkers!
Since 1983, TV and movie captions have appeared with a blackish-dark grey background and white text. Rarely have they strayed from that. Briefly, the captions have appeared in yellow, green, or violet. You can bet that didn’t last long! Captions are boring for a reason. They’re supposed to be barely noticeable. Their job is to capture the audio with minimal distraction. The VR webinar’s blue box with yellow text is not the most comfortable read.
I’ve experimented with different shades of black and white captions. During those experiments, people with dyslexia and other differences provided feedback. They find off-black and off-white colors work better. Instead of #000000 (black) and #FFFFFF (white), I use #242424 (slightly off-black) for the background and #FFFFFD (slightly off-white) for the text.
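For anyone who wants to check a caption color pair like this one, here is a minimal sketch that computes the contrast ratio using the relative luminance formula from the WCAG 2.x definition. Contrast ratio is only one measure of readability and is not something the feedback above was based on. The function names are made up for illustration.

```python
def _channel_to_linear(value_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = value_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#RRGGBB'."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected a color like '#RRGGBB', got {hex_color!r}")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (
        0.2126 * _channel_to_linear(red)
        + 0.7152 * _channel_to_linear(green)
        + 0.0722 * _channel_to_linear(blue)
    )


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    print(round(contrast_ratio("#242424", "#FFFFFD"), 1))  # off-black and off-white
    print(round(contrast_ratio("#000000", "#FFFFFF"), 1))  # pure black and white
```

By this measure, the off-black and off-white pair comes out at about 15.5:1, softer than the 21:1 of pure black on white yet still well above the 7:1 that WCAG sets for its strictest (AAA) level for normal-size text.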
Near the end of the webinar, I hear my name. Or so I thought. But the captions don’t show anything. After a minute, the captions kick in again. Sure enough, my name appears in the captions so I did hear it! Everyone thinks I’ve left since I didn’t respond. I let them know the captions lagged. Not sure how that happened considering most of the audience had gone home.
I must apologize to Thomas! To be honest, I can’t recall too much of what he talked about in the presentation. That’s not his fault. I juggled a lot of things during the webinar: managing my view, reading the captions, asking attendees to please move when they block the captions, keeping up with the transparent chat texts, tinkering with the menu and preferences, and answering questions.
For captions to be effective in a VR setting, distractions need to be limited, and the captions need to be clear, readable, and straight while staying attached to the presentation. A chat box out of the way on the side would help immensely. I know Hubs wants to build a world where we feel like we’re surrounded by the people joining that world. I see the value in that. In a presentation situation, however, it interferes with the important part of the experience: the content of the presentation.
It’s fantastic that Thomas and the Hubs team are working to create an accessible setting. And this is a good start for those who need captions. Want to experience the webinar and captions yourself? Check out the recording of the A11yVR Meetup.
Oh, I got to see Mr. Spock! He was one of the avatars in attendance. Live long and prosper, y’all!