Links to all resources mentioned in this article are listed at the end of the page.
Shari Trewin started with a story about captions. In 2020, she was one of the accessibility chairs of a large virtual conference, which had a dedicated human captioner live-transcribing the sessions in a separate live feed.
During the conference, Shari spoke to a hard-of-hearing attendee who uses captions to supplement what he can hear. He said the live feed had a noticeable delay, so he also used the automated captions streamed through the conference provider. These appeared at the bottom of the slides and had less of a delay, but accuracy problems.
He also turned on the automated captions in his browser, which used a different speech-to-text engine, and supplemented that with a phone app that used a third speech recognition engine, capturing the audio playing from his computer and transcribing it for him. That’s four sources of captions for him to read, on top of watching the slides and the presenter.
Ideally, the web conferencing tool would have provided better support for the live captioner in the first place. In the situation he was in, the AI-powered captions were helping him access the conference, but it wasn’t a very usable experience. He carried a huge burden to manage his own accessibility.
Shari told this story because it illustrates some of today’s challenges and opportunities for artificial intelligence (AI) and accessibility. On the opportunity side, AI can be very empowering as it can support people as they independently access content and the world.
On the challenge side, the AI was exclusionary. It didn’t convey the words of people with atypical speech as well as it did for other people. In general, AI can be biased and it can make mistakes. This is an important factor when thinking about AI for accessibility.
Applications of AI in Accessibility
Here are some of the ways that AI is already being used or explored in accessibility.
Speech-to-text has many different applications. It can generate captions for video authors to edit or for consumers to use.
It can enable voice control for devices like smart home controllers and personal assistants, and voice dictation when typing is hard. As speech recognition improves, it’s becoming a mainstream way to control digital devices, so it’s essential that people with atypical speech aren’t excluded.
Hence, researchers at Google’s Project Euphonia investigated the potential of personalized speech recognition for people with atypical speech. They recorded many utterances from hundreds of people and built personalized models for 432 of them.
The researchers found that the personalized models could significantly reduce the word error rate (WER) of speech recognition, from 31% down to 4.6% in their research, a meaningful improvement. In some cases, the personalized models even outperformed a human transcriber who wasn’t familiar with that person and their speech.
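Word error rate, the metric behind those numbers, is the word-level edit distance between the recognizer’s output and a reference transcript, divided by the number of reference words. A minimal sketch:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word in a five-word reference: WER = 0.2 (20%)
print(wer("please turn on the lights", "please turn off the lights"))
```

A 31% WER means roughly one word in three is wrong, which is why captions at that accuracy are so hard to follow.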
These personalized speech models from apps like Project Relate look promising. Similar approaches might be taken to build personalized models for gestures, typing, or touchscreen interactions. Last year, A11yNYC hosted Pan-Pan Jiang, who talked about Project Relate.
Here’s an update since Pan-Pan’s presentation: the University of Illinois Urbana-Champaign has partnered with tech companies and organizations representing users to gather a large corpus of atypical speech data and further speech recognition research.
Another form of audio is environmental sound. Dhruv Jain and colleagues at the University of Washington and Google developed an app called ProtoSound that identifies sounds in the environment and displays a name for what it currently hears. For example, if someone is playing piano, the app shows “piano” along with the sound’s decibel level.
The app lets users make a few recordings of sounds from their personal environment and train a custom AI model. It also includes a library of pre-trained sounds, like a fire engine, that would be hard to track down and record yourself.
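Personalized sound recognition from a handful of recordings can be thought of, very loosely, as few-shot classification: reduce each recording to a feature vector, then label a new sound by its nearest class centroid. This toy sketch uses hand-made two-number “features” purely for illustration; it is not ProtoSound’s actual method, which works on real audio embeddings.

```python
import math

def centroid(vectors):
    """Average a list of equal-length feature vectors component-wise."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(examples):
    """examples maps a label to a few user-recorded feature vectors."""
    return {label: centroid(vs) for label, vs in examples.items()}

def classify(model, vector):
    """Return the label whose centroid is closest to the new sound."""
    return min(model, key=lambda label: math.dist(model[label], vector))

# A user records a few examples of sounds from their own environment.
model = train({
    "doorbell": [[0.9, 0.1], [0.8, 0.2]],
    "kettle":   [[0.1, 0.9], [0.2, 0.8]],
})
print(classify(model, [0.85, 0.15]))  # → doorbell
```

The design point is that only a few personal examples are needed per sound, which keeps the recording burden on the user low.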
AI for Images
AI models can identify objects in images. Apps such as Google Lookout and Microsoft Seeing AI act as a visual assistant, a talking camera that describes what it sees in the real world as well as on-screen content. These apps use different specialized AI models for different tasks, such as finding text, reading barcodes, or identifying currency.
To do this, they need clear, well-framed images. The Lookout app provides audio feedback to help users who are blind take good images, then applies image processing to improve sharpness. Its AI models are trained on images captured by blind people.
As with speech-to-text, it’s valuable to personalize these kinds of models: they can only recognize objects they’ve been trained to recognize. So researchers are working on methods for people to train their own object recognizers to identify the things that are important to them in their homes.
Reading and Writing with AI
Many people find reading a struggle because of dyslexia, low vision, or lack of fluency in the language. The Android Reading Mode app makes reading easier by providing a personalized view of the text you want to read. It reads the text aloud and can highlight each word as it goes, taking advantage of newer text-to-speech models with more expressive, natural voices that are easier to understand.
It also uses AI to identify what to read. A web page, for example, contains many elements that are not part of the main content, so the app can reduce clutter as you read.
Finally, looking ahead, large language models like ChatGPT have been in the news, impressing people with how they generate, rephrase, and summarize text. A Google research study looked at their potential in human-AI writing tools.
Working with people with dyslexia, the researchers explored a prototype email-authoring assistant that could generate outlines and suggest rewrites. The paper concluded that state-of-the-art AI is not quite ready to provide reliable writing support.
These are a few of the applications of AI that offer a lot of exciting opportunities. Unfortunately, AI models can have harmful biases.
Machine Learning Development Pipeline
First, here’s a general overview of the machine learning development pipeline. Every company’s and developer’s process will be different, but each phase offers opportunities both to introduce and to mitigate bias. The first phase is dataset development: machine learning models must have data to learn from.
Developers identify and develop datasets, and those datasets help determine what the model can and cannot do, which, as you can imagine, can introduce bias. Often developers start from existing datasets that fulfill their needs, then identify gaps and curate new datasets to fill those gaps.
The next phase is model development, where developers train the model on data from the previous phase and choose techniques for how the model will learn. These decisions can also introduce bias.
The next phase is evaluation and mitigation, where developers, researchers, and analysts evaluate whether the model does what they want it to do and create mitigations. The key point: benefits and harms can only be measured if they’re part of the evaluation. Whatever is not evaluated is not measured, and not mitigated.
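That point is why evaluations are often disaggregated by group: an aggregate metric can look fine while a subgroup is failing badly. A hypothetical sketch, with invented numbers:

```python
# 1 = correct prediction, 0 = error, for a hypothetical model.
results = {
    "majority group": [1] * 90 + [0] * 10,  # 90% accuracy
    "minority group": [1] * 4 + [0] * 6,    # 40% accuracy
}

# The aggregate score hides the subgroup failure...
everyone = [r for group in results.values() for r in group]
print(f"overall: {sum(everyone) / len(everyone):.0%}")

# ...which only becomes visible when accuracy is reported per group.
for group, r in results.items():
    print(f"{group}: {sum(r) / len(r):.0%}")
```

If the minority group’s results are never collected, even this disaggregated report cannot reveal the harm; what isn’t evaluated isn’t measured.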
Finally, the model is launched with ongoing impact assessments. This is the point where we learn about benefits and harms that were not anticipated, so it’s important to stay with this development cycle through maintenance to address them.
AI relies on averages. Recall the dataset development phase: many datasets reflect the most common data, not the most representative data. People with disabilities or accessibility requirements often aren’t in the datasets that are most easily available, and because machine learning relies on averages, this has negative implications for anyone outside the average, including people with disabilities.
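A toy illustration of the averages problem: a model that predicts the dataset mean has low error overall but a large error for the underrepresented cluster. All the numbers here are invented for illustration.

```python
# 95 people cluster near one value; 5 underrepresented people differ.
majority = [10.0] * 95
minority = [30.0] * 5
data = majority + minority

# A model that "relies on averages" predicts the mean for everyone.
prediction = sum(data) / len(data)  # 11.0

overall_error = sum(abs(x - prediction) for x in data) / len(data)
minority_error = sum(abs(x - prediction) for x in minority) / len(minority)

print(prediction)       # 11.0
print(overall_error)    # 1.9  -- looks acceptable on average
print(minority_error)   # 19.0 -- ten times worse for the minority
```

The aggregate error looks small precisely because the people it fails are few, which is the situation many disabled users are in.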
The next problem is that misclassification, in alt text for example, can provide wrong information. Groups of pixels in an image may not be labeled representatively in the training data. This often happens with women wearing professional uniforms, like firefighter or military uniforms, which are stereotypically associated with men; the model may learn by association that someone in uniform is male.
Another way this can happen is when certain labels are absent from the data. Take an image of someone who identifies as Black, nonbinary, and disabled: the person is not correctly identified because most training data includes only binary gender labels, so nonbinary people are excluded. The same happens with disability.
The last form of AI bias is missing data, which may exclude disability. Emily Ackerman, a wheelchair user, encountered a delivery robot parked in a curb cut while she was crossing the street, and she needed the robot to move. People who were walking could choose not to use the curb cut; Emily relied on it. It was a dangerous situation.
This scenario shows that if AI does not recognize some people, it can respond in inappropriate and potentially unsafe ways, with downstream impacts.
A lot of these use cases have the potential for real benefit. But they need to be able to recognize and respond to diverse people, things, and public infrastructure.
Challenges of AI for Accessibility
Researchers encounter several challenges when trying to prevent these biases. One is that performance inconsistencies may interfere with access: machine learning makes guesses based on training data and learned associations, which can produce unpredictable outputs.
An example of this is a woman who is blind using an AI tool when her daughter wanted to wear yellow tights. The tool announced the pair as orange, a color her daughter doesn’t even own. The user had to rely on past knowledge to deduce that “orange” meant yellow.
The next challenge is acquiring inclusive data. Finding enough data is hard because disabled people are underrepresented in datasets, and it takes intentionality to ensure they’re included. Additionally, disclosing personal information, like a disability, can lead to negative consequences.
Moreover, people with disabilities are a heterogeneous group and use inconsistent terms to describe their experiences. Some people use the term “disability”; some don’t. People who use the same disability terms may experience them in different ways. This makes it tougher to rely on labels alone.
Synthetic datasets are another issue in inclusive data collection, because they may encode disability stereotypes. Yet collecting authentic data requires a lot of effort from participants, and can be fatiguing and potentially unethical.
The third challenge is accessible verification. Researchers evaluated how much blind people trust alt text, and the study showed they tend to overtrust it. The researchers showed a participant an image of Hillary Clinton on stage, for which the AI generated the description: “I’m not really confident, but I think it’s a man doing a trick on a skateboard at night.”
Despite the stated low confidence, the participant said he would have shared the image on social media thinking it was a photo of a skateboarder, and that he would have gotten in trouble with friends if he had, because the image was political.
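One mitigation the example hints at is surfacing the model’s confidence to the user instead of presenting every caption as fact. This is a hypothetical sketch; the function name and threshold are assumptions, not any particular app’s design.

```python
def present_caption(caption: str, confidence: float,
                    threshold: float = 0.7) -> str:
    """Prefix low-confidence captions so users can calibrate trust.

    `threshold` is an illustrative cutoff, not a standard value.
    """
    if confidence < threshold:
        return f"Low-confidence guess: {caption}"
    return caption

# The skateboard description from the study was a low-confidence guess.
print(present_caption("a man doing a trick on a skateboard at night", 0.3))
```

Making the uncertainty audible in the alt text itself matters because a screen reader user cannot glance at the image to double-check the caption.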
Here are some things that you can do to help maximize the benefit and address some of the aforementioned challenges. The four steps are as follows:
- Advocate for inclusion through the entire development pipeline
- Identify risks
- Partner with communities
- Set expectations
At the start of a project, identify the risks. Who might be impacted by this application? What’s at stake? Does it use data about humans? Push for more data diversity. Ask good questions.
Select any of the bullets to jump to the topic on the video.
- Examples of applications of AI in accessibility
- Forms of AI bias
- Challenges of AI in accessibility
- Recommendations to maximize benefits and address challenges with AI
- Q&A with Cindy and Shari
Watch the Presentation
Resources
- A communication tool for people with non-standard speech
- Ableism And Disability Discrimination In New Surveillance Technologies: How new surveillance technologies in education, policing, health care, and the workplace disproportionately harm disabled people
- Accessibility and the Crowded Sidewalk: Micromobility’s Impact on Public Space
- AI and Accessibility (not open access)
- AI Fairness for People with Disabilities: Point of View
- Android Reading Mode app
- Artificial intelligence and disability: too much promise, yet too little substance?
- Automatic Speech Recognition of Disordered Speech: Personalized Models Outperforming Human Listeners on Short Phrases
- Considerations for AI Fairness for People with Disabilities (not open access)
- Designing AI applications to treat people with disabilities fairly
- Disability, Bias, and AI
- Ethics Guidelines for Trustworthy AI | FUTURIUM | European Commission
- How technology can help break communication barriers: Project Relate
- Google’s Lookout App: Your Android Swiss Army Knife
- “It’s Complicated”: Negotiating Accessibility and (Mis)Representation in Image Descriptions of Race, Gender, and Disability
- LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia
- My Fight With a Sidewalk Robot by Emily Ackerman
- Personalized ASR Models from a Large and Diverse Disordered Speech Dataset
- Project Relate: A Communication Tool A11yNYC recap
- ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users | Makeability Lab
- Responsible AI Practices
- Responsible AI Toolkit
- Revisiting Blind Photography in the Context of Teachable Object Recognizers
- Sidewalk Toronto and Why Smarter is Not Better
- SIGACCESS: October 2019 newsletter with eight position papers on AI bias and accessibility
- The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation
- University of Illinois joins five technology industry leaders in new Speech Accessibility Project
Dr. Cynthia (Cindy) Bennett is a Senior Research Scientist in Google’s Responsible AI and Human-Centered Technology organization. Her research concerns the intersection of AI ethics and disability. Her work has received grant funding from Microsoft Research and the National Science Foundation, and eight of her peer-reviewed publications have received awards.
Cindy is a disabled woman scholar working in the tech and academic sectors, and she regularly volunteers to continue raising the participation of people systemically excluded from STEM. Cindy is a Brooklyn resident who enjoys building her tactile art collection, listening to jazz music, and talking, perhaps endlessly, about geography. Follow her on Twitter @clb5590.
Dr. Shari Trewin is an Engineering Manager at Google, leading a team that develops new assistive technologies and features. Her background is in research, with 21 patents and 70 peer-reviewed articles on topics including AI fairness, accessibility tools for designers and developers, web accessibility, access to virtual worlds, and self-adaptive input devices.
Shari is a Distinguished Scientist of the Association for Computing Machinery (ACM). Outside of work, she enjoys volunteering at a therapeutic riding center; finding new ways to reduce, reuse, and recycle; hiking; camping; and anything that involves rocks.