Alt Text Accessibility: Balancing AI and Human Oversight

Image Description: Illustration of Thomas and Ken at a desk with A11y Insights. Thomas has a laptop in front of him. A city skyline is in the distance behind them. The news window shows img alt code with the Scribely logo.

At a time when visual content drives engagement and shapes brand identity, accessibility often becomes an afterthought. Yet for the millions of users who rely on assistive technologies, accessible visual content is a necessity. To bridge this gap, companies like Scribely are pioneering innovative solutions. By specializing in high-quality descriptions for photos, videos, and audio content, Scribely empowers organizations to meet accessibility standards while maintaining their brand’s unique voice and visual impact.

This episode of Accessibility Insights unpacks the challenges of ensuring alt text quality, the nuances of workflows prioritizing accessibility, and the potential legal and reputational risks of overlooking this vital aspect of digital media. It explores using AI as a powerful tool for scalability while maintaining human oversight and contextual accuracy.

Intro

Thomas Logan: Hello, everyone. This is Thomas Logan from Equal Entry here with Ken Nakata of Converge Accessibility. In this episode of A11y Insights, we’re excited to have Caroline Desrosiers and Erin Coleman from Scribely. Scribely is a company dedicated to enhancing digital accessibility by providing description services for images, videos, and audio content.

We’re excited to speak with them about how their company delivers the highest quality accessibility solutions and how they are navigating today’s business world with artificial intelligence solutions.

Ken Nakata: Greetings, Caroline and Erin! What is a typical client for Scribely, and how do they get started working with you?

Caroline Desrosiers: Well, thank you so much for having us on the podcast. A typical client for Scribely could be any number of companies working with media. We work with media producers tackling content at a large scale.

So, images, videos, and audio, where that media is a core part of their business. So, definitely scale, and then also brands and businesses that are very focused on creating a strong visual presence on the web. Those are the types of companies that stand out immediately. They take their images, their videos, and really all of their media content very seriously.

And it’s a core part of who they are as a brand and how they reach their customers.

AI, Accessibility, and Workflows

Thomas Logan: That’s great to hear. I imagine there are quite a few companies out there with large image libraries, because so much of how we operate on the web and with technology is to look at images before we make decisions about things we want to buy or do, so that completely makes sense. How would you describe the ideal workflow for a client you work with at Scribely to produce quality descriptions for their content?

Erin Coleman: Great question. The first part of our process is always to understand what our client wants to do with these images. What’s the impact these images will have within an experience or within their data systems?

So oftentimes these images are being utilized as part of the customer journey. They’re helping customers complete tasks. They’re helping customers make decisions. They’re helping customers get information. Similarly, images can be used as a data input. So, whether you’re training a new AI model or you’re trying to assess your business outcomes, an image can be an important part of your data story and your data knowledge sets.

So, our first step is to understand that scope. Our second step is to understand their current processes and systems: where this information exists today and where it is being stored or worked on. After we get a picture of the value they need, what their intended output is, and how they’re managing that today, we prep the data.

So, we look at the images and begin to understand which image descriptions, such as alt text in the case of accessibility, are important to draft as part of the writing process. So, we prep the data. That data is then funneled into the writing stage, which is done either by humans or assisted by AI, using AI as a drafting tool within the writing process. And once that’s completed, there’s review: feedback cycles to understand whether this image description data is valuable and whether it will create the impact we identified in that first step. And lastly, we help them prepare it for distribution.

So, what’s interesting about image description metadata is that it can live in a lot of different digital places and be used in a lot of different ways. So, we help them get prepped for that delivery. And finally, we provide ongoing support. This data exists beyond this process.

It’s helping people get experiences. It’s helping AI develop its knowledge. So we always try to remain an ongoing input into that process to guarantee quality and accuracy.

Caroline Desrosiers: The workflow question is really interesting because this is where we’re seeing accessibility really break down. When we start talking to our clients about where do you source your images from?

Where do you store them? How do they pass through your systems? Which teams are involved in actually working with those images? It turns out there are a lot of questions, really. And in mapping that entire process, what we found is incredibly complex. Accessibility is falling through the cracks at many different stages of the workflow.

So Scribely has become a company that helps organizations solve those problems of where images are falling through the cracks, and all along the way, as Erin was saying, helps them view this important metadata, this alt text, as something that can actually benefit their business in the long run.

Alt Text Quality

Ken Nakata: Oh, thanks. I’ve always believed that alt text is a lot more complex than people assume it to be. And oftentimes in accessibility, it’s seen as more of an art than a science, so it’s hard for people to get their alt text right. A question I had, and maybe you could elaborate on it for the audience: why is it important to monitor quality and accuracy with alt text?

Caroline Desrosiers: This is key to understanding alt text in general. The Web Content Accessibility Guidelines (WCAG) tell us that for all non-text content, we need to create what’s called a text alternative, and the short form of that is alt text. And that text alternative needs to serve the equivalent purpose.

So, these words that we’re choosing to describe images actually need to take the place of that image on the page so that they can be used by any number of assistive technology users out there. So thinking of quality and accuracy, that’s of the utmost importance. And alt text has been a requirement for over 25 years at this point.

Images have always needed text alternatives. So, we’re really focused on what makes a great description. A lot of the time that comes down to context, of course. But what does that mean? Context can take many forms. Context, the way we define it, is really everything surrounding the image on that page. It’s part of the experience. When you’re navigating through a website, you’re experiencing text, you’re experiencing visuals, you’re experiencing the brand.

And all of that contextual information needs to be taken in, in order to actually produce a quality text alternative. So, really focusing on that word “equivalent” and how we achieve that is an interesting topic in and of itself.
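To make that concrete, here is a small, hypothetical illustration (the product and copy are invented): the same photo can call for different alt text depending on where it appears, because the text alternative has to serve the same purpose the image serves in that spot.

```html
<!-- Hypothetical example: the same photo, two contexts, two text alternatives. -->

<!-- In a social post, the image's job is to stop the scroll: -->
<img src="trail-jacket.jpg"
     alt="Hiker in a bright red rain jacket laughing in a downpour on a mountain trail">

<!-- On the product page, its job is to answer buying questions: -->
<img src="trail-jacket.jpg"
     alt="Red waterproof shell jacket with an adjustable hood, zippered chest pocket, and reflective trim, shown from the front">
```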

Erin Coleman: Yes. And I would add that, at a data level, the image description, the alt text, is the data describing the content of the image, and that data and that image exist together in the ecosystem of usage.

So, whether it’s being used for accessibility purposes with a screen reader, to train AI, or in business analysis, if the data about the image is incorrect, then it doesn’t have any value. It’s null. It’s not supporting the value of the image. So that’s also why it’s really important for it to be accurate: its intention is to tell us about the image itself.

We often say we’re image librarians of the internet, because our responsibility is, in a lot of ways, cataloging the information that needs to live in association with this image for all its purposes, so that you can have access to the image in ways beyond just seeing it with your eyes, seeing it with your AI model, or coming across it in a visual landscape.

Using AI to Create Text Alternatives

Thomas Logan: Thank you. That is great. I really understand what you’re saying about the importance of equivalence for alt text. And I think a fear that we have going into 2025 and the future is people looking at AI to just completely solve this issue for themselves. So what would be your message to a company that is saying, “Oh, I’m just going to use AI to generate text alternatives.”

What would you tell them about how they should be thinking about this area, if they believe they can use AI alone?

Erin Coleman: Well, I would refer back to the workflow. If you look at the workflow, AI is typically being used today to draft the alt text, but that’s only one stage. There’s still: what is the purpose of the image?

How is the image stored? What’s the context of the image? There’s the drafting and writing part, but then what’s the feedback? What’s the refinement? How do we know this is quality? So really, AI or human, the process needs to stay the same. If you’re choosing to incorporate AI, it’s a scalability tool, right?

You can oftentimes use it as an efficiency play in alt text. You can get more alt text written in a given period of time than you might with just human writers.

However, you have to ensure that you have a way in place to review and determine whether the alt text that has been drafted is accurate, consistent, complete, and not inappropriate, because our clients are putting that information out into the world and into their data systems, and that’s a form of guarantee. So, if your alt text is bad or incorrect or offensive, then the impact that’s going to have on your customer experience or your business capability is bad.

If you’re choosing to use AI as a writing tool, you have to have processes in place to be able to review it so you can guarantee its outcome, you can guarantee its product, just like you would in a human writing process.

Quality assurance is part of the image description and alt text game. And that’s true regardless of whether it’s AI or humans or both together writing that alt text.

Caroline Desrosiers: The way that I’d like to answer this question is to get everyone to think about the different contexts that surround images.

Because that’s where we’re really seeing AI struggle to attach that context to a meaningful description. In May 2024, Scribely released an e-commerce report where we looked at every stage of a potential customer’s journey, starting at social media and the images that were presented to customers there.

Then we followed the breadcrumb to a landing page that was advertising that particular product and checked the alt text there. And then we went to the product page, the point of purchase, to review the quality of the alt text. All along the way we were checking for quality, and we were seeing the experience completely break down at every stage.

There was a lot of AI-generated alt text used on social media, for sure, and those were very generic descriptions. They didn’t really grab you the way these images are meant to grab you. The images are meant to stop the scroll, but the alt text kind of missed the mark with a bland description that didn’t reflect the brand’s message.

And then we go to the landing page, and we’re seeing alt text that’s either very basic or missing. So once again, you’re letting your customers down at that stage too, when they’re trying to learn about your campaign. And then the worst case was the actual product page.

That’s where we see maybe up to 15 images that are highly descriptive of what this product is and what you can expect as a customer purchasing it. Those images answer questions. They help empower the customer to choose whether they want to buy it. And at this stage, we were seeing an overwhelming amount of formulaic alt text, meaning some sort of formula was applied: product title, plus size or color, plus the number of the image in the sequence, equals the alt text.

This is what we’re talking about. We cannot have that. It’s not meaningful. I think we can all agree it does not provide a text equivalent for the image. While AI can be helpful, we have to ask ourselves: is it really handling context properly here? And are we happy with the quality and the customer journey at every stage of this process?
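As a rough sketch of the formulaic pattern Caroline describes (the product and copy here are invented), compare a template-generated description with one that actually stands in for what the image shows:

```html
<!-- Hypothetical example of formulaic, template-generated alt text: -->
<img src="oxford-shirt-03.jpg"
     alt="Classic Oxford Shirt - Blue - Medium - Image 3 of 15">

<!-- Versus alt text that works as a text equivalent for this photo: -->
<img src="oxford-shirt-03.jpg"
     alt="Close-up of the shirt collar showing mother-of-pearl buttons and contrast stitching along the placket">
```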

Legal Risks of Using AI for Text Descriptions

Thomas Logan: I want to say, too, that I love the conversation about the same image being used in different contexts. I don’t think we’ve seen that in a legal case yet, but from listening to you all, it’s clearly really important. As you mentioned, depending on where that image is served, whether on social media, on the homepage, or on the product page, it’s the same image, but the intent changes, and I really get that from the work that you all do.

I would say I have not seen that thought process so far in our legal system, because I think we’re mostly at the stage of “this image is unlabeled, that image is unlabeled.” But as you get into a more sophisticated area, and this is why I love this conversation, I think you all are really making accessibility in this regard more advanced, and that’s what we’re always seeking to do as people working in accessibility. So, it’s great.

Ken Nakata: It sounds like most of the things we’ve been talking about are business risks or customer-related risks in terms of your reputation. But between Thomas and myself, I’m the lawyer of the group, so I naturally gravitate toward legal issues. In that context, what legal risks do you see in relying solely on AI to provide text descriptions for visual, audio, and video content?

Caroline Desrosiers: I think that we have to ask ourselves, “What is a satisfactory image description and what is not?” And I think that we could agree that a file name that just includes a nonsensical string of letters and numbers is non-descriptive, right? But then, let’s take AI. How can we decide whether this is a descriptive alt text or not? And that’s where we’re really seeing the debate occur because we are lacking a quality framework to determine what is a good and a bad description. And frankly, we need to debate that topic and we need to discuss it. There needs to be more research on this.

Certainly more research to actually provide that information, because otherwise, if we’re only looking at whether alt text is there or not, we haven’t gone far enough. So, I think more work needs to be done to define what is a good quality alt text description that passes this WCAG text equivalent requirement, and what is not.

Erin Coleman: And I would add: how is AI being used in the alt text creation process when you think about legal risks? Because we see AI being used right now as a technology that runs alt text completely on its own, from start to finish to delivery. And that, in today’s world, is risky, because AI cannot manage that workflow.

It can contribute to that workflow at certain moments to address certain needs, as I previously mentioned, like scalability, which is a real alt text issue. Historically, there are many images in the world, and they certainly don’t all have alt text.

However, if it is not in a workflow where it’s receiving feedback or there is somebody checking to guarantee its work, then I would imagine, and I am not a lawyer, that you’re running many legal risks, because you’re putting unguaranteed information out into the experience of your business or your service. And it is guaranteed that AI does hallucinate and does draft things that I don’t think a business’s legal team would want in their alt text. So, you would have to tell me what that means in terms of legal gray areas.

But if I were the CEO of a company talking to my general counsel, I would be concerned about an unguaranteed AI employee of mine, if you will, just aimlessly shooting information out into the ether. That seems risky, and that’s essentially where we are today when it comes to AI as an alt text capability, from the writing perspective and the production perspective.

Ken Nakata: Well, I think the legal risk ultimately comes down to whether a person is able to access the goods or services, or an experience, that’s equivalent to others’. And often that’s in the eye of the beholder. But I would agree with you that relying on AI in its current iteration, or its current incarnation, is really risky.

Maybe some day in the future it could get better. Thomas and I actually had a conversation about this a couple of months ago, and I came up with this term, mutable alt text. The idea is that if a person was just scanning down a page and wasn’t really concerned about the image, it wouldn’t tell them much about the image.

But if they really wanted to drill into it, they could get more detail, and more detail, and more detail, depending upon their needs. That kind of world would be ideal if the AI was perfect, but yeah, we don’t live in an ideal world.
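There is no AI behind it, but one rough way to approximate that layered, drill-down idea with HTML that exists today (the chart, file name, and copy below are hypothetical) is to pair a short alt attribute with an expandable long description:

```html
<!-- Hypothetical example: a short alt for readers who are just scanning,
     plus an expandable long description for readers who want more detail. -->
<img src="q3-sales-chart.png"
     alt="Bar chart of Q3 sales by region">

<details>
  <summary>Full description of the Q3 sales chart</summary>
  <p>Sales grew in all four regions. North America led with 42% of revenue,
     followed by Europe at 31%, Asia-Pacific at 19%, and Latin America at 8%.
     Asia-Pacific had the largest quarter-over-quarter gain.</p>
</details>
```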

Caroline Desrosiers: And I have a question for you. Have there been any cases that have come down to the quality of alt text? Because alt text is commonly mentioned as one of the top five issues in claims against brands and businesses: “You are violating this WCAG requirement, so this is the reason you’re receiving this demand letter.” And I’m curious if that’s ever made it all the way to an actual discussion of quality.

Ken Nakata: Not really. In all the cases that I’ve seen, alt text is almost always mentioned as an allegation in a complaint, but the courts never really get into the real quality of it. For most courts, it’s either there or it’s not on the majority of images.
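That presence-only view is easy to picture in markup. In the hypothetical examples below, a simple automated check flags only the first image; the others pass, even though only the last one gives a reader a real equivalent.

```html
<!-- Flagged by a presence check: the alt attribute is missing entirely. -->
<img src="IMG_4521.jpg">

<!-- Not flagged: alt is present, but it tells the reader nothing. -->
<img src="IMG_4521.jpg" alt="IMG_4521.jpg">
<img src="IMG_4521.jpg" alt="image">

<!-- What a presence check cannot judge: whether the text actually
     stands in for the image in this context. -->
<img src="IMG_4521.jpg"
     alt="Two hikers crossing a rope bridge over a river gorge">
```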

Thomas Logan: But I think this is the future, right? We see the sophistication coming as this field continues to make progress. There are the lawyers that took this sort of blanket approach of, “I’m going to use an automated scanning tool, I’m going to scan hundreds of websites, and I’m going to allege a violation because I know they have no alt text; it’s an easy thing to put in a complaint.”

But we’re getting into a world where people understand that, and where someone could have AI-generated alt text fill everything in: “Okay, I just want alt text put on every image on my page.” That’s where I feel like we’re going to get to this point very quickly.

Just because you put some text into your page does not mean that it makes sense. I feel like if we did that test case and generated AI alt text for any of the pages you probably work with, you would immediately have a user saying this doesn’t make sense, or it’s way too verbose, or it’s just not logical or not consistent. It seems that’s what we would see.

Erin Coleman: Yeah. And if you’re going to use AI to draft alt text, you need a feedback plan, because that’s how writers, human or AI, get stronger, and that’s how you guarantee quality. So, you can use AI; you just need a workflow and a feedback plan to determine whether, in fact, that output is doing what you’re saying, Thomas, what we’re all saying: is it acting as an equivalent?

Is it acting as a representation of the content in the image? And that feedback, also, if you do choose to use AI, is how AI models learn. So, it’s important for the ecosystem of AI as well, just as it’s important for the ecosystem of human writers to grow and to become better at their skills.

Improving Processes

Thomas Logan: How do you all receive feedback at Scribely, and how do you improve your own processes so that the text alternatives your human writers are generating get better?

Erin Coleman: Well, for one, Scribely is a very customer-centric, end-user-focused company, so we get feedback from the end user: with this alt text and this image, are you able to complete your goal?

As this alt text is embedded in this customer journey, in this context, with this intent, can you do what you set out to do? So, we believe that customer feedback or user feedback is paramount, and we always incorporate that back into the writing process. You can do that in a variety of ways.

Usability testing; having partners you work with who are going to be using the images in a screen reader context, people who need alt text to do its job. Also, when we work with our large-scale clients and help them develop their processes, there’s a way to use, again, technically driven feedback cycles to inform your writing process.

So, there are ways to get feedback about your AI writing and your human writing at a technical level that can feed back into your drafting process. And we do work with clients who are using AI to assess that and gather data, inputs, and markers of success to help their AI tools or their human writing teams become better at writing and implementing their alt text so that it achieves that customer usability goal.

Caroline Desrosiers: We encourage our clients to stay very close to their customers on this in particular. Even when we’re crafting our own proposals, we put in that there needs to be feedback directly from users and that this needs to be a part of our project together.

In addition to Scribely actually collecting that feedback from the disability community about our own work, we’re also encouraging that good practice with clients and their customers, because that’s really what it’s all about. We want them to actually understand this in a different way and see the direct impact of what their images are producing in terms of an experience for all of their customers.

Erin Coleman: And Ken, to answer, I think, one of your earlier questions about processes: the products that Scribely has built and our internal workflows have feedback cycles and levers to pull, as well as flagging systems. We do have a product where AI is embedded into the writing flow. We can use it. We can choose not to use it.

We can have writers flag that they need help with an image. We can have people indicate, “Oh, a note, this was confusing, I need help with this.” So, that’s not necessarily consumer or customer input, but there are inputs and feedback portions of our workflow processes that also contribute to accuracy and quality, and to that end customer experience where the equivalency of the data allows them to succeed.

Thomas Logan: What a great conversation. Thank you both for being here from Scribely today. For our audience, we’d love to hear from you. Let’s continue the conversation. Please add your comments and we’ll be happy to respond. Thank you so much for your time, and we will see you in our next episode.

Do you need accessibility auditing and a VPAT?

Is it time to update your VPAT / ACR and WCAG conformance statements? We can help. We do accessibility audits and VPAT reviews. If you’re not sure about these or want more info, contact us.

