Rich Schwerdtfeger was the CTO for Accessibility at IBM and chairman of the board for Knowbility. He helped develop the first GUI screen reader for the IBM PC. Rich has led accessibility efforts on OS/2, Java, Windows (IAccessible2), Web 2.0 Accessibility (WAI-ARIA), and transcoding middleware for seniors.
How did you get your start in accessibility?
I got my start in accessibility through a software development contract at IBM’s Watson Research Center in Yorktown, New York in 1990. My manager was Jim Thatcher. In this contract, I was asked to create the first offscreen model of the OS/2 Presentation Manager desktop to be read by the new Screen Reader/2 screen reader in development.
A screen reader reads the user interface to a blind or low-vision user. An offscreen model is a database representation of what you see on a Graphical User Interface (GUI), such as the Windows or OS/2 desktop. This would be the very first GUI screen reader for the PC.
At the time, I knew very little about accessibility technology. My background had been software development in the defense and oil industries — worlds away from anything related to accessibility. However, I had extensive OS/2 experience on my last job which helped me get the job.
During that time, the world was switching from DOS character-based user interfaces to Windows, and sometimes OS/2, GUIs. There was no solution on the PC to make a GUI accessible. With DOS, screen readers could access a readily available text buffer (model). In GUIs, that did not exist, and a more complex model needed to be created from graphics engine drawing calls. Like most firsts, it was both challenging and rewarding.
I ultimately figured out how to build an offscreen model of the desktop that included text, its associated window information, icons, and other things that needed to be read to a user by Screen Reader/2. This would become known as an Offscreen Model. Nobody knew we had solved the problem.
In the summer of 1991, a man named Joe Lazzaro wrote an article for Byte Magazine called “Windows of Vulnerability” that talked about the blind losing access to the computer. The fear of this loss was enormous. I saw this and realized that nobody knew we had indeed solved the problem.
I reached out to Jon Udell at Byte Magazine and wrote an article for the magazine called “Making the GUI Talk” that discussed how we solved the problem. The impact was really quite massive. I recall going to my first CSUN conference and feeling like I was Mick Jagger.
A renewed hope had spread across many attendees that they would not lose access to a critical device needed to work, go to school, and communicate — the PC. This was not only my start into accessibility but also my start at becoming an activist.
No other computer job or life effort I had done in the past had made such a meaningful impact on so many. It proved to me that my work could truly matter to people. It became much more than a job. What I received became so much more than what I gave.
You led the efforts to create WAI-ARIA. How did the ARIA solution come about?
Since working on Screen Reader/2, I have been involved with a number of accessibility efforts. One of the biggest, and one that contributed greatly to the Web Accessibility Initiative – Accessible Rich Internet Applications (WAI-ARIA), was Java Accessibility. For Java, we created the first cross-platform accessibility API with Sun Microsystems, and we built the first cross-platform screen reader for Java, called the Self Voicing Kit. This effort happened in the late 1990s.
Fast forward to 2003: David Boloker, the CTO for IBM Software Group’s Emerging Technology effort, asked me to come over from IBM Research to be the accessibility architect for Software Group, which generated over $20B annually for IBM. I had previously worked with David, who spearheaded a number of IBM’s Java software efforts. I did not know why he wanted me until I was asked to attend an Emerging Technology yearly kickoff event in Cambridge, Massachusetts.
The First Web-based Office Suite
During the break, someone asked me to come to a cubicle hidden in the far back of the floor to see an IBM acquisition from a company called AlphaBlocks. It showed the first office suite running in a web browser. This was years before Google Docs was even conceived.
He said this was going to change everything. By that he meant that the whole world was locked into a client platform monopoly built on Microsoft Office and Windows, and that Microsoft was actively trying to use this monopoly to lock customers into its middleware technology built on Silverlight. I recall David stating this would be a “Renaissance in the browser.”
I mentioned to David that the web technology needed to do this required the use of JavaScript and CSS. These were essentially banned by government accessibility legislation all over the world, in that you had to be able to run your web applications with those technologies turned off. The reason was that nobody had solved the “JavaScript Accessibility Problem” in six years. He said yes, and that he needed me to solve it, as our entire software business depended on it. No pressure there.
The JavaScript Accessibility Problem
I went back to Austin and worked on the problem. I was in a unique position: nobody else working on web accessibility had ever looked at OS platform accessibility, and thanks to my work on Java, I had looked at platform accessibility infrastructure across multiple operating systems, something few others had done.
Looking back, I was the only person who had the background to step out of the box and rethink web accessibility. It was not long before I recognized similarities between how web pages in the browser and native software platforms were constructed. I had also given IBM’s Home Page Reader team guidance on using Internet Explorer’s Document Object Model (DOM) to read web pages.
An earlier version of Home Page Reader had used the HTML directly. A web page is converted into a DOM tree structure, just like the window and object tree hierarchies on the desktop. The reason for this, in part, is that mouse and keyboard events have to propagate up the document tree to be processed, just as they do in desktop applications.
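As a rough illustration of that event propagation, here is a minimal sketch in HTML and JavaScript; the element ids and handler are purely illustrative and not from the original Home Page Reader work.

```html
<ul id="file-list">
  <li id="report">report.txt</li>
  <li id="photo">photo.jpg</li>
</ul>
<script>
  // One listener on the container receives clicks that bubble up from
  // any child node, much like a desktop window procedure receiving
  // events for its child controls.
  document.getElementById("file-list").addEventListener("click", (event) => {
    console.log("Click bubbled up from:", event.target.id);
  });
</script>
```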
I quickly discovered that the JavaScript accessibility problem was caused by the inability of HTML to convey the new semantic meaning of a DOM node that had been repurposed, such as an HTML <div> being made to look and act like a checkbox with a state of checked or not checked.
To a screen reader, this would just be a document object with no semantic meaning. If the screen reader did not know what it was, neither did the user. Neither knew that the element had become a checkbox, nor whether it was checked.
I realized that if I could apply a role and state to the object (now referred to as WAI-ARIA semantics), I could map this and the associated text to the platform accessibility Application Programming Interfaces (APIs) on all the operating system platforms, including notifications to assistive technologies when states changed. In the checkbox example, the element would have a role of checkbox as well as a checked state.
Also, the document structure could be used to associate document elements, such as the items that form a listbox within their listbox container. In the case of a listbox, you could also know how many elements were in it. This would enable all web applications to become cross-platform accessible. The web browser just had to map the semantic information to the associated platform accessibility APIs, which assistive technologies already used to read desktop software applications.
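Here is a minimal sketch of what that looks like in markup, using the WAI-ARIA attributes as they were later standardized; the ids, labels, and script are illustrative, not taken from the original work.

```html
<!-- A repurposed <div> given WAI-ARIA semantics so the browser can map
     it to the platform accessibility APIs. -->
<div id="subscribe" role="checkbox" aria-checked="false" tabindex="0">
  Subscribe to newsletter
</div>

<!-- Document structure associates the options with their listbox,
     so assistive technologies can report "item 2 of 3", and so on. -->
<ul role="listbox" aria-label="Fruits">
  <li role="option" aria-selected="true">Apples</li>
  <li role="option" aria-selected="false">Oranges</li>
  <li role="option" aria-selected="false">Pears</li>
</ul>

<script>
  // Toggling aria-checked produces a state-change notification that the
  // browser forwards to assistive technologies through the platform API.
  const box = document.getElementById("subscribe");
  const toggle = () =>
    box.setAttribute("aria-checked",
      box.getAttribute("aria-checked") === "true" ? "false" : "true");
  box.addEventListener("click", toggle);
  box.addEventListener("keydown", (e) => {
    if (e.key === " ") { e.preventDefault(); toggle(); }
  });
</script>
```

With the role and state in place, a screen reader hears a real checkbox and a real listbox rather than anonymous document objects.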
As part of the strategy, I made it clear to management that the technology needed to be open and free of patent restrictions. We also needed to do it in an open-source browser so other browser manufacturers could see how it was done and also to get over another critical hurdle. At the time, Microsoft’s Internet Explorer (IE) browser had over 90% market share and had parked the IE development team in Beijing, China with the task of doing maintenance.
No major open web advancements were picked up, and this was intentional: stalling the web’s growth maintained their monopoly on client applications. My work on Java accessibility enabled me to discover the solution in about a month. However, as with all technology, the hard part is getting the rest of the world to adopt it, and that is never accomplished by one person.
Key contributors to making this happen were Aaron Leventhal, whom we hired to do the Firefox implementation of what was initially called JavaScript Accessibility; Becky Gibson, who built the first working accessible web widget library using the technology; Andi Snow-Weaver, who worked with Becky to remove the restrictions on the use of CSS and JavaScript in the Web Content Accessibility Guidelines (WCAG) and push for government adoption; the ARIA working group, which I chaired for years; the HTML working group; and Judy Brewer and Steven Pemberton, who believed in my crazy idea to solve the JavaScript accessibility problem and supported the work to change existing standards as well as create the new ones that enable the WAI-ARIA solution.
Ironically, we actually borrowed an implementation detail from Internet Explorer to fix critical keyboard accessibility deficiencies on the web. To make web pages keyboard accessible, you needed to use an HTML feature called tabindex, which allows you to put HTML elements in the tab order. Although this allowed you to make web pages keyboard accessible, it also made elements reachable only through the Tab key. You needed a way to make some elements focusable from JavaScript without putting them in the tab order. The IE feature that allowed this was a tabindex value of -1.
Just imagine placing a menubar in your web page and having to tab through all of its menu items to get to the content the menubar applies to. It is better to move to the menubar with the Tab key and then use the arrow keys to navigate within it. A subsequent Tab should immediately exit the menu and move to the next piece of content that logically belongs in the tab order.
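A minimal sketch of that pattern, using the standard ARIA menubar roles; the ids, labels, and key handling are illustrative only.

```html
<ul id="menubar" role="menubar" aria-label="Example">
  <li role="menuitem" tabindex="0">File</li>
  <li role="menuitem" tabindex="-1">Edit</li>
  <li role="menuitem" tabindex="-1">View</li>
</ul>

<script>
  // Roving tabindex: only one item is ever in the tab order; arrow keys
  // move focus, and tabindex="-1" keeps the other items focusable from
  // script without adding extra tab stops.
  const items = [...document.querySelectorAll('#menubar [role="menuitem"]')];
  document.getElementById("menubar").addEventListener("keydown", (e) => {
    if (e.key !== "ArrowRight" && e.key !== "ArrowLeft") return;
    const current = items.indexOf(document.activeElement);
    if (current === -1) return;
    const next = (current + (e.key === "ArrowRight" ? 1 : -1) + items.length) % items.length;
    items[current].tabIndex = -1;  // remove the old item from the tab order
    items[next].tabIndex = 0;      // the new item becomes the single tab stop
    items[next].focus();
    e.preventDefault();
  });
</script>
```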
WAI-ARIA Beginnings
I should point out that WAI-ARIA did more than just solve the JavaScript accessibility problem. WAI-ARIA paved the way for the open web as an application platform and broke one platform’s monopoly on client applications to make room for new browser application platforms around Chrome, Firefox, and Safari. When we started, Internet Explorer had roughly 95% of the browser market and Microsoft completely controlled the client application space.
Ultimately, Microsoft joined the open web accessibility effort, and early on I would have Monday night calls with Linda Mao of the Microsoft IE team in China while Microsoft attorneys were addressing legal issues with joining the World Wide Web Consortium (W3C) WAI-ARIA standards effort.
I still recall one Monday night call when Linda said she could no longer work with me, as the IE development effort was moving back to Microsoft’s main campus in Redmond, Washington. The reason she stated was accessibility. This was the start of making the open web the client platform of choice. WAI-ARIA educated the web community on how platform accessibility worked and how they needed to think beyond alt text to solutions that would be interoperable with assistive technologies.
One of those watershed moments (literally), as an activist, that really sticks with me to this day was when Aaron Leventhal sent our first working interactive, ARIA-enabled tree widget to Tom Wlodkowski, who directed accessibility efforts at AOL. We had been working with Freedom Scientific on JAWS screen reading support for WAI-ARIA in Firefox. I convinced David to hire Aaron to enable WAI-ARIA support in Firefox, as Aaron owned the accessibility modules in Firefox.
I had hired Aaron away from Tom, who was and still is a friend. He was quite unhappy with me, but I told him that within a year he would be thanking me. Tom is blind, and he literally cried when he heard a web page tree widget that was actually usable and sounded just like one on a desktop computer. He told me I was right to hire Aaron. Again, being an activist can be hard, but what you get in return is so much more.
Can you talk about developing the first GUI screen reader for the IBM PC?
I covered this a bit already, but I will try to go a little bit deeper to share my personal experience. This project was headed by Jim Thatcher and the research was done at IBM’s TJ Watson Research Center in Yorktown Heights, New York. I started as a contractor and had no idea what a screen reader was.
I recall my first day, walking up to this ominous-looking dark glass curved structure sitting on top of a hill in the middle of a forest. It was deathly quiet; after all, seminal research was going on there.
I waited in the foyer for Jim to come down and take me up to my office. I looked up to see a door on the floor above fling open, and this much older man ran down the stairs like a 15-year-old, absolutely excited to meet me and get started on this project. To this day I have never met anyone more motivated to help the blind than Jim Thatcher.
Screen Reader for OS/2 Problem
Jim was obsessed with building a working GUI screen reader solution. His key motivation was a fellow researcher, Jesse, who was losing his sight. Jim, who headed the GUI screen reader project for OS/2, spent months at IBM Hursley learning about the construction of the OS/2 desktop user interface and making the correct technical contacts.
He also worked to get funding for the project, which back then was something IBM did because it was the right thing to do. The big hurdle was being able to capture what was being drawn on the GUI desktop and putting it in an offscreen model to be read by a screen reader (Screen Reader/2). There was no roadmap for doing this. I just had to figure it out.
One of my traits is that I can start with nothing and figure it out. I don’t get deterred easily. Kelvin Lawrence, one of Jim’s OS/2 contacts, had created graphics engine hooks for another IBM project that enabled me to go in and intercept the graphics engine drawing APIs in order to create an offscreen model.
These low-level drawing calls, such as drawing text at a position on the screen or drawing a rectangle, were used to render text, blinking carets, highlighted text, window moves, and so on. Now I had the text and other information to recreate a model of the screen (the offscreen model) that included things like text and its attributes (color, position, font, font style, etc.), carets, highlighted text, icons, and much more.
Screen Reader/2 would read what I gave it through an API and present the information to the blind user. That API became my part of the foundation for the Java Accessibility API work with Sun. I should also point out that Screen Reader/2 had its own scripting language, called PAL, that was used to determine what was read to the blind user and could also be used to customize the user experience. This was another first. PAL was developed by Jim and another researcher, David Jameson.
I remember one of the aha moments: the OS/2 control panel (system settings) would crash constantly due to a design issue in my model, and debugging it was a nightmare. I would use the OS/2 kernel debugger to dump diagnostic information out to an RS-232 serial terminal. One morning I figured it out and left the 3.5-inch floppy on Jim’s desk to try out.
Jim worked on the PAL script used to generate the speech interface for the blind. There was an eruption of elation in the room. The solution required a bit of a redesign of how text was merged in the model, but we were on our way. Jim was notorious for his emotional discussions on technical issues.
We actually worked quite well together. I recall the meeting where we had one of those “emotional discussions” and you could hear our screaming and yelling two to three hallways down. The great thing about those discussions was when we left the room we thought nothing of it. We had come up with a solution.
Another big challenge with this project was the code had to be fast and operate at a privileged operating system level without blowing the desktop out of the water. My prior work on embedded real-time system development really helped a lot. Before IBM I worked on embedded real-time systems, namely the F15E Weapon system and a digital signal processing system for the oilfield industry.
My first real appreciation for the importance of the work was when Jim was asked to go to a meeting in Washington, DC in early 1992 to talk about the loss of access to the computer by the blind as the industry moved from DOS to Windows.
Jim recounted going up to the second floor of the government building and seeing stacks of hundreds of copies of Byte Magazine containing my December 1991 “Making the GUI Talk” article. The world knew we had solved the problem, and thanks to IBM, everything would be alright.
Working with Jim Thatcher on the Screen Reader for Windows
Yet, the reality was the world was moving to Windows and not OS/2. Windows GUI access for the blind still needed to be solved. Jim had a way of motivating me by saying that we would not be able to solve this on OS/2. Well, for me that was pretty much all you had to say. At this point, I was working part-time out of my house in Columbia, CT (I used to drive 120 miles one way to work back then).
I went to work on figuring out how to hack the Windows display drivers to capture what was drawn on the screen. I recall one morning, it was 50 degrees in the basement and I had just loaded the woodstove. I had created a fake display driver that would act as a proxy to the actual display driver and started sending messages to the debugging terminal.
I recall being successful when the Windows hourglass showed up and my debug messages started coming out to the terminal. I had solved step one. Jim was in disbelief. The next hurdle was getting the information over to OS/2 while running Windows applications seamlessly on the desktop to populate the same OS/2 offscreen model which Screen Reader could access.
Most communication mechanisms to do this were very slow. So slow, in fact, that it took Screen Reader/2 ten seconds to announce the File menu in a Windows application. Solving this required hacking into the OS/2 source code and reverse engineering an undocumented operating system feature to pass information between OS/2 and Windows.
This was how we had to work back then. I recall Jim going away for Christmas break and coming back to find the performance problem solved. I firmly believe that life sets you on a path you might never expect and all my prior job skills uniquely enabled me to contribute to this project and help make it a success.
I should point out that this was my part of the project. Actually building a production-grade Screen Reader/2 that could be used by the blind and in multiple languages was done by the Boca Raton Special Needs team.
Also, it is important to note another thing about IBM that uniquely enabled us to solve this problem. At the time IBM actually had quite a few blind employees and customers who helped us build a user experience that really made them productive. That is not the case in most companies today.
Finally, if not for IBM Special Needs Systems, we would not have put out the first working screen reader for OS/2 and Windows applications (running on OS/2). We changed the world by making the GUI accessible on the PC and laid the foundation for accessibility technologies such as IAccessible2, Java Accessibility, and WAI-ARIA.
Beyond his amazing technical work, I would like to also add a special thanks to Jim Thatcher as he was the man who got the funding for Screen Reader/2 and the beginnings of IBM Special Needs Systems. Without him, none of this would have happened.
You’re an avid scuba diver who relocated to Bonaire. How does your accessibility experience come into play there?
In Bonaire, I volunteer for a foundation that works to preserve the island’s heritage sites. In addition to IBM, I also volunteered as chair of the board of Knowbility, which worked to break down barriers for people with disabilities. Heading a non-profit is quite a different animal from doing technical work. Helping a non-profit that supports those in need is activism in one of its truest forms.
You learn to work with volunteers, which is far different from working with paid employees; they are not there to receive a paycheck. You also have to work harder to reach out to the community to get help. These skills helped me reach out to local Bonaire people for help learning about the areas we were protecting and for help conveying the issues around developers wanting to destroy heritage sites for profit.
Another thing I did when moving to Bonaire was follow a lifelong dream to be an underwater cinematographer. I started a YouTube channel called A Diver’s Life. Due to my accessibility heritage, I subtitle all the videos in English to reach a broader audience. Yet the native language of Bonaire, Aruba, and Curacao is Papiamento, and I needed to better reach this audience to educate them about the sea.
Unfortunately, YouTube had no Papiamento subtitle support. I reached out to former IBM employee Aaron Leventhal, who had worked with me on WAI-ARIA and now worked for Google, and asked him if he would put in a change request to add Papiamento subtitle support to YouTube.
After a year, this support showed up. I worked with one of the local people I had met while working for the foundation and asked them to add Papiamento subtitles to one of my videos. That video lets people experience our restricted Willem Alexander Reserve, where diving and snorkeling have been prohibited for over 40 years, and helps them better understand why we protect it.
I am now reaching out to government officials to see how we can fund people to work at home and get paid to add Papiamento subtitle support to videos produced by the government and businesses. It’s much like Knowbility’s Access Work project which empowers people with disabilities to do usability testing for companies and augment their income.
In actual fact, subtitles make content more inclusive for all users, and they are a much more cost-effective way to do this than recreating an entire video narrated in every language you wish to reach. It is also much faster at reaching these broader audiences. Nobody on Bonaire has contacts like that, or the awareness of how assistive technology (subtitles) could be used to make information more inclusive for all people on the islands.
What is an accessibility barrier you would like to see solved?
What I would like to see solved is accurate voice-to-text. So many people rely on existing voice recognition solutions to create subtitles on YouTube. These solutions are excellent tools, but they are not the final solution. Subtitles have improved, but they are really only 70% accurate at best, which is not enough to check the box.
Also, existing voice-to-text doesn’t help with command and control, or transcription, for people who have speech impediments due to cognitive issues (Down syndrome), hearing impairments, and so on.
What I have learned being on an island with multiple languages (Dutch, English, Spanish, Papiamento, Chinese, and some Portuguese) is that the need to support multiple languages is a big issue. In line with that, I would like to see language translation capability to and from Papiamento. I understand that it is a smaller market than other languages but communication is one of the biggest barriers we all face.