Response to Serotek's mirror driver document
by Doug on Wednesday, March 14 2007

Recently, a document titled "The Facts About Mirror Drivers in Assistive Technology" was brought to my attention. This document was co-authored by Matt Campbell and Mike Calvo from Serotek. Unfortunately, the document was written as an advertising piece, interspersing fact with fiction, smoke and mirrors. It has gained some attention because of the scare tactics employed. I plan on using my experience (over 25 years' worth) in the assistive technology community, and a strong relationship with Microsoft over most of that time, to separate the fact from the fiction.

There are so many things that caught my attention while reading this document that it is hard to know where to begin. I hope to cover all the issues. However, the biggest thing that caught my attention was the omission of the technique actually used by System Access. Knowing a great deal about screen readers, I made a guess at what they must be doing. We verified this using a demonstration version of their software. After confirming my guess was correct, I now understand why no specifics were mentioned. They used terms such as “emerging alternative techniques”, which of course sounds impressive. I want to start by examining the techniques they are using and then compare them with what serious AT products are using with Microsoft’s full support.

System Access is not using an emerging technique. In fact, the technique they are now using was used by Window-Eyes 1.0 under Windows 3.1x and even by Window-Eyes 5.5 under Windows 9x/Me. The approach is called API hooking. Basically, the application hooks or patches Windows itself on the fly: when the application is launched, it hooks into the operating system. You first find, for example, the location where an application would send text to be displayed on the screen. You then insert a bit of your own code at that location, which causes execution to temporarily jump to your own code first. This allows you to see the information before Windows gets a chance to fully process it. Once you've had your look at the data, you simply call back to the original location you hooked into.

To call this an emerging technique and imply this is the future is completely false and deceptive. In fact, I remember reading a white paper written by Microsoft’s Greg Lowney back in the late 80’s discussing this exact technique. Greg was the sole accessibility person at Microsoft at that time. He, along with other Microsoft engineers, came up with this approach in order to make Windows 3.1x accessible to screen readers. As mentioned, we used API hooking for all of Windows 3.1x, 95, 98, and Me. Window-Eyes never used a display driver for those operating systems, though the document implies all screen readers did.

Not only is API hooking not a new technology, it has many issues, which is why Window-Eyes is moving away from it. Microsoft is also pushing AT products to stop using API hooking, mainly because of security. Let me outline some of the issues:
- Microsoft has only put up with API hooking because there was nothing better. They obviously don’t want applications to hook into their operating system. Microsoft has, however, been working on long term solutions such as UI Automation (UIA). I’ll talk a bit more about UIA later. But it is interesting that Microsoft is pushing hard to stop API hooking, yet the Serotek document states API hooking is a state of the art, long term solution.
- Obviously hooking into Windows code itself can make your system unstable. You are basically rewriting Windows code on the fly.
- Oddly enough, API hooking is a technique that spyware and rootkits use to infect your system, so as anti-malware software gets smarter it will start to target your assistive technology.
- API hooking won’t work on a system that’s properly secured against unauthorized writes to code segments. Again, Microsoft's long-term goal is to disallow hooking because of stability and security issues.
- API hooking depends on Microsoft not switching to new technologies that dynamically rewrite function prologues and epilogues to deter stack-smashing and other attacks. Security is at the top of Microsoft’s priority list.
- API hooking makes no provisions for co-existence with other applications also dependent on API hooking. The document spent a great deal of time criticizing DCM on co-existence, yet made no mention of co-existing with other API hooking applications. Loading and unloading API hooking applications in the wrong order can easily make your system unstable. This is not a long term solution.
- API hooking could very easily disable HD video content, since Microsoft could choose to interpret nonstandard access to code segments as hostile activity, typical of the applications used today to hijack protected Windows Media content. Again, API hooking is a huge security hole.
- API hooking can miss graphical information that is generated and presented entirely at the kernel level (this is a much lower level than API hooking allows). Serotek just lucked out that all of that content is accessible through MSAA and other more advanced technologies that wouldn’t be there if AT vendors hadn’t been working with Microsoft a decade ago to move away from antiquated technology like API hooking.
- Off Screen Model (OSM) – This was well described in the original document. Techniques used to populate the OSM are API hooking and display drivers (whether using DCM or mirror). Window-Eyes uses DCM for Windows 2000, 2003, and XP and uses mirror drivers for Windows Vista. API hooking is used as sparingly as possible as we continue to wean off a dying technology.
- Win32 controls – Window-Eyes has rich support for talking directly to standard Windows controls such as buttons, listviews, and treeviews. Applications that use standard Windows controls, or subclass them, will work extremely well.
- Microsoft Active Accessibility (MSAA) – This is a standard application programming interface (API) that was designed by many assistive technology companies along with Microsoft back in the '90s. This was a good solution at the time but had limited expandability making it difficult for today’s applications to make themselves accessible exclusively with MSAA.
- IAccessible2 (IA2) – This is an open source extension to MSAA. IBM spearheaded the development of this approach. It is very new and has lots of potential. Window-Eyes 6.0 ships with IA2 support. This support will grow as more developers implement this technology.
- UI Automation (UIA) – This is Microsoft's replacement for MSAA. Where IA2 is an extension to MSAA, UIA is basically a new approach that is extensible. Although it is a new approach, Microsoft has made it relatively easy for application developers to switch from MSAA to UIA. Currently Window-Eyes does not support UIA, and to my knowledge no screen reader does. This will certainly change as this powerful approach gets used in newer applications.
- Document Object Model (DOM) – This is basically an interface into an application itself. It typically offers a very rich interface giving very detailed information regarding the data. For example, the Window-Eyes Office support uses the Word, Excel, PowerPoint, and Outlook DOM. It also uses the IE DOM. However, not all applications have a DOM, and some do not make it available outside their own application. Window-Eyes also can’t support every unique application's DOM, so only major applications like Office and IE were considered.
- Hack – Because Window-Eyes is expected to work with all applications and all operating systems, if the above techniques fail for one reason or another then we must come up with some other approach. Whether we hack or not, we always work with the manufacturer to fix the accessibility hole correctly so such hacks aren’t necessary in the future. We go with the hack approach kicking and screaming.
- The document stated “…it is highly unusual for display drivers to be given the information that assistive technology products need. For example, it is very rare, and even considered strange, that a display driver would receive the text to be rendered to the display as text. However, this happened to be the case with Windows.” - My response: There's an actual advantage to providing textual information, because high-end display adapters can do hardware TrueType rendering. It’s not a historical accident, it’s by design. (Though actually, most text output these days shows up as glyph indices, not as text.) In fact, those DDI calls still happen at some level, even with Aero on, for non-WinFX apps. They just go to a software driver that Microsoft didn’t want us to chain to for reasons of security and stability. For WinFX apps, there’s UIA.
- The document stated “Luckily for the entrenched assistive technology vendors, mirror drivers are still based on the Windows 2000 display driver model, so they still receive the level of information that assistive technology products need.” – My response: Microsoft actually improved mirror driver support in Vista to the point where it’s actually used by AT software, because they themselves realized that there are still lots of products that need that level of information. And not just assistive technology, either: Microsoft’s own remote desktop uses that “old” display driver model when dealing with GDI-based applications, which is to say most applications, because it’s more efficient than trying to squirt a bunch of bitmaps through a wire.
- The document stated “Direct3D is apparently available, but it might be limited while a mirror driver is active. After all, the Desktop Window Manager, which uses Direct3D, is disabled while a mirror driver is active; perhaps DWM is being disabled because the mirror driver introduces some limitations in Direct 3D.” – My response: This is pure speculation, and probably not true. DWM is disabled because the classical uses for mirror drivers don’t work efficiently with pure bitmaps, as was more or less already said. This is just to introduce an element of fear without any factual information to back it up.
- The document stated “Clearly, Microsoft’s primary reason for retaining the Windows 2000 display driver model in Windows Vista was to maintain compatibility with existing hardware. However, as hardware advances, Microsoft will probably want to eliminate the Windows 2000 display driver model entirely in a future version of Windows.” - My response: No, it was to maintain compatibility with those thousands of GDI-based applications people use every day. Even under the Vista driver model, there’s still a stub driver that implements DDI, because those GDI apps are going to be around a lot longer than your old video card. Microsoft clearly has no problem with expecting you to buy new hardware, and most of their OS sales are for new systems anyway.
- The document stated “While no currently known applications require DWM to be enabled, it’s conceivable that future applications wishing to take full advantage of Windows Vista’s much-touted new features, will require DWM. Furthermore, it’s likely that Microsoft will focus future user interface developments on DWM and pay much less attention to the classic, pre-DWM windowing system.” – My response: Again, a scare tactic based solely on speculation. But if turning off DWM is supposed to be so detrimental to proper operation of Vista, how do they explain the fact that some versions of Vista itself disable DWM all the time?
- The document stated “The entrenched assistive technology vendors make the seemingly reasonable assertion that disabling Windows Aero does not cause any loss in functionality but only changes how the desktop looks, which is irrelevant to blind users anyway.” They go on to give two reasons this is false: first, that sighted people just aren’t going to comprehend the screen if Aero isn’t enabled, and second, that the lack of DWM seriously degrades the functionality of the system. My response: Yet another scare tactic. Aero is just a theme that you can choose to use or not use. Claiming that sighted users won't comprehend Vista without Aero is like saying that because you prefer to use high contrast black, it would be impossible for a sighted person to recognize and deal with your system. Speaking as a sighted person, this is simply false. I think we need to give sighted people a little more credit here. I would like to discuss this with more low vision users, but my first impression is that the glass effect would be more of a detriment to low vision users in the first place. Aero is a theme, which means it is a choice. If Microsoft were so dependent on Aero, it wouldn’t be optional; Microsoft would require all video cards to support it, and RDP and terminal services would support it.




