Protected PDFs - A Rant and Solution

2007-09-04

Before I begin, let me make something perfectly clear: GW Micro does not condone the act of hacking or circumventing security restrictions explicitly applied to protect content in Adobe PDF files. If an author set a password on a PDF document, they probably did so for a reason, and we're not in the business of defrauding those trying to safeguard their livelihood. With that in mind, on to my rant.

We tout support for protected PDFs in Window-Eyes, so what the heck am I going on about? PDF protection isn't as clear as on or off. Using Adobe Acrobat, when an author makes the conscious decision to protect a PDF document, they can choose to add a password, restrict editing and printing, restrict copying images, text, and other content, and (hold on to your seats) restrict text access for screen readers. Yep, you heard that last one correctly. Adobe provides authors with the ability (pun intended -- you'll know why in a second) to explicitly deny access to assistive technology. This aberration is clearly marked with a check box labeled, "Enable text access for screen reader devices for the visually impaired."

I applaud Adobe for taking the lead in creating accessible electronic documentation by providing access to PDF documents, but I will never understand the inclusion of an option that gives someone the ability to decide whether or not accessibility should exist. That check box should have never been created, and it needs to be removed. Accessibility is something that should not be decided by a flick of a mouse button from the hand of a sighted person who doesn't have the first clue as to why a blind person needs access to a PDF in the first place. Accessibility should not be optional, and that scenario is precisely the reason why I have no objections to providing a solution to access restricted information, assuming that you legally own the PDFs that you need access to.
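To make the "not as clear as on or off" point concrete, here is a minimal Python sketch of how those separate restrictions can be read as independent flags. It assumes the permission bit numbering from the PDF specification's standard security handler (the integer stored in the `/P` entry of the encryption dictionary). The function names and the sample value are hypothetical, made up for illustration, and are not part of Window-Eyes or any Adobe tool.

```python
# Permission bits from the /P entry of a PDF's standard security handler.
# The spec numbers bits from 1 (lowest), so bit n has the value 1 << (n - 1).
# Bit positions below are assumed from the PDF specification's permission table.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
    "fill_forms": 9,
    "accessibility": 10,  # text access for screen readers
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p):
    """Map each permission name to True (allowed) or False (restricted)."""
    p &= 0xFFFFFFFF  # /P is a signed 32-bit integer; view it as unsigned
    return {name: bool(p & (1 << (bit - 1)))
            for name, bit in PERMISSION_BITS.items()}


def print_based_workaround_applies(p):
    """Hypothetical helper: printing is allowed but screen reader access is not."""
    perms = decode_permissions(p)
    return perms["print"] and not perms["accessibility"]


if __name__ == "__main__":
    sample = -3900  # made-up /P value: printing allowed, everything else denied
    for name, allowed in decode_permissions(sample).items():
        print(f"{name}: {'allowed' if allowed else 'restricted'}")
```

With the made-up value above, printing comes back as allowed while screen reader access comes back as restricted. That is the combination where a print-based approach would be relevant, since a document with printing restricted cannot be printed at all.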
Let me make more perfectly clear what I previously made perfectly clear: we are not looking to break the security model of PDF files. We’re not talking about removing passwords, or enabling the ability to modify the text of a PDF. We don’t want you to be able to print when you want to print, copy when you want to copy, or anything along those lines. Protected PDFs are a decent way to protect content, just like password protected Word documents, password protected ZIP files, secure web pages, emails, and so on. We are highly sensitive to the need for security, and even implement our own security models wherever we can.

We are instead simply providing a solution that provides access to text that has been unduly restricted, most likely due to the ignorance of the individual who enabled the restrictive security methods. And, once the process is all said and done, it’s really no different than printing a PDF, scanning the result, and OCR’ing into your favorite word processor. In fact, if the printing security restriction has been enabled, this trick won’t work anyway. I think I’ve disclaimed enough, so let’s move on.

Although there are various means to access protected PDF text (many of them quite actionable if you don't legally own the PDF in question), I'm going to discuss one that uses the Microsoft Office Document Imaging feature available with Microsoft Office 2003 and up. The basic gist of the process involves printing a PDF to the Microsoft Office Document Image Writer, and then using the OCR features of the Microsoft Office Document Imaging application to provide the text to Microsoft Word.

First make sure you have Microsoft Office 2003 installed along with the Microsoft Office Document Imaging feature (which, I believe, is installed by default, at least with the Professional Edition of Microsoft Office 2003). Next, make sure you have either Adobe Reader or Adobe Acrobat installed, which you would need anyway to read non-protected PDF files. Finally, you'll need the PDF file that you can't read through normal Adobe means.

Here’s the step by step:
There are a few things to note about this process.
Although I’ve been discussing this method for use with restricted PDFs, it will also work fairly well with PDFs that contain nothing but images. If you don’t have access to another utility that boasts PDF OCR capabilities, this may be a good solution for you. For example, I took a screen shot of a web page, and created a PDF out of it; the PDF contained nothing but an image of what was on my screen. I ran it through this process, and for the most part, the text on the web page was readable.

PDF files, in general, are very accessible despite their enigmatic stigma. Adobe even provides their own methods of tweaking accessibility settings (e.g. changing reading order, overriding tagged order, etc.). There’s even an Accessibility Quick Check in the Acrobat Reader (and even more detailed Accessibility tools in the full Adobe Acrobat) for examining documents, and reporting problems to the PDF author. Now you have an additional resource when you encounter a not-so-friendly PDF file that doesn’t live up to good accessibility standards.

Do you have any other tips for reading PDF files?

Comments, Pingbacks:
Comment from: manosinu [Visitor]
Screen capturing proves solid but gruesomely tiresome if you want to do it manually. Try www.copistar.com, which will do it automatically and create a printable PDF.
Comment from: web design [Visitor] · http://www.xelonline.com
How about printing the document as a text file as explained here http://sethf.com/infothought/blog/archives/000751.html
Am I missing something here?
I printed the protected document using the (print to file) MS XPS Document Writer which immediately produced a .xps file. I opened the xps file in Acrobat and then saved it as a pdf file. The protection was removed.
Does anyone know of any software that prevents this from working? I'm looking into publishing a book online and want some protection for it. I am looking at www.locklizard.com and it seems to do everything (prevents screen capture, print limits and prevents printing to file). Does anyone know any more about it or about other software?
Comment from: Geno [Visitor]
I'm a college student working late on a paper and spent damn near an hour downloading shitty demo versions of pdf decryptors that only worked on half the document unless I paid $30 bucks. So thanks, you're a life saver!
© GW Micro, Inc. All Rights Reserved.
GW Micro, Inc., 725 Airport North Office Park, Fort Wayne, IN 46825
Ph: 260-489-3671 Fax: 260-489-2608
www.gwmicro.com sales@gwmicro.com support@gwmicro.com
Hours: M-F, 8a-5p, EST