I recently have been testing the definitive OCR and scanning suite: Pagis Pro Millennium Scanning Suite from ScanSoft.
What, I hear you say: has TNPC suddenly joined the ranks of computer publications that focus on keeping manufacturers and their PR departments happy? Well, although we may or may not be as willing to be seduced as the next guy, in this case there is no real issue. ScanSoft, the Peabody, MA-based Xerox spin-off, pretty much IS the OCR/scanning software leader.
Having previously acquired the PaperPort software from Visioneer (which made the primary rival for Pagis in document management), ScanSoft now has bought out Caere, makers of OmniPage, which battled ScanSoft's TextBridge in the OCR realm. If you need products in this category, they most likely will come from ScanSoft.
For now, the company says it intends to maintain all three product lines. A new version of PaperPort has just been released, and previously announced new editions of OmniPage still are on the market as is PageKeeper, Caere's document manager. Company officials would not be pinned down on their exact intentions, but it looks as if Pagis will be positioned as the mainstream product for end-user and small office business use, while PaperPort will be focused on home use and the Caere line will be aimed at larger corporate users.
Happily, the consolidation looks promising. Typically, the software bundled with a scanner dictates what its owner uses for these purposes. Hence, Visioneer customers got PaperPort, HP customers got software from whatever company had most recently cut a deal with HP, and so on. My last HP scanner came bundled with de-contented Caere software, plus an upgrade offer for the "pro" versions, and so I became a Caere user. Frankly, I was never much enchanted by those products, finding them buggy and clunky. In the end, I yearned for the elegant simplicity of the PaperPort software I had used in the past.
In my testing, the Pagis Pro Suite strikes a nice balance between simplicity and capability. It installs as a sub-folder of your Windows "My Documents" folder, and integrates nicely with Windows Explorer. It has the usual features you want in a document management program, such as the ability to move content to other applications, preview document contents, and do extensive searching. The suite also includes a very good Forms Fill-In utility, which solves the age-old problem: how do you "type" information on a paper form when there is no typewriter in the house? In the examples I tried, it correctly identified where to place the fields that allow the user to "type" in data.
Less critical is a Copier utility, which did an OK job of making direct scanner-to-printer copies, but didn't match the quality of HP's own utility. Kai's PhotoSoap 2, a basic graphics editing program, also is in the package, and ScanSoft earned points with me by arranging the setup routine so that you can easily skip that installation--much appreciated for users who already have preferred graphics software in place.
And--the centerpiece of the bundle--there is the latest version of TextBridge. I was very impressed with it. Accuracy was noticeably better than I had seen with OmniPage. Remember, of course, that "accuracy" for OCR is always a relative term: even very good software will make errors. But TextBridge made relatively few. It also did a good job of passing documents along to Word 2000 and other applications for editing. I particularly liked TextBridge's ability to convert magazine and other publication pages into Word documents or HTML while preserving much of the original layout and graphics. The biggest obstacle here turned out not to be the software, but the current fashions in layout design, which seem to demand ridiculously busy graphics.
The software is not without flaws, especially in the realm of interfaces. For example, since Pagis uses a proprietary file format, it must translate its items into standard formats before it can pass a document along to other software, and you have to struggle to set that intermediary format. Similarly, TextBridge would make me a lot happier if it didn't open another instance of Word every time it sent a document to it, and if it had a simple setting for putting all scanned text into a single font size and style (for when you want pure, raw text entry, free of formatting).
But all in all, this is a very nice package, and one that I use regularly.
http://www.amazon.com/exec/obidos/ASIN/B00004S3C6/tnpcnewsletter

