Open Text Capture Document Reader (DOKuStar)
Open Text Capture Document Reader is document analysis software that automatically classifies digital documents, extracts data and delivers it in structured form (e.g. TIFF->XML). The central processing modules of the Document Reader are Document Extraction, Adaptive Read Technology (ART) and Adaptive Classification Technology (ACT).
Open Text Capture Document Reader Architecture
Open Text Capture Document Reader works with configurable processing chains or pipelines. In a processing chain, a document passes through one module after another. The whole processing chain appears as a single block to the calling application.
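The idea of a processing chain can be illustrated with a minimal Python sketch. It is not the Document Reader API: the names `Document`, `classify`, `extract` and `build_chain` are hypothetical, and the keyword and regular-expression rules are stand-ins for real classification and extraction modules. The sketch only shows how several modules can be composed so that the caller sees a single block.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical document record passed from module to module."""
    text: str
    doc_class: str = "unknown"
    fields: Dict[str, str] = field(default_factory=dict)


# A module is any callable that takes a document and returns a document.
Module = Callable[[Document], Document]


def classify(doc: Document) -> Document:
    # Stand-in rule: a simple keyword lookup decides the document class.
    if "invoice" in doc.text.lower():
        doc.doc_class = "invoice"
    return doc


def extract(doc: Document) -> Document:
    # Stand-in rule: pull an invoice number out of invoices only.
    if doc.doc_class == "invoice":
        match = re.search(r"Invoice No\.?\s*(\d+)", doc.text)
        if match:
            doc.fields["invoice_number"] = match.group(1)
    return doc


def build_chain(modules: List[Module]) -> Module:
    """Compose modules so the whole chain looks like one callable."""
    def run(doc: Document) -> Document:
        for module in modules:
            doc = module(doc)
        return doc
    return run


# The calling application only sees `process`, not the individual modules.
process = build_chain([classify, extract])
result = process(Document("Invoice No. 4711 from ACME"))
```

Because every module shares the same signature, modules can be reordered, removed or added in the list passed to `build_chain` without changing the calling code.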
Open Text Capture Document Reader offers significant enhancements above and beyond the functions offered by Open Text Capture Document Extraction (DOKuStar):
- In addition to rule-based classification and data extraction, Open Text Capture Document Reader offers a learning procedure. Referred to as ART (Adaptive Read Technology), this patented procedure lets users show the system where the required information is located by moving the mouse pointer over sample documents. ART can then find this information in similar documents. The procedure considerably simplifies system optimisation and noticeably increases recognition performance within the applications.
- The rule-based classification in Open Text Capture Document Reader can be supplemented with a self-learning process, Adaptive Classification Technology (ACT). The ACT module provides automated, content-based classification of unstructured documents. Its high-performance, easy-to-use administration tool allows recognition accuracy to be optimised significantly.
- Adobe's Portable Document Format (PDF) is becoming increasingly important in the world of documents. The PDF/A standard makes archived PDF documents future-proof by ensuring that they remain viewable in the long term. This makes PDF a very real alternative to TIFF for archiving scanned documents.
- Open Text Capture Document Reader processes multi-page documents. The software also supports complex document structures with sub-documents. As a result, even extremely demanding mail inbox solutions can be implemented quickly.
- The recognition functions can be quickly adapted to specific project requirements using the various programming interfaces. An off-the-shelf system, by contrast, cannot always meet the needs of demanding projects and users. Company databases and flow controls that depend on the content of a document can be accessed via .NET or COM interfaces.
- Load balancing in a server cluster ensures that Open Text Capture Document Reader always uses the available processing power. The server interface is just as simple to use as the interface of the recognition software itself. This ensures that Open Text Capture Document Reader can be directly integrated into any enterprise application.
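The load balancing mentioned in the last point can be pictured with a short Python sketch. How the product actually distributes work is not described here, so this is only one common approach, shown under stated assumptions: a greedy least-loaded assignment in which each document goes to the server with the fewest pending pages. The function `assign_jobs` and the node names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def assign_jobs(page_counts: List[int], workers: List[str]) -> Dict[str, List[int]]:
    """Greedy least-loaded assignment (illustrative only).

    Each job, identified by its index in `page_counts`, is given to the
    worker that currently has the smallest number of pending pages.
    """
    # Min-heap of (pending_pages, worker_name); ties break on the name.
    heap: List[Tuple[int, str]] = [(0, name) for name in workers]
    heapq.heapify(heap)
    assignment: Dict[str, List[int]] = {name: [] for name in workers}

    for job_id, pages in enumerate(page_counts):
        load, name = heapq.heappop(heap)          # least-loaded worker
        assignment[name].append(job_id)
        heapq.heappush(heap, (load + pages, name))
    return assignment


# Four documents with 10, 2, 3 and 8 pages spread over two hypothetical nodes.
plan = assign_jobs([10, 2, 3, 8], ["node-a", "node-b"])
```

In this example the large first document occupies one node while the three smaller documents are routed to the other, which is the kind of behaviour that keeps every server in a cluster busy.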
Related Documents
Open Text Capture Document Reader Brochure (English - PDF)
