Concept

Introduction

TextConverter solves a common problem. You have data in files (PDF, DOC, etc.) or web pages (HTML), and you need to get this data into your system (accounting system, database, etc.). TextConverter automatically extracts the data from your files and prepares it in a format that is ready for your system. How does TextConverter do it? Most computer-generated documents contain data organized into a multi-level hierarchical structure with recurring patterns of layout and formatting. A typical printed report consists of a hierarchy of chapters containing summary information, each with a series of details located on the lowest level of the hierarchy.

TextConverter can automatically reverse engineer your input files back to the framework of the system that produced them. Extracting data from such documents comes down to identifying data fields and records, extracting the data, and saving it as related tables, thus reconstructing the original relational database structure that was used to generate the document.
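To make the idea of "saving data as related tables" concrete, here is a minimal Python sketch. It is not TextConverter's algorithm or API; the report layout, the two regular expressions, and the names `REPORT`, `HEADER`, `DETAIL`, and `report_to_tables` are all assumptions made up for illustration. It only shows how a two-level report (summary lines followed by detail lines) maps onto a parent table and a child table linked by a key.

```python
import re

# Hypothetical report text: each "Customer" line is a summary record,
# followed by indented detail lines (item, quantity, price).
REPORT = """\
Customer: Acme Corp    Region: West
  Widget        3    4.50
  Gasket       10    0.25
Customer: Globex       Region: East
  Sprocket      2   12.00
"""

# Assumed layout patterns for the two levels of the hierarchy.
HEADER = re.compile(r"^Customer:\s+(?P<name>.+?)\s+Region:\s+(?P<region>\w+)\s*$")
DETAIL = re.compile(r"^\s+(?P<item>\w+)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})\s*$")


def report_to_tables(text):
    """Split a two-level report into a parent table and a child table."""
    customers = []  # parent table: one row per summary line
    items = []      # child table: one row per detail line
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            customers.append({
                "customer_id": len(customers) + 1,
                "name": header["name"],
                "region": header["region"],
            })
            continue
        detail = DETAIL.match(line)
        if detail and customers:
            # The foreign key points at the most recent summary record.
            items.append({
                "customer_id": customers[-1]["customer_id"],
                "item": detail["item"],
                "qty": int(detail["qty"]),
                "price": float(detail["price"]),
            })
    return customers, items


customers, items = report_to_tables(REPORT)
```

Each detail row carries the `customer_id` of the summary line above it, which is the relationship a relational database would store as a foreign key.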

Before TextConverter, the only options were time-consuming and expensive parsing with custom-programmed software or, in some relatively simple cases, regular-expression-based tools like Datawatch's Monarch. TextConverter is more powerful and efficient than Datawatch's Monarch and less expensive than employing a programmer to build custom software. TextConverter has also replaced many manual data entry processes, saving our customers substantial sums of money.

TextConverter provides a much more efficient solution based on the unbeatable combination of automatic pattern recognition and scripting support.  TextConverter delivers the best of both worlds:  the ease of automation and the virtually unlimited flexibility of VBScript programming.

High Performance

TextConverter has no data volume limitations and reaches its highest processing speed when used without scripting. Our testing shows a substantial increase in performance over competing solutions. TextConverter's algorithms are optimized for ordinary machines and do not require large blocks of memory or specialized hardware. TextConverter's performance scales directly with faster processors and faster I/O devices.

Automation

TextConverter is a programmable automation component equipped with application programming interfaces (APIs). It can be integrated into your applications using virtually any programming environment (including scripting engines). The run-time automation modules can load and customize projects produced by the TextConverter authoring tool. You can find the information necessary for working with TextConverter as an automation component here. The redistribution instructions can be found in the Developer version. We will be happy to assist you with configuring your first project at little or no cost (depending on the complexity of the project).

Technical Support

After purchasing TextConverter, you will have access to our technical support service, provided through the Customer Support and Data Extraction Service sections on our site. You can use them to get help configuring your TextConverter projects. Submit your task, and you will receive a fully configured TextConverter project within a few hours. Customers are also eligible for up to 30 minutes of free project development as well as free online support.

Data Conversion Service

At any time, you can submit your task to us using SiMX's Data Extraction Service (or via email) and, within a few hours, receive converted data and/or a configured TextConverter project for automating data processing locally.

Step by step

Computer-generated text files have an implicit structure. Typically, this structure includes rigid recurring patterns that, while not always obvious, can be converted into highly structured data. TextConverter's greatest benefit is the ability to break down this complex task of pattern recognition into simple steps.
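As a rough illustration of what "rigid recurring patterns" means, the Python sketch below reduces each line of a text file to its layout shape and counts how often each shape repeats. This is only an assumed, simplified technique for exposition, not a description of how TextConverter recognizes patterns; `SAMPLE`, `line_signature`, and `recurring_patterns` are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical input: invoice header lines followed by indented detail lines.
SAMPLE = """\
Invoice 1001  2024-01-05
  Widget  3  4.50
  Gasket  10  0.25
Invoice 1002  2024-01-09
  Sprocket  2  12.00
"""


def line_signature(line):
    """Reduce a line to its layout shape: digit runs -> 9, letter runs -> A."""
    shape = re.sub(r"[0-9]+", "9", line.rstrip())
    shape = re.sub(r"[A-Za-z]+", "A", shape)
    # Collapse longer runs of spaces so column padding does not matter.
    return re.sub(r" {2,}", "  ", shape)


def recurring_patterns(text, min_count=2):
    """Return (signature, count) pairs for shapes seen at least min_count times."""
    counts = Counter(
        line_signature(line) for line in text.splitlines() if line.strip()
    )
    return [(sig, n) for sig, n in counts.most_common() if n >= min_count]


patterns = recurring_patterns(SAMPLE)
```

In this sample, the three detail lines share one shape and the two invoice lines share another, which suggests two record types: one for each level of the hierarchy.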

At a high level, the steps of the conversion process are as follows:

1. Open the file to be converted.

2. Use text highlighting in the input text preview pane to identify field values, tags, or entire records.

3. Use fly or main menu commands to generate the input dictionary and additional templates.

4. Adjust the properties of the templates and input field objects to achieve the desired results.

5. Connect to your output database.

6. Run the conversion!

To learn more, see TextConverter's Concept Explained.

Related Sections

Getting Started

User Interface

Setting up a conversion: Step-by-Step

Samples and Walkthroughs

Programmer's Reference