
1. TextConverter

SiMX TextConverter is a powerful and easy-to-use tool for extracting data from popular document formats such as PDF, DOC, RTF, XLS, HTML, CSV, XML, or just about any file that contains text. TextConverter offers a flexible and intuitive visual interface combined with advanced scripting support for professional programmers.

1. Overview

Learn the basics of TextConverter:


2. User Manual

More in-depth instruction on special tasks, and other information:

A. TextConverter, a functional inventory

i. Operating Mode  Set the mode to decide how TextConverter will operate.

ii. User Interface  A more detailed look at the SiMX TextConverter user interface.

iii. Templates  A guide to understanding your document for the effective use of visual Templates in Auto Mode.

iv. Template Properties  Template properties in Auto Mode establish the action for each template object.


B. Learning Tools

i. Tutorials  You can use TextConverter with almost any document that contains text. This page lists several video tutorials to help you successfully implement your TextConverter projects.

ii. Video Tutorials  Here you can find video tutorials to assist with your conversion needs.


3. Developer's Reference:

Information on using scripts to get even more functionality from TextConverter:


What is TextConverter?

TextConverter solves a common problem: you have data in files (PDF, DOC, legacy reports, etc.) or web pages (HTML), and you need to get this data into your system (accounting system, database, etc.). TextConverter automatically extracts the data from your files and prepares it in a format that is ready for your system.

How does TextConverter do it?

Most computer-generated documents contain data organized into a multi-level hierarchical structure with recurring patterns of layout and formatting. A typical printed report consists of a hierarchy of chapters containing summary information, each with a series of details located on the lowest level of the hierarchy.

TextConverter can automatically reverse-engineer your input files back to the framework of the system that produced them. The task of extracting data from such documents comes down to identifying data fields and records, extracting the data, and saving it as related tables, thus reconstructing the original relational database structure that was used to generate the document.
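
To make the idea of fields, records, and related tables concrete, here is a minimal sketch in Python. It is not TextConverter's template engine or scripting API; the sample report, the regular expressions, and every name in it (REPORT, HEADER, DETAIL, parse_report) are hypothetical and chosen only for illustration. It reads a two-level report and rebuilds a parent table and a child table linked by a key.

```python
import re

# Hypothetical two-level report: a customer header line followed by
# indented order detail lines. The layout is an assumption for this sketch.
REPORT = """\
Customer: 1001 Acme Corp
  2024-01-05  Widget      3   29.97
  2024-01-09  Gasket     10   12.50
Customer: 1002 Globex
  2024-02-11  Sprocket    1    8.00
"""

# One pattern per level of the hierarchy (both are illustrative assumptions).
HEADER = re.compile(r"^Customer:\s+(\d+)\s+(.+)$")
DETAIL = re.compile(r"^\s+(\d{4}-\d{2}-\d{2})\s+(\S+)\s+(\d+)\s+(\d+\.\d{2})$")


def parse_report(text):
    """Return (customers, orders): a parent table and a related child table."""
    customers = []
    orders = []
    current_id = None  # key of the most recent header record
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current_id = int(header.group(1))
            customers.append(
                {"customer_id": current_id, "name": header.group(2).strip()}
            )
            continue
        detail = DETAIL.match(line)
        if detail and current_id is not None:
            date, item, qty, amount = detail.groups()
            # Each detail row carries the parent's key, which is what
            # relates the two tables.
            orders.append(
                {
                    "customer_id": current_id,
                    "date": date,
                    "item": item,
                    "qty": int(qty),
                    "amount": float(amount),
                }
            )
    return customers, orders


customers, orders = parse_report(REPORT)
for row in customers:
    print(row)
for row in orders:
    print(row)
```

Each order row repeats the customer_id of the header it appeared under, so the two lists can be loaded into a database as related tables.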

Before TextConverter, the only options were time-consuming and expensive custom parsing software or, in some relatively simple cases, regular-expression-based tools like Datawatch's Monarch. TextConverter is more powerful and efficient than Datawatch's Monarch and less expensive than employing a programmer to build custom software. TextConverter has also replaced many manual data entry processes, saving our customers substantial sums of money.

What's new in TextConverter 4?
TextConverter 4 offers a host of new features: