Multi-line Documents

 A typical multi-line conversion/data extraction process follows these steps:

  1. Load the input file by dropping the file on TextConverter or choosing "Load input file" from the file menu
  2. Click on the set of lines that make up one record of detail information, then right-click and choose the green check mark, "Setup fields from selected lines"
  3. Make adjustments to the input fields as required
  4. Right-click away from the detail lines and choose "add top level template"
  5. Set up a tag-based or other top-level template as required
  6. Set the output data source or data file that will receive the extracted data
  7. Save the project and set up any automation for processing additional files
  8. Run the conversion!

The Kice Quote sample is a good way to demonstrate creating a multi-line field. First, load the Kice Quote PDF into TextConverter and set up a template on the data line, just as you would for a Single Line Positional Template.

Multi-level field

  1. You will notice that the single-line setup divided the line into several fields. However, the multi-line portion is everything from P4 onward, so you must merge P4 and all the fields after it.
  2. After you merge the fields, you will see that TextConverter picks up a lot of extraneous information. To handle this, find a piece of information that is specific to the desired lines. For example, the first field of each desired line is a numeric value. P1 is currently set to be a "String" value, but we can change it to "Numeric" so that lines without a number in that position are no longer picked up.
  3. Now that we have the desired results, we can set up the multi-line field. The field we want to extract across multiple lines is the fourth field of the detail layer (P4).
  4. Select P4, and in the Field Properties pane, check "Multiple-value."
  5. You may need to drag the field over to pick up all the desired data.
  6. You will notice that TextConverter picks up a lot more than we want, as it assumes everything before the next template is desired data.
  7. To account for this, set a bottom boundary tag in the Field Properties pane. For this example, we will use "Lead Time," as it comes after all the desired data and is a good separator. However, we cannot be sure that there will always be exactly one space between the two words, so we use a regular expression ("Lead +Time"), which matches one or more spaces.
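
The logic behind these steps can be illustrated outside TextConverter. The Python sketch below is not how TextConverter is implemented. It only mimics the idea: a detail line is recognized by its numeric first field, the lines that follow are collected into P4, and collection stops at the "Lead +Time" boundary. The sample text, the four-column layout, and the function name extract_records are invented for illustration and are not taken from the Kice Quote PDF.

```python
import re

# Hypothetical sample text (NOT the real Kice Quote content).
SAMPLE = """\
Item  Qty  Part     Description
1     2    AB-100   Widget housing
                    with mounting bracket
                    painted gray
Lead   Time: 4 weeks
2     1    CD-200   Filter element
Lead Time: 2 weeks
"""

# Bottom boundary tag: "Lead", one or more spaces, then "Time".
BOUNDARY = re.compile(r"Lead +Time")

# A detail line starts with a numeric first field (P1), like the
# "Numeric" setting in the walkthrough. The layout is an assumption.
DETAIL = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(.*)$")


def extract_records(text):
    """Collect detail records, gathering continuation lines into P4."""
    records = []
    current = None  # record currently accepting continuation lines
    for line in text.splitlines():
        if BOUNDARY.search(line):
            current = None  # boundary reached: stop collecting P4
            continue
        match = DETAIL.match(line)
        if match:
            current = {
                "P1": int(match.group(1)),
                "P2": int(match.group(2)),
                "P3": match.group(3),
                "P4": [match.group(4).strip()],
            }
            records.append(current)
        elif current is not None and line.strip():
            # Non-detail line inside an open record: part of P4.
            current["P4"].append(line.strip())
    return records


if __name__ == "__main__":
    for record in extract_records(SAMPLE):
        print(record["P1"], record["P3"], " / ".join(record["P4"]))
```

Because the pattern uses " +" rather than a single literal space, both "Lead Time" and "Lead   Time" end the multi-line field, while the header line is skipped because its first field is not numeric.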


Congratulations, you have set a multiple-line field!
