TextConverter's "this" Object

TextConverter methods and properties let a script access the different TextConverter objects and their data in order to modify TextConverter's default behavior. The methods and properties are referenced from the script through the "this" keyword: this.MethodName. The Script editor shows all object methods when a period is typed after "this" (pictured below). Once the hint list appears, the method or property name can be typed manually or selected from the list.

Most methods can be used only in the context methods that are called once per input record (OnRecord, OnRecordDone). The only exception is the IsRunning method, which can be called in any context method.

Properties

append

This property provides access to TextConverter's output option. Append is a boolean. When append is true, TextConverter appends to the existing output file or table, or attempts to create the output file or table if it does not exist. When append is false, TextConverter always attempts to create the output file or table.

You can read or update this property.
 

buffer

Buffer related methods and functions are grouped under this object.  In the functions section below see AddRecord, Flush, GetCount, GetField, and GetRecord.

Buffer supports an auto-fill mode. In this mode, input records are placed in the buffer automatically (up to the ‘Buffer size’ property). When the buffer is full, it works as a FIFO queue. When ‘auto buffer’ mode is OFF, the buffer’s behavior and size are controlled by the GetField method, which allows scrolling to a buffer record and getting a field value in a single operation.
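The FIFO behavior described above can be pictured with a small stand-alone model in plain VBScript. This is only a sketch: the FifoBuffer class and its Init, Add, Count, and Item members are hypothetical names invented for illustration, not members of TextConverter's buffer object, and the model assumes that a full buffer drops its oldest record when a new one arrives.

```vbscript
' Hypothetical model of a fixed-size FIFO buffer (not TextConverter's buffer object).
Class FifoBuffer
    Private items()
    Private used
    Private capacity

    ' size is assumed to be 1 or greater.
    Public Sub Init(size)
        capacity = size
        used = 0
        ReDim items(size - 1)
    End Sub

    ' Adds a record; when the buffer is full, the oldest record is dropped first.
    Public Sub Add(record)
        Dim i
        If used = capacity Then
            For i = 1 To used - 1
                items(i - 1) = items(i)   ' shift the remaining records forward
            Next
            used = used - 1
        End If
        items(used) = record
        used = used + 1
    End Sub

    Public Function Count()
        Count = used
    End Function

    ' index is zero-based: 0 is the oldest record still held.
    Public Function Item(index)
        Item = items(index)
    End Function
End Class

Dim buf
Set buf = New FifoBuffer
buf.Init 2
buf.Add "rec1"
buf.Add "rec2"
buf.Add "rec3"
' buf.Count is now 2; buf.Item(0) is "rec2" and buf.Item(1) is "rec3"
```

With a size of 2, the third record pushes out the first, so a script reading the buffer always sees the most recent records in arrival order.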

The buffer command syntax found in previous versions (AddToBuffer...) does not work in the current version of TextConverter. In order to have an error-free operation, make sure to change any of the older buffer methods to the new group method.

DictIn

This property provides access to the input fields. The input field values are automatically initialized by TextConverter for each new input record. You may want to retrieve the input field values in your script in order to customize field delimiter recognition or to modify the calculation of output fields.  (See samples and walkthroughs for more information)


You can select an input field from the list that appears when you type DictIn. Each input field has a single property, value, which you can use to get the data from the field.

VB sample

Function OnRecord
DictOut.Field.value = Left( DictIn.Field1.value, 3 ) & Left( DictIn.Field2.value, 4 )
End Function

DictOut

This property provides access to the output fields. All connected output fields are assigned values from the corresponding input fields automatically by TextConverter for each new input record. You can make any customization to the output values through DictOut.  (See samples and walkthroughs to learn more)

You can select an output field from the list that appears when you type DictOut. Each output field has a single property, value, which you can set or get.

VB sample

Function OnRecord
If DictOut.Field.value > 100 Then DictOut.Field.value = 100
End Function

DSout

DSout-related properties, methods, and functions are grouped under this object. These methods relate directly to the output dictionary (internal data source). They cannot be used to modify or use an external data source (type "DB." for a list of methods that modify external sources).

Examples:

You can set the output table in two ways: (a) by setting the table name (assuming you are connected to a database), or (b) by setting the entire data source.

If you are writing a script outside TextConverter, the API looks like this:

Set DS = TextConverter.GetOutputDS()

DS.table = "tablename"                 ' - (a)

DS.SetDS( "c:\data\output.dbf" )       ' - (b)

If you are writing a script inside TextConverter, it looks like this:

This.DSOut.table = "tablename"               ' - (a)

This.DSOut.SetDS( "c:\data\output.dbf" )     ' - (b)

An Excel file is also treated as a database, so you can set the worksheet as:

Set DS = TextConverter.GetOutputDS()

DS.table = "worksheetname"                 ' - (a)

DS.SetDS( "...\file.mdb:worksheetname" )   ' - (b)


Methods

SkipRecord

Call this method from your implementation of the OnRecord context method to indicate that the current output record should not be inserted into the output database table. This function is usually used with heterogeneous input text, when multi-level information must be filtered so that only a specific level is inserted into the output database. (See samples and walkthroughs for more information.)

Function SkipRecord()

Parameters:
none

Return value:
none

VB sample

Function OnRecord
If Left(DictIn.Field1.value, 8) = "Subtotal" Then this.SkipRecord()
End Function

GetValueByTag

Use GetValueByTag when the input fields are marked by tags and are not separated by constant delimiters. After the input value is retrieved, it can be assigned to an arbitrary output field.

Function GetValueByTag( tag, delimiter(optional), start(optional) )

Parameters:
string tag - the string to search for in the input record. If the tag is found, the input field begins at the character immediately after it.
string delimiter(optional) - a string marking the end of the field. You can use the field delimiter syntax to specify multiple delimiters; if this parameter is not set, the field delimiter option is used. For special symbols, use either the VBScript string constants (vbTab - horizontal tab; vbCr - carriage return; vbLf - line feed) or the markers <Tab> - horizontal tab; <CR> - carriage return; <LF> - line feed; <FF> - form feed.
integer start(optional) - the starting position for the search. Set this parameter to a value greater than 0 to start the search from that position (the first position is 1), or to 0 to continue from the position of the previous successful search. The default is 1, the beginning of the line.

Return value:
string - the field's value; empty if the field is not found or has an empty value.

VB sample

You could handle an input like this:

   Author="Aberer, Karl",
   Title="The Use of Object-oriented Data Models in Biomolecular Databases",
   BookTitle="Conf. on Object-Oriented Computing in the Natural Sciences",
   Address="Heidelberg, Germany",
   Year=1994

with the following script:

Function OnRecord
DictOut.Field1.value = this.GetValueByTag( "Author" )
DictOut.Field2.value = this.GetValueByTag( "Title" )
DictOut.Field3.value = this.GetValueByTag( "BookTitle" )
DictOut.Field4.value = this.GetValueByTag( "Address" )
DictOut.Field5.value = this.GetValueByTag( "Year" )
End Function
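The lookup rule described in the parameter list (find the tag, start the field at the next character, stop at the delimiter) can be modeled in a few lines of plain VBScript. This is a rough sketch for illustration, not TextConverter's implementation: ValueAfterTag is a hypothetical helper, it handles a single delimiter string only, it does not support the start = 0 "continue from the previous search" mode, and the example passes the "=" sign as part of the tag, which is an assumption of the sketch.

```vbscript
' Hypothetical helper modeling the documented tag lookup rule.
' startPos must be 1 or greater (1 = beginning of the record).
Function ValueAfterTag(record, tag, delimiter, startPos)
    Dim tagPos, fieldStart, fieldEnd
    ValueAfterTag = ""                               ' empty when the tag is not found
    tagPos = InStr(startPos, record, tag)
    If tagPos = 0 Then Exit Function
    fieldStart = tagPos + Len(tag)                   ' field begins right after the tag
    fieldEnd = InStr(fieldStart, record, delimiter)
    If fieldEnd = 0 Then fieldEnd = Len(record) + 1  ' no delimiter: take the rest of the record
    ValueAfterTag = Mid(record, fieldStart, fieldEnd - fieldStart)
End Function

Dim yearValue, titleValue
yearValue = ValueAfterTag("Year=1994", "Year=", ",", 1)             ' "1994"
titleValue = ValueAfterTag("Title=Abc,Year=1994", "Title=", ",", 1) ' "Abc"
```

The model returns an empty string both when the tag is missing and when the field itself is empty, which matches the return value described above.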

GetVar

This method is available only in TextConverter Developer (not in TextConverter Standard). Call this function to set the variable both in and out of TextConverter in automation (API).

Function GetVar( name )

Parameters:
string name - the name of the variable, as passed in the sample below

Return value:
variable

VB sample

set crt = cnv.GetVar ("crt")  

GoNext

Call this function to move through the records of the selected database.

Function GoNext()

Parameters:
none

Return value:
none

VB sample

If IsSSN(c) Then
  FindSSN = p
  Exit Function
Else
  this.GoNext
End If

Cancel

Call this function to stop the conversion procedure. No output records will be appended to the output database table after you call this method.

Function Cancel()

Parameters:
none

Return value:
none

VB sample

Function OnRecordDone( ok )
If Not ok Then this.Cancel()
End Function 

FStrGet

Call this function to receive extended information about a string. Can be used in conjunction with IsFontBold and IsFontItalic.

Function FStrGet(rec_num, from_pos, to_pos)

Parameters:
numeric rec_num - the number of the record to test
numeric from_pos - the position in the record at which the test begins
numeric to_pos - the position in the record at which the test ends

Return value:
variant text - the text tested
array segments - the starting points of the segments in the text
array length - the number of characters or spaces between the segments
boolean font ID - true if the font matches the ID being tested for (either bold or italic); false if it does not

VB sample

Function OnRecord

      r = this.GetRecordNumber
      if r <> 10 and r<>15 then exit function 'looks at lines 10 and 15

      a = this.FStrGet(r, 1, 10)

      c = ""
      for i = 1 to ubound(a)

           for j = 0 to ubound(a(i))
                c = c & a(i)(j) & ","
           next
           c= c & "|"
      next

      target.Message this.IsFontBold(a(3)(0))

End Function 

CloseOutputDS

Call this function to close the output data source. It can be used when a data source needs to be closed after appending. This function can only be used in the OnFinishProcess part of the script.

Function CloseOutputDS()

Parameters:
none

Return value:
none

VB sample

Function OnFinishProcess
this.CloseOutputDS()
End Function 

GetInputFields

Use this function to get the input field names, for example to insert them into an output field.

Function GetInputFields()

Parameters:
none

Return value:
variant - the input field names

VB sample

Function OnRecord
DictOut.InputFields.value = this.GetInputFields
End Function 

IsRunning

This function is helpful if some portions of your script should run only in real mode, not in preview mode. Preview mode runs whenever a change is made to your setup, if the preview auto update option is turned on. Real mode runs when you press the Run button, i.e. when the actual conversion process is started.

Function IsRunning()

Parameters:
none

Return value:
boolean - true if the script is running in real mode; false if it is running in preview mode

VB sample

Create a proxy database table only in the real mode.

Function OnStartProcess( ok )
If this.IsRunning() Then db.Create()
End Function 

System buffer management methods (Advanced)

Sometimes it is not enough to operate on a per record basis. Let's assume that your input text consists of the following text areas:

  219434342  GASKET SET       123.35    3      370.05     19980301     19980630
  798734776  OIL PUMP         345.23    1      345.23     19970403     19960704
  872375762  VALVE GUIDE       10.25    8       82.00     19980523     19980723
  897987237  VALVE SPRING       4.95    8       39.60     19980523     19980723
  987987765  PISTON SET       805.00    2    1,610.00     19980523     19930913

SUBTOTAL ENGINE PARTS                  22    2,446.88         

  987293744  TORSION BAR      218.50    2      437.00     19980427     19981010
  098230984  SWAY BAR         399.00    1      399.00     19981010     19970119
  958430987  CV JOINTS        318.75    3      956.25     19881112     19980909
  092834844  STRUT ASSEMBLY   449.00    3    1,347.00     19970617     19970530

SUBTOTAL SUSPENSION                     9    3,139.25

and we would like to merge the subtotal information for each group into each detail record. The methods that accumulate, query, and extract data from the system buffer make this task easier.

First of all, we do not want TextConverter to insert any output records into the database until we encounter the closing line of each text block. Therefore we call SkipRecord at the beginning of the OnRecord method implementation (line 2). Then we accumulate all input records in the system buffer (line 16) until we encounter a closing line (line 5). After that, we iterate through all of the accumulated records (line 6), take some data from the closing line (lines 7-9), fetch an input record from the buffer (line 10), and get some more data for each detail record (lines 11-12). Now we can insert an output record into the output database table (line 13). Finally, we flush the system buffer (line 15) to get ready for the next block of records.

 1. Function OnRecord
 2. this.SkipRecord()
 3. Dim input
 4. input = DictIn.Field_1.value
 5. If Left(input, 3) = "SUB" Then
 6.      For I = 0 To this.GetBufferCount() - 1
 7.          DictOut.GroupName.value = Mid( input, 10, 25 )
 8.          DictOut.GroupQty.value    = Mid( input, 43, 4 )
 9.          DictOut.GroupCost.value  = Mid( input, 49, 10 )
10.          this.GetFromBuffer( I )
11.          DictOut.ItemNo.value  = Mid( DictIn.Field_1.value, 3, 9 )
12.          DictOut.Descript.value = Mid( DictIn.Field_1.value, 14, 19 )
13.          this.AppendRecord()
14.     Next
15.     this.FlushBuffer()
16. Else this.AddToBuffer()
17. End If
18. End Function
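To see the accumulate, merge, and flush pattern in isolation, the same logic can be run on a plain array of lines outside TextConverter. The sketch below is a stand-alone model in plain VBScript, not TextConverter code: MergeSubtotals is a hypothetical function name, the " | " separator is an arbitrary choice, and the whole subtotal line is attached to each detail line instead of being split into fields with Mid.

```vbscript
' Hypothetical stand-alone model of the accumulate / merge / flush pattern.
' MergeSubtotals is an illustrative name, not a TextConverter method.
Function MergeSubtotals(lines)
    Dim held(), heldCount, merged(), mergedCount, i, j
    heldCount = 0
    mergedCount = 0
    ReDim held(UBound(lines))
    ReDim merged(UBound(lines))

    For i = 0 To UBound(lines)
        If Left(lines(i), 3) = "SUB" Then
            ' Closing line: emit every held detail line with the subtotal attached.
            For j = 0 To heldCount - 1
                merged(mergedCount) = Trim(held(j)) & " | " & Trim(lines(i))
                mergedCount = mergedCount + 1
            Next
            heldCount = 0                      ' flush the held records
        ElseIf Trim(lines(i)) <> "" Then
            held(heldCount) = lines(i)         ' accumulate a detail line
            heldCount = heldCount + 1
        End If
    Next

    If mergedCount = 0 Then
        MergeSubtotals = Array()
    Else
        ReDim Preserve merged(mergedCount - 1)
        MergeSubtotals = merged
    End If
End Function

Dim rows
rows = MergeSubtotals(Array( _
    "  219434342  GASKET SET", _
    "  798734776  OIL PUMP", _
    "SUBTOTAL ENGINE PARTS", _
    "  987293744  TORSION BAR", _
    "SUBTOTAL SUSPENSION"))
' rows(0) is "219434342  GASKET SET | SUBTOTAL ENGINE PARTS"
' rows(2) is "987293744  TORSION BAR | SUBTOTAL SUSPENSION"
```

Detail lines that are never followed by a closing line are held but not emitted, which mirrors the effect of calling SkipRecord for every record until a subtotal arrives.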

AddToBuffer

 AddToBuffer adds the current input record to the system buffer. 

Function AddToBuffer()

Parameters:
none

Return value:
none

(See the sample for System buffer management methods )

GetBufferCount

Returns the number of input records in the system buffer.

Function GetBufferCount()

Parameters:
none

Return value:
integer - the number of records in the system buffer

(See the sample for System buffer management methods.)

GetFromBuffer

Fetches an input record from the system buffer using its index and initializes the input fields of the DictIn object. You can consider the fetched record to be the current input record.

Function GetFromBuffer(index)

Parameters:
integer index - the index of the record to fetch from the system buffer. The value must be greater than or equal to 0 and less than GetBufferCount().

Return value:
none

 (See the sample for System buffer management methods.)

FlushBuffer

 Flushes the content of the system buffer to get it ready for another cycle of record accumulation.

Function FlushBuffer()

Parameters:
none

Return value:
none

 (See the sample for System buffer management methods.)

AppendStart

Prepares for a record insertion into the output database table. Useful for splitting an input record into several output records or for simply generating a new output record. You must finish the process with AppendRecord (described below).

Function AppendStart()

Parameters:
none

Return value:
none

VB sample

Function OnRecord
this.AppendStart
DictOut.Field1.value = val1
DictOut.Field2.value = val2
DictOut.Field3.value = val3
this.AppendRecord
End Function

'Example 2
'----------------- OnStartProcess -----------------
Function OnStartProcess 'creates header record
this.AppendStart
DictOut.Field_1.value = "HEADER RECORD" 'literal value can be anything
this.AppendRecord
End Function


AppendRecord

Inserts the current output record into the output database table. Returns the unique identifier of the record if the underlying database system supports one.

Function AppendRecord( get_id(optional) )

Parameters:
boolean get_id(optional) - true: ask for the record's unique identifier to be returned if possible. Default - false

Return value:
variant - the record's unique identifier if applicable

VB sample

Function OnRecord
this.AppendStart
DictOut.Field1.value = val1
DictOut.Field2.value = val2
DictOut.Field3.value = val3
this.AppendRecord
End Function

'Example 2
'----------------- OnFinishProcess -----------------
Function OnFinishProcess 'creates footer
this.AppendStart
For i = 1 to dictout.GetFieldCount(): Dictout.SetFieldValue i, Empty: Next ' clears the fields of prior data
DictOut.Field_1.value = "FOOTER RECORD" ' inserts a literal, but could be a variable
this.AppendRecord

this.AppendStart 'creates empty line
For i = 1 to dictout.GetFieldCount(): Dictout.SetFieldValue i, Empty: Next  ' clears the fields of prior data
this.AppendRecord
End Function


GetInputFile

Returns the file name of the current input data source, for example to insert it into the output database table.

Function GetInputFile()

Parameters:
none

Return value:
variant - the input data source file name

VB sample

Function OnRecord
DictOut.Field1.value = this.GetInputFile
End Function

GetOutputFields

Use this function to get the output field names, for example to insert them into an output field.

Function GetOutputFields()

Parameters:
none

Return value:
List of output fields

VB sample

Function OnRecord
DictOut.InputFields.value = this.GetOutputFields
End Function 

GetPageNumber

Returns the number of the page from which the current record was obtained, for example to insert it into the output.

Function GetPageNumber()

Parameters:
none

Return value:
variant - the input data source page number.

VB sample

Function OnRecord
DictOut.SetFieldValue "page no", this.GetPageNumber()
End Function

GetRecordNumber

Returns the current record number for use in your script.

Function GetRecordNumber()

Parameters:
none

Return value:
variant - the record number.

VB sample

Function OnRecord
If this.GetRecordNumber() = 2 Then title = Trim(DictIn.Parcel.value)
End Function

IsDelimited

When using automatic extraction with the "Show all lines" option, IsDelimited distinguishes records extracted by implied delimiters from other text data in the document.

The main distinction between the lines suppressed by default (or shown with the "Show all lines" option) and the rest of the document is that, for the rest of the document, TextConverter's AI was able to identify implied delimiters. This function separates the data that is delimited from the data that is not. The example script uses SkipRecord to keep the non-delimited records out of the automated extraction. You can then extract those records manually.

Function IsDelimited()

Parameters:
none

Return value:
boolean - true if the current record is delimited; false otherwise

VB sample

Function OnRecord
If Not this.IsDelimited() Then this.SkipRecord
End Function

SetInputFile

Defines the input file when processing multiple documents. It can be called only from OnSetUp (or through the API). Do not call SetInputFile from OnStartProcess, as the results may be inconsistent.

Function SetInputFile( file )

Parameters:
string file - the desired input file

Return value:
None

VB sample

Function OnSetUp
dim inputA
inputA = "C:\Users\Name\Desktop\dataquick.pdf"
this.SetInputFile(inputA)
End Function

Related Sections

Conversion Concept

Setting up a conversion process step-by-step

Samples and Walkthroughs

Scripting

Automation
