In this exercise we use an external file (or database) to transform data extracted from an input data source. In particular, we will add the county name to addresses in our input data source. First we will add fields for the countys to the output dictionary, then we will use use the ZIP code in our data source to look up each county from an external database file. Finally, we will load the county name in the output database or flat file. For this exercise, in addition to the input data source, we will also need the file "ca_zip.dbf" provided as a sample file with your software. location: InstallDir/Samples/TextConverter/Automatic Mode/2 With script/ project file: InstallDir/Samples/TextConverter/Automatic Mode/2 With script/7 TansformLookup.ConverterX 1. Clear the project by pressing the New button (
2. Make sure the "Open file as" option is set to "Pdf".
3. Open the sample file provided with the software named "dataquick.pdf" 6. Create fields to hold the name of the county for the lender and the site. a. Go to the output dictionary (upper right pane) and scroll down the list to the bottom.
b. Create two new fields by right clicking in the space below the last field and select "New Field" form the fly menu.
c. Name the first new field "LenderCounty" and the second new field "SiteCounty", set the type to "string", and set the widths to 20. Note: You may choose any names that fits your particular naming convention. You may also choose any widths, but the width of the data source for the lookup is 20 so you can be assured it will not produce wider data. d. Right click on the new fields and choose "Create Script Conversion" from the fly menu. The script tab will open and the following text will be inserted automatically: '----------------- OnRecord ----------------- 7. Create a DB Data Object Variable to represent the external look up file. This file contains the county names that we want to add. Note: If the variable pane is hidden, click the blue pencil icon to restore it. a. Right click on the left side of the scripting pane under the word "Variable" and from the fly menu choose "Data Properties" then "DB". 8. Use the new variable to find the county and load the new field.b. This will create a DB Variable called "DB". Click on the name and change it to "CA_ZIP" c. Double click on the new variable and the "Property Editor" dialog will appear. On the "Data Source" tab, click the open data source icon and open the file "ca_zip.dbf" provided as a sample file with your software. location: InstallDir\Samples\TextConverter\Automatic mode\2 With script It took about 10 minutes to create the ca_zip.dbf file. First, we found a web page with city, ZIP, and county for California. Next we downloaded the web page HTML file and extracted the data with TextConverter. Finally, we set the output data source to the file ca_zip.dbf and ran the extraction (see Step 7.1 Extract data from webpages for a detailed explanation on how to do this). a. Add a line to the script to find the record where the ZIP code in Field_15 (the Lender ZIP code) matches the ZIP code in the look up file. We will use ".scope.Set" and a SQL expression to find each match. Now go to the end of line two "Function OnRecord", hit return, and type CA_ZIP. (see the fly menu pop up when you add the dot), choose or type "scope.", (menu again) choose "Set", then the expression "zip like '" & Left(DictOut.Field_15.value, 4) & "%'". You can even save some typing by dragging in Field_15 from the output dictionary.
In Step 4 we combined the values from several input fields into one output field. Notice how "&" is used in the same way in this exercise to build the SQL expression for the scope set. This is the ease and power of Visual Basic (VB). b. Add another line and type the If Then statement that will load the county data. It goes like this If CA_ZIP.Select() Then DictOut.Lender_County.value = Leave this part and replace the placeholder "value" with CA_ZIP.GetCellValue("County") End If c. The whole thing looks like this: '----------------- OnRecord ----------------- Function OnRecord CA_ZIP.scope.Set "zip like '" & Left(DictOut.Field_15.value, 4) & "%'" If CA_ZIP.Select() Then DictOut.LenderCounty.value = CA_ZIP.GetCellValue("COUNTY") End If End Function e. Now copy and paste to create a second set of the four lines between "Function OnRecord" and "End Function", then edit these new lines using "Field_34" and "SiteCounty". The final version of the script should look just like the sample provided. Here it is: '----------------- OnRecord ----------------- Function OnRecord CA_ZIP.scope.Set "zip like '" & Left(DictOut.Field_15.value, 4) & "%'" If CA_ZIP.Select() Then DictOut.LenderCounty.value = CA_ZIP.GetCellValue("COUNTY") End If CA_ZIP.scope.Set "zip like '" & Left(DictOut.Field_34.value, 4) & "%'" If CA_ZIP.Select() Then DictOut.SiteCounty.value = CA_ZIP.GetCellValue("COUNTY") End If End Function d. Click the "Check Syntax" button in the lower right corner to check for errors in syntax. Click the preview tab to see the results (you will have to scroll all the way to the right to see the new county field). 9. Now you are ready to connect the output data source and run the extraction process. Save your project - TextConverter stores all of the current settings in a project file. To save your project, choose "Save Project" form the file menu or click the disc icon (
Learn more with Step 7.1 Extract data from a webpage (or go back to Step 6. Extract second level data) |