Use of Content Extraction in the Integration Platform
Content Extraction Language (CXL) is one product in the integration platform. This topic gives an overview of its place in the product family to explain what it is used for and when to use it.
Map Designer is an application for transforming structured, field- and record-oriented data from one format to another. Its users can convert data in databases, spreadsheets, flat files, ASCII, Binary/EBCDIC, SQL, ISAMs, ODBC, accounting systems, legacy COBOL, math/stats packages, text, and other file formats. Map Designer allows users to filter and edit the data to the exact output format required. Transformations can be set up to run interactively or as automated command-line jobs.
You can extract the desired data fields from unstructured text files and assemble those fields into a flat data record. Map Designer can then use that data as a source and transform it to a target format. The flattening of the source file is accomplished by special scripts that you define using the Content Extraction Language (CXL). These CXL scripts can be created in two ways: by using Content Extractor Editor or by writing them manually in a text file.
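To make the idea of flattening concrete, the following is a minimal sketch in Python, not CXL. It does not reflect CXL syntax or how the product works internally. The report layout, the field labels, and the names REPORT, FIELD_PATTERNS, and flatten are all hypothetical and chosen only for illustration. It shows labeled fields being picked out of free-form text and assembled into one flat record per block.

```python
import re

# Hypothetical report text: the labels and layout are invented for illustration.
REPORT = """\
Invoice: 1001
Customer: Acme Corp
Total: 250.00

Invoice: 1002
Customer: Globex
Total: 99.50
"""

# One pattern per field to extract from each block of text (assumed labels).
FIELD_PATTERNS = {
    "invoice": re.compile(r"^Invoice:\s*(\S+)", re.MULTILINE),
    "customer": re.compile(r"^Customer:\s*(.+)$", re.MULTILINE),
    "total": re.compile(r"^Total:\s*([\d.]+)", re.MULTILINE),
}


def flatten(text):
    """Return one flat record (a dict) per blank-line-separated block."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for name, pattern in FIELD_PATTERNS.items():
            match = pattern.search(block)
            # A field missing from the block becomes an empty string.
            record[name] = match.group(1).strip() if match else ""
        records.append(record)
    return records


records = flatten(REPORT)
for record in records:
    print(record)
```

Each resulting record has the same fixed set of fields, which is the kind of flat, record-oriented shape that a transformation tool can treat as a source.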
Content Extractor Editor is a visual tool for graphically selecting the desired fields from a text file. After the selections have been made, Content Extractor Editor converts them to parameters in a CXL script file that is used to flatten the data structure. In this case, the user is not required to do any script programming or even to know anything about the CXL scripting language. Content Extractor Editor can be used in the majority of cases where field-oriented text data needs to be flattened for a Map Designer transformation. If the target is one of several basic output formats, however, Content Extractor Editor can write the data directly to those formats.
The second method for defining scripts is to author them directly as an ASCII text file. To do this, you can use the CXL SDK, which includes this documentation, script examples, and a set of scripts for basic purposes. CXL scripting can be used for any text data extraction job, but it is most useful for creating or customizing complex scripts for text files whose patterns and rules are too difficult to capture in Content Extractor Editor. In addition, scripts authored directly using the CXL SDK can run faster when the transformation is executed than scripts authored using the Content Extractor Editor interface. CXL provides many advanced data manipulation and formatting capabilities that allow scripts to massage the data while it is being flattened. CXL cannot write output directly; it requires Map Designer or Integration Engine to execute the transformation.
The remainder of this guide covers the concepts, elements, and syntax of the Content Extraction Language. In addition, multiple examples are provided for language elements, as well as entire CXL scripts.