Unknown Application or File Format Connections
There are several options available in the integration platform for connecting to a file when the original application or file format is unknown.
Unknown File Format
What if you know the application type but not the file format, and the company that wrote the application is no longer in business or does not provide the file format? Three options for this situation follow:
Intermediate File Format
Examine the customer application and its documentation. Determine whether there is an Export or Save As option. If so, use that option to move data to an intermediate file format that the integration platform can read and transform.
Report File
Examine the customer application and its documentation. Determine whether or not the customer can generate a report file that contains the data they want to transform. If so, generate the report and print the report to a file. This creates what is often called a print image or spool file. After this, use Content Extractor Editor to extract the data from the report and transform it to a more usable format.
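As a rough illustration of what extracting data from a print image involves, the following Python sketch pulls detail lines out of a paged report and skips the headers and rule lines. It is not Content Extractor Editor and does not use any platform API; the report layout, the `DETAIL` pattern, and the `extract_rows` function are all hypothetical.

```python
import re

# Hypothetical print-image (spool) text; real reports will have their own layout
REPORT = """\
ACME CUSTOMER BALANCE REPORT                 PAGE 1
CUST-ID  NAME                  BALANCE
-------  --------------------  ----------
000123   Smith, Jane              1,250.00
000456   Lopez, Carlos               75.10

ACME CUSTOMER BALANCE REPORT                 PAGE 2
CUST-ID  NAME                  BALANCE
-------  --------------------  ----------
000789   Chen, Wei               10,000.00
"""

# A detail line starts with a 6-digit ID; headers, rules, and blank lines do not match
DETAIL = re.compile(r"^(\d{6})\s+(.+?)\s{2,}([\d,]+\.\d{2})\s*$")


def extract_rows(report_text):
    """Return (id, name, balance) tuples for every detail line in the report."""
    rows = []
    for line in report_text.splitlines():
        match = DETAIL.match(line)
        if match:
            cust_id, name, balance = match.groups()
            rows.append((cust_id, name.strip(), float(balance.replace(",", ""))))
    return rows


rows = extract_rows(REPORT)
```

The repeated page headers are dropped simply because they do not match the detail-line pattern, which is the same idea a report extraction tool applies with its own rules.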
Neither of the Above
Use the procedure outlined in the section "Unknown Application." All the same rules apply.
Binary Data Types
Unknown file formats are often labeled binary, and you may be tempted to label anything that is not text as binary data. But if the data is binary, it is useful to understand what kind of binary data it is. The integration platform supports some basic types:
Numerics
Graphics and BLOBs
Proprietary Application Data
Compressed Data
Numerics
Numeric data is the classic case of binary data. It is usually embedded as fields within a schema (often interspersed with nonbinary text fields). Three common types are listed below:
Packed Decimal: Useful for real numbers, commonly used in COBOL (called COMP-3). Field sizes start at 1 byte.
Binary Integers: Useful for integer numbers, often used in the C language (also called shorts and longs, or COMP in COBOL). Lengths of 1, 2, and 4 bytes are common, and 8 bytes (64-bit) is sometimes used for huge numbers.
Floating Points: Useful for integer and real numbers, especially if high precision is required (often called floats). Sizes are 4 or 8 bytes.
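To make the three numeric types concrete, here is a minimal Python sketch of how each is stored in bytes. It runs outside the integration platform and is not how the platform itself decodes fields; the `unpack_comp3` helper, the sample byte values, and the big-endian byte order are assumptions chosen for illustration.

```python
import struct
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)        # high nibble
        nibbles.append(byte & 0x0F)      # low nibble
    sign_nibble = nibbles.pop()          # the final nibble holds the sign
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble in packed-decimal field")
    value = int("".join(str(n) for n in nibbles)) if nibbles else 0
    if sign_nibble in (0x0B, 0x0D):      # 0xD (and 0xB) mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)  # apply the implied decimal places


# 0x12 0x34 0x5C holds digits 12345 with sign C (positive), two implied decimals
amount = unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2)

# Big-endian 2-byte and 4-byte signed integers
short_val, long_val = struct.unpack(">hi", bytes([0xFF, 0xFE, 0x00, 0x00, 0x01, 0x00]))

# A 4-byte float and an 8-byte float, packed and then unpacked again
float_val, double_val = struct.unpack(">fd", struct.pack(">fd", 1.5, 2.25))
```

Real data may use little-endian byte order or a different sign convention, so the layout has to be confirmed against the source system before decoding.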
Graphics and BLOBs
Graphics and BLOBs (binary large objects) cover many file types, such as JPG, TIFF, PNG, BMP, fax, and other images. These types are typically found as stand-alone files, but can also be embedded inside other files.
Often such binary data needs to be transmitted through nonbinary systems (for example, email, ASCII, XML). A common technique for handling such data is to first encode the binary stream into a larger text-equivalent stream and then decode it to the original binary values at the other end. This method is similar to the one used by MIME (Multipurpose Internet Mail Extensions) for email and by Simple Object Access Protocol (SOAP).
B64 Encode/Decode Technique
You can do this with Base64 encoding, which the integration platform supports with the B64 Encode and B64 Decode functions. B64 Encode encodes any binary stream, even if you do not know what the binary format is, since it is treated as a stream of bytes. Then you can use B64 Decode to convert in the other direction.
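The round trip can be illustrated outside the platform with Python's standard `base64` module. This is only a sketch of the general Base64 technique, not the B64 Encode and B64 Decode functions themselves, and the sample bytes stand in for a real image or BLOB.

```python
import base64

# Arbitrary bytes standing in for an image or other BLOB (not a real file)
blob = bytes(range(256))

encoded = base64.b64encode(blob)      # ASCII-safe bytes, about 4/3 the original size
text = encoded.decode("ascii")        # can now travel through XML, email, and so on

restored = base64.b64decode(text)     # back to the original byte stream
```

The encoder never needs to know what the bytes mean, which is why the technique works for unknown binary formats.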
The integration platform does not support transformation of such binary graphical formats from one to another, but many commercial, free, open, and shareware tools can transform these formats. From map logic and EZscript in the integration platform, you can then call external code modules packaged as Java, ActiveX, or DLLs.
Proprietary Application Data
Programmers often create a proprietary file format for storing their application data rather than using a standard DBMS (for example, Oracle) or a standard file format (for example, XML). If the proprietary file format is all text, then the integration platform provides many methods to read and write the data. However, if the file includes binary data, sometimes there is little that the integration platform can do directly with the data.
On the Target Side
You can write the file format directly with the Binary connector if it is a pure fixed-length sequential file (uncommon).
You can create any kind of intermediate flat file (such as COBOL, ASCII, XML, Excel) for the target application to import.
On the Source Side
Try the Binary connector on the live data. This works if the file contains fixed-length records that need to be extracted. This is possible regardless of the underlying programming language used (such as C, COBOL, RPG, Pascal, Basic), since the integration platform supports virtually all underlying data types used by any language on any platform (for example, AS400, VAX, Mainframe).
If the source application is still working, then try to export the data to an intermediate file format that the integration platform can read.
If the source application is still working, then export the data by printing to a file and using Content Extractor Editor to extract the data.
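For a sense of what reading a pure fixed-length sequential file means, the Python sketch below slices a byte stream into equal-sized records and decodes each one. It does not use the Binary connector; the record layout (a 10-byte name, a 4-byte integer, an 8-byte float) and the names `RECORD` and `read_records` are hypothetical.

```python
import struct

# Hypothetical layout: 10-byte ASCII name, 4-byte big-endian integer, 8-byte float
RECORD = struct.Struct(">10sid")


def read_records(data: bytes):
    """Yield one decoded tuple per fixed-length record in the byte stream."""
    if len(data) % RECORD.size:
        raise ValueError("data length is not a multiple of the record size")
    for name, quantity, price in RECORD.iter_unpack(data):
        yield name.decode("ascii").rstrip(), quantity, price


# Build two sample records in place of a real file
sample = RECORD.pack(b"WIDGET    ", 12, 9.5) + RECORD.pack(b"GASKET    ", 300, 0.25)
records = list(read_records(sample))
```

If the total length is not a multiple of the record size, the file probably has a header, a trailer, or variable-length records, and this approach does not apply as written.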
Compressed Data
Often in binary data, larger (typically text) streams are compressed to save space; the result is intended to be a smaller binary stream. While compression is often done at the file level (for example, ZIP), it can also be done at the field level to save space (although complex issues with fixed-length fields arise).
Although compressed data is not technically "private" (it is only compacted to save space), the integration platform cannot expand compressed binary data unless the compression algorithm is known. While there are some standards in the compression area, the integration platform does not supply any standard compression and decompression functions in EZscript. However, EZscript enables the use of external compression and decompression libraries, even though the integration platform does not work with compressed data directly.
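As an example of the kind of external library that could fill this gap, the Python sketch below compresses and expands a text stream with the standard `zlib` module. It assumes the data was compressed with zlib (DEFLATE); it says nothing about which algorithm a given proprietary file actually uses, and the sample text is invented.

```python
import zlib

# Repetitive text compresses well; this sample stands in for a real field or file
original = b"ACCOUNT NOTES: " + b"no change since last review. " * 20

compressed = zlib.compress(original)     # smaller binary stream, unreadable as text
restored = zlib.decompress(compressed)   # succeeds only because the algorithm is known
```

Passing data compressed with a different algorithm to `zlib.decompress` raises an error, which reflects the point above: expansion is only possible when the algorithm is known.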