Large File Splitting
When datasets are too large to store in message objects, you can use a process design workflow to split the source file into smaller datasets. Those smaller datasets can be managed within message objects, while the original file, and possibly the output of the integration, is read from or written to disk only once for persistence. Copy large files to a working directory on the same file system as the Runtime Engine to reduce network latency.
Specialized iterators are available for EDI X12 format files, but the File Transfer Iterator is a good general-purpose tool for splitting files. A corresponding File Transfer Aggregator can be used downstream if you need to combine the pieces back into a single file.
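The split-and-recombine idea can be illustrated outside the product with a minimal Python sketch. This is not the File Transfer Iterator or Aggregator API; the function names `split_records` and `process_in_chunks`, the line-per-record assumption, and the uppercase placeholder transform are all hypothetical and only show the general pattern of reading the source once, handling one small chunk at a time in memory, and writing the output once.

```python
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def split_records(path: Path, chunk_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most chunk_size lines from path.

    Assumes one record per line (hypothetical; real formats may differ).
    """
    with path.open("r", encoding="utf-8", newline="") as source:
        while True:
            # islice pulls only the next chunk_size lines into memory.
            chunk = list(islice(source, chunk_size))
            if not chunk:
                return
            yield chunk


def process_in_chunks(source: Path, target: Path, chunk_size: int = 1000) -> int:
    """Transform source chunk by chunk and return the number of chunks handled.

    The source is read once and the target is written once; only one
    chunk is held in memory at any time.
    """
    chunks = 0
    with target.open("w", encoding="utf-8", newline="") as out:
        for chunk in split_records(source, chunk_size):
            # Placeholder transform standing in for the real mapping step.
            out.writelines(line.upper() for line in chunk)
            chunks += 1
    return chunks
```

Appending each processed chunk to one open output file plays the role of the aggregation step here: the pieces end up back in a single file without the full dataset ever being loaded at once.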
Wide Files
Files with a large number of fields tend to cause processing performance issues that are often unnecessary. Legacy systems that use interchange file formats such as COBOL, Binary, or Fixed Width ASCII tended to include every field each time data was exported. If you are not mapping certain source fields to the target when using these connector types, do not parse them individually for inclusion in the source schema. Instead, combine unused, contiguous fields into a single field.
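As a rough illustration of combining unused, contiguous fields, the following Python sketch collapses a fixed-width layout so that each run of unmapped fields becomes one filler field. The `Field` type, the `collapse_unused` and `parse_record` helpers, the `FILLER_n` naming, and the sample layout are assumptions made for this example; they are not part of any connector's schema format.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Field:
    name: str
    width: int  # width in characters within the fixed-width record


def collapse_unused(fields: List[Field], mapped: Set[str]) -> List[Field]:
    """Merge each run of contiguous unmapped fields into one filler field."""
    collapsed: List[Field] = []
    filler_width = 0
    filler_count = 0
    for field in fields:
        if field.name in mapped:
            if filler_width:
                # Close the pending run of unused fields before the mapped one.
                filler_count += 1
                collapsed.append(Field(f"FILLER_{filler_count}", filler_width))
                filler_width = 0
            collapsed.append(field)
        else:
            filler_width += field.width
    if filler_width:
        filler_count += 1
        collapsed.append(Field(f"FILLER_{filler_count}", filler_width))
    return collapsed


def parse_record(record: str, fields: List[Field]) -> Dict[str, str]:
    """Slice one fixed-width record according to the given layout."""
    values: Dict[str, str] = {}
    offset = 0
    for field in fields:
        values[field.name] = record[offset:offset + field.width]
        offset += field.width
    return values
```

With a layout of `ID` (5), `A` (3), `B` (4), `NAME` (10), and `C` (2) where only `ID` and `NAME` are mapped, the collapsed layout has four fields instead of five, and the total record width is unchanged. On a file with hundreds of unmapped fields, the reduction in per-record parsing work is correspondingly larger.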