Getting Started with the API
Let’s assume you are using an IDE to create your first DataFlow application. Follow these steps to get started.
1. Create a new development project in your IDE.
2. Add the DataFlow .jar files to the build path of your project.
3. Associate the DataFlow JavaDoc with the DataFlow .jar files.
4. Create a new class that will contain your DataFlow application (include a main method).
5. Use DataFlow API calls to compose and execute an application (details to follow).
When you create your development project, ensure it is a Java project so that your IDE can build the application correctly. Then add the DataFlow .jar files found in the lib/ directory of the DataFlow SDK installation to your project’s build path, following the instructions for your IDE.
Note: The lib directory of the DataFlow installation contains both DataFlow-specific .jar files and the .jar files that DataFlow depends upon. Depending on which parts of DataFlow you use, you may not need all of them. To begin with, add every .jar file found in the lib directory to your build path. You can weed out unneeded .jar files later.
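If you build from the command line rather than through an IDE, the same "add every .jar file in lib" step can be sketched in plain Java. The following is only an illustration, not part of DataFlow: the class name LibClasspath and the method buildClasspath are hypothetical, and the default directory name "lib" is an assumption about where you run it from. It uses only standard JDK classes to collect the .jar files in a directory into a classpath string.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the DataFlow SDK.
public class LibClasspath {

    // Collects every *.jar file directly inside libDir and joins the
    // absolute paths with the platform's classpath separator.
    static String buildClasspath(Path libDir) throws IOException {
        List<String> jars = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toAbsolutePath().toString());
            }
        }
        // Sort so the result does not depend on directory iteration order.
        Collections.sort(jars);
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the lib directory is passed as the first argument,
        // or is named "lib" relative to the working directory.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(libDir)) {
            System.err.println("Not a directory: " + libDir);
            return;
        }
        System.out.println(buildClasspath(libDir));
    }
}
```

The printed string can be passed to the -cp option of javac or java. Files other than .jar files in the directory are ignored.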
The DataFlow API
JavaDocs will be extremely helpful when starting to use the API. Associate the JavaDocs found in the docs/apidocs directory of the DataFlow installation with the DataFlow .jar files. Follow the instructions for your IDE on how to make this association. After this association is made, your IDE will be able to display JavaDocs for the DataFlow API elements.
The DataFlow API can be used in various ways within your code. The easiest way to get started is to create a Java class with a main() method. Remember that main() methods in Java must be static and take a String[] argument. You can then add invocations of the Java API within the main() method to compose and execute your DataFlow application.
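As a starting point, the class described above might look like the following minimal sketch. It contains no DataFlow calls yet, since those are covered in the sections that follow. The class name FirstDataFlowApp and the describe helper are hypothetical and exist only so the skeleton does something observable when run.

```java
// Hypothetical skeleton for a first application; the names are illustrative.
public class FirstDataFlowApp {

    // Illustrative helper: builds a startup message from the arguments.
    static String describe(String[] args) {
        return "FirstDataFlowApp started with " + args.length + " argument(s)";
    }

    // main() must be public and static and take a String[] argument.
    public static void main(String[] args) {
        System.out.println(describe(args));

        // DataFlow API calls that compose and execute the application
        // would be added here.
    }
}
```

Running the class from your IDE confirms that the project and its build path are set up correctly before you add any DataFlow code.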
Later on, as you become familiar with the API, you may want to build more complex applications. However, a simple main() method is a good way to get started as you learn the API.
The DataFlow API can also be used to extend DataFlow functionality by creating custom operators, functions, and aggregators. Creating custom functionality for DataFlow is covered extensively in a later section.
The next few sections of the document focus on application composition and execution, and provide details on using the API to compose and execute DataFlow applications.