DataStage performance tuning can be done at three levels:
- Job level
- Sequence level
- SQL level
- Use a configuration file suited to the job (number of nodes, hardware configuration, data volume).
- Pick a suitable partitioning method for each stage and avoid repartitioning data that is already partitioned correctly (use Same partitioning to keep the upstream partitioning).
- Sort the data before stages that rely on sorted input, such as Aggregator, Join, and Merge.
- Remove unwanted columns and filter rows as early as possible, ideally at the source.
- Use the SQL in the database stage to sort, filter, and join tables where possible.
- Choose between the Join, Merge, and Lookup stages based on data volume.
- Minimize the use of the Transformer stage.
- Use buffer parameters if required: APT_BUFFER_MAXIMUM_MEMORY (default 3 MB; can be increased, for example up to 30 MB), APT_BUFFER_DISK_WRITE_INCREMENT, and APT_BUFFER_FREE_RUN.
- Don’t use Runtime Column Propagation (RCP) unless it is required.
- If there is no dependency between jobs, run them in parallel, i.e., place their Job Activity stages in the sequence side by side without trigger conditions between them.
- Use the Terminator Activity and Exception Handler stages to terminate the sequence cleanly.
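
The partitioning tip can be illustrated outside DataStage. The sketch below is plain Python, not DataStage code, and every name in it (`hash_partition`, `round_robin_partition`, the sample `cust`/`amt` rows) is made up for illustration. It shows why key-based partitioning matters for stages such as Aggregator: when all rows with the same key sit in one partition, each node can aggregate on its own; when they are spread round-robin, the same key shows up in several partitions and the data would have to be repartitioned first.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key, num_nodes):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 keeps the assignment stable from run to run
        idx = zlib.crc32(str(row[key]).encode()) % num_nodes
        partitions[idx].append(row)
    return partitions

def round_robin_partition(rows, num_nodes):
    """Deal rows out evenly, ignoring their key values."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions

def aggregate_per_partition(partitions, key, value):
    """Sum `value` by `key` independently inside each partition."""
    results = []
    for part in partitions:
        totals = defaultdict(int)
        for row in part:
            totals[row[key]] += row[value]
        results.extend(totals.items())
    return results

# Hypothetical sample data
rows = [
    {"cust": "A", "amt": 10},
    {"cust": "B", "amt": 5},
    {"cust": "A", "amt": 7},
    {"cust": "C", "amt": 3},
    {"cust": "B", "amt": 1},
    {"cust": "A", "amt": 2},
]

hashed = aggregate_per_partition(hash_partition(rows, "cust", 2), "cust", "amt")
spread = aggregate_per_partition(round_robin_partition(rows, 2), "cust", "amt")

print(sorted(hashed))  # one total per customer
print(sorted(spread))  # partial totals: the same customer appears more than once
```

With hash partitioning each customer yields exactly one total. With round-robin the per-partition results are only partial sums, which is the situation that forces a repartition before a keyed stage.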
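
The sequence-level tip about independent jobs can be sketched the same way. This is a plain Python analogy, not a DataStage sequence: `run_job` is a hypothetical stand-in that only sleeps, and the job names are invented. Two jobs with no dependency on each other start together, and a third job that needs both runs only after they finish.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_job(name, seconds, log):
    """Hypothetical stand-in for launching a job; it only sleeps."""
    time.sleep(seconds)
    log.append(name)
    return name

def run_sequence():
    log = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two load jobs do not depend on each other, so start both at once
        futures = [
            pool.submit(run_job, "load_customers", 0.2, log),
            pool.submit(run_job, "load_orders", 0.2, log),
        ]
        for f in futures:
            f.result()  # wait for completion and re-raise any failure
    # build_report depends on both loads, so it runs only after they finish
    run_job("build_report", 0.1, log)
    return log

log = run_sequence()
print(log)
```

The two loads overlap in time, so the whole run takes roughly as long as the slower load plus the report, rather than the sum of all three.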