Tuesday 21 August 2012

Transformation Tips

Tips for Aggregator Transformations
Use sorted input to decrease the use of aggregate caches.
Sorted input reduces the amount of data cached during the session and improves session performance. Use this option with the Sorter transformation to pass sorted data to the Aggregator transformation.
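The effect of sorted input on the cache can be sketched in plain Python. This is only an illustration of the idea, not how the Integration Service is implemented; the function names and sample rows are hypothetical. With unsorted input every group must stay cached until the last row arrives, while with sorted input only the current group is held and each total can be emitted as soon as the key changes.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_unsorted(rows):
    """Sum amounts per key; every group stays in the cache until the input ends."""
    cache = {}
    for key, amount in rows:
        cache[key] = cache.get(key, 0) + amount
    return list(cache.items())

def aggregate_sorted(rows):
    """Sum amounts per key for input that is already sorted by key.

    Only the current group is held in memory; a total is emitted when the key changes.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)

# Hypothetical sample data: (region, amount)
rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]
sorted_rows = sorted(rows, key=itemgetter(0))  # stands in for the Sorter transformation
totals = list(aggregate_sorted(sorted_rows))
```

Both functions return the same totals; the difference is how much data has to be kept around while the rows are being processed.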
Limit connected input/output or output ports.
Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache.
Filter the data before aggregating it.
If you use a Filter transformation in the mapping, place it before the Aggregator transformation to reduce unnecessary aggregation.
Tips for Filter Transformations
Use the Filter transformation early in the mapping.
To maximize session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.
Use the Source Qualifier transformation to filter.
The Source Qualifier transformation provides an alternative way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows as they are read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Because a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.
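As a rough analogy, the difference looks like this in Python with an in-memory SQLite database. The table, columns, and values are hypothetical; in PowerCenter the WHERE condition would be set as the source filter on the Source Qualifier rather than written by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "OPEN"), (2, "CANCELLED"), (3, "OPEN")],
)

# Filter-transformation style: read every row, then discard the unwanted ones.
all_rows = conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
filtered_in_mapping = [row for row in all_rows if row[1] == "OPEN"]

# Source-qualifier style: the WHERE clause limits what is read from the source.
filtered_at_source = conn.execute(
    "SELECT id, status FROM orders WHERE status = ? ORDER BY id", ("OPEN",)
).fetchall()
```

The two results are identical, but in the first case all three rows were read and carried into the mapping before one was dropped.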
Tips for Joiner Transformations
Perform joins in a database when possible.
Performing a join in a database is usually faster than performing a join in the session. In some cases this is not possible, for example when joining tables from two different databases or flat file systems. If you want to perform a join in a database, use the following options:
  • Create a pre-session stored procedure to join the tables in a database.
  • Use the Source Qualifier transformation to perform the join.
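A join pushed down to the database amounts to a single SQL statement whose result arrives already joined. The sketch below uses an in-memory SQLite database with hypothetical tables; in PowerCenter the equivalent would be a user-defined join or SQL override in the Source Qualifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, cust_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (100, 1), (101, 1), (102, 2);
""")

# The database performs the join; the session receives rows that are already joined.
joined = conn.execute("""
    SELECT o.order_id, c.name
    FROM orders AS o
    JOIN customers AS c ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""").fetchall()
```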
Join sorted data when possible.
You can improve session performance by configuring the Joiner transformation to use sorted input. When you configure the Joiner transformation to use sorted data, the Integration Service improves performance by minimizing disk input and output. You see the greatest performance improvement when you work with large datasets.
For an unsorted Joiner transformation, designate the source with fewer rows as the master source.
For optimal performance and disk storage, designate the source with fewer rows as the master source. During a session, the Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
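Why the smaller source makes the better master can be shown with a simplified model in which the master rows are cached and the detail rows stream past the cache. This is a sketch under that assumption, not the Integration Service's actual algorithm, and the function and sample data are hypothetical.

```python
from collections import defaultdict

def cached_join(master, detail):
    """Join two lists of (key, value) rows, caching the master side.

    Returns the joined rows and the number of master rows held in the cache.
    """
    cache = defaultdict(list)
    for key, value in master:  # the whole master source is cached
        cache[key].append(value)
    joined = [
        (key, m_value, d_value)
        for key, d_value in detail  # detail rows stream past the cache
        for m_value in cache.get(key, [])
    ]
    cached_rows = sum(len(values) for values in cache.values())
    return joined, cached_rows

departments = [(10, "Sales"), (20, "Support")]
employees = [(10, "Ann"), (20, "Bob"), (10, "Cy"), (20, "Di"), (10, "Ed")]

rows_small_master, cached_small = cached_join(departments, employees)
rows_large_master, cached_large = cached_join(employees, departments)
```

Either choice of master yields five joined rows, but the cache holds two rows when the departments are the master and five when the employees are.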
For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.
For optimal performance and disk storage, designate the source with fewer duplicate key values as the master source. When the Integration Service processes a sorted Joiner transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the Integration Service must cache more rows, which can slow performance.
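The effect of duplicate keys in the master can be illustrated with a simplified merge join that buffers the master rows for one key at a time (rather than one hundred keys, to keep the sketch short). The function and sample data are hypothetical, and both inputs are assumed to be sorted by key already.

```python
from itertools import groupby
from operator import itemgetter

def sorted_merge_join(master, detail):
    """Merge-join two inputs sorted by key, buffering master rows one key at a time.

    Returns the joined rows and the largest number of master rows buffered at once.
    """
    key_of = itemgetter(0)
    detail_groups = groupby(detail, key=key_of)
    d_key, d_rows = next(detail_groups, (None, None))
    joined, peak_buffer = [], 0
    for m_key, m_rows in groupby(master, key=key_of):
        buffered = list(m_rows)  # all master rows that share this key
        peak_buffer = max(peak_buffer, len(buffered))
        # Skip detail groups whose key is smaller than the current master key.
        while d_rows is not None and d_key < m_key:
            d_key, d_rows = next(detail_groups, (None, None))
        if d_rows is not None and d_key == m_key:
            for _, d_value in d_rows:
                for _, m_value in buffered:
                    joined.append((m_key, m_value, d_value))
    return joined, peak_buffer

customers = [(1, "Ada"), (2, "Grace")]                      # no duplicate keys
orders = [(1, "o1"), (1, "o2"), (1, "o3"), (2, "o4")]       # key 1 repeats

rows_a, peak_few_duplicates = sorted_merge_join(customers, orders)
rows_b, peak_many_duplicates = sorted_merge_join(orders, customers)
```

With the customers as master, at most one row is buffered per key; with the orders as master, three rows must be buffered for key 1.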
Tips for Lookup Transformations
Add an index to the columns used in a lookup condition.
If you have privileges to modify the database containing a lookup table, you can improve performance for both cached and uncached lookups by indexing the columns used in the lookup condition. This is especially important for very large lookup tables. Since the Integration Service needs to query, sort, and compare values in these columns, the index needs to include every column used in a lookup condition.
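The sketch below uses SQLite as a stand-in for the lookup database to show an index that includes every column in the lookup condition. The table, column, and index names are hypothetical, and the exact wording of the query plan varies between database engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust_lkp (region TEXT, cust_id INTEGER, name TEXT)")

# Hypothetical lookup condition on two columns.
lookup_sql = "SELECT name FROM cust_lkp WHERE region = ? AND cust_id = ?"

def query_plan(sql):
    """Return SQLite's plan description (the fourth column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("EU", 1)).fetchall()
    return " | ".join(row[3] for row in rows)

plan_before = query_plan(lookup_sql)
# The index includes every column used in the lookup condition.
conn.execute("CREATE INDEX idx_cust_lkp ON cust_lkp (region, cust_id)")
plan_after = query_plan(lookup_sql)
```

Before the index exists the plan describes a scan of the table; afterwards it names idx_cust_lkp, which means the lookup query can search the index instead of reading every row.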
Place conditions with an equality operator (=) first.
If you include more than one lookup condition, place the conditions in the following order to optimize lookup performance:
  • Equal to (=)
  • Less than (<), greater than (>), less than or equal to (<=), greater than or equal to (>=)
  • Not equal to (!=)
Cache small lookup tables.
Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.
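The following sketch contrasts the two approaches, again with SQLite standing in for the lookup database and with hypothetical table and column names. The uncached version issues one query per source row; the cached version reads the lookup table once and resolves every row from memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country_lkp (code TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO country_lkp VALUES (?, ?)",
    [("DE", "Germany"), ("FR", "France")],
)
source_codes = ["DE", "FR", "DE", "XX"]  # "XX" has no match in the lookup table

def lookup_uncached(code):
    """Run one query against the lookup table for each source row."""
    row = conn.execute(
        "SELECT name FROM country_lkp WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None

# Cached: read the lookup table once, then resolve every row from memory.
cache = dict(conn.execute("SELECT code, name FROM country_lkp"))

uncached_result = [lookup_uncached(code) for code in source_codes]
cached_result = [cache.get(code) for code in source_codes]
```

Both lists contain the same values, which matches the point that caching changes the cost of the lookup but not its result.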
Join tables in the database.
If the lookup table is on the same database as the source table in the mapping and caching is not feasible, join the tables in the source database rather than using a Lookup transformation.
Use a persistent lookup cache for static lookups.
If the lookup source does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup source.
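The idea of a persistent cache can be sketched as a cache file that is built on the first run and reused afterwards. The file name, the pickle format, and the helper functions are assumptions made for illustration only; PowerCenter manages its own cache files.

```python
import os
import pickle
import tempfile

def load_lookup_cache(cache_path, read_lookup_source):
    """Reuse a saved cache file if one exists; otherwise build it and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    cache = read_lookup_source()
    with open(cache_path, "wb") as f:
        pickle.dump(cache, f)
    return cache

source_reads = []  # records how often the lookup source was actually read

def read_lookup_source():
    """Hypothetical stand-in for an expensive read of the lookup source."""
    source_reads.append(1)
    return {"DE": "Germany", "FR": "France"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "country_lkp.cache")
    first_session = load_lookup_cache(path, read_lookup_source)
    second_session = load_lookup_cache(path, read_lookup_source)
```

The second call finds the saved file and never touches the lookup source. This only gives correct results while the lookup source stays unchanged, which is why the tip is limited to static lookups.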
Call unconnected Lookup transformations with the :LKP reference qualifier.
When you write an expression using the :LKP reference qualifier, you call unconnected Lookup transformations only. If you try to call a connected Lookup transformation, the Designer displays an error and marks the mapping invalid.
Configure a pipeline Lookup transformation to improve performance when processing a relational or flat file lookup source.
You can create partitions to process a relational or flat file lookup source when you define the lookup source as a source qualifier. Configure a non-reusable pipeline Lookup transformation and create partitions in the partial pipeline that processes the lookup source.
Tips for Lookup Caches
Cache small lookup tables.
Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.
Use a persistent lookup cache for static lookup tables.
If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.
Tips for Stored Procedure Transformations
Do not run unnecessary instances of stored procedures.
Each time a stored procedure runs during a mapping, the session must wait for the stored procedure to complete in the database. You have two possible options to avoid this:
  • Reduce the row count. Use an active transformation before the Stored Procedure transformation to reduce the number of rows that must be passed to the stored procedure. Alternatively, create an expression that tests each value before the call, so that only values that really need processing are passed to the stored procedure.
  • Create an expression. Much of the logic used in stored procedures can often be replicated using expressions in the Designer.
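Testing values before the call can be sketched as follows. The function normalize_phone is a hypothetical stand-in for a stored procedure, and the test in needs_procedure is an assumed example of a cheap check that an expression could perform.

```python
calls = []  # records every value that was sent to the "stored procedure"

def normalize_phone(raw):
    """Hypothetical stand-in for a stored procedure; each call is a database round trip."""
    calls.append(raw)
    return "".join(ch for ch in raw if ch.isdigit())

def needs_procedure(raw):
    """Cheap test run before the call: only values containing non-digits need work."""
    return not raw.isdigit()

phones = ["5551234", "555-9876", "5550000", "(555) 1111"]
cleaned = [normalize_phone(p) if needs_procedure(p) else p for p in phones]
```

Only two of the four rows reach the procedure, and the output is the same as if every row had been passed through it.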
