- The Aggregator transformation allows us to perform aggregate calculations, such as averages and sums.
- Aggregator transformation allows us to perform calculations on groups.
- Aggregate expression
- Group by port
- Sorted Input
- Aggregate cache
- Entered in an output port.
- Can include non-aggregate expressions and conditional clauses.
- AVG, COUNT, MAX, MIN, SUM
- FIRST, LAST
- MEDIAN, PERCENTILE, STDDEV, VARIANCE
Nested Aggregate Function: MAX( COUNT( ITEM ))
Nested Aggregate Functions
- In Aggregator transformation, there can be multiple single level functions or multiple nested functions.
- An Aggregator transformation cannot have both types of functions together.
- MAX( COUNT( ITEM )) is correct.
- MIN(MAX( COUNT( ITEM ))) is not correct. It can also include one aggregate function nested within another aggregate function
We can use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
- SUM( COMMISSION, COMMISSION > QUOTA )
We can also use non-aggregate functions in the aggregate expression.
- IIF( MAX( QUANTITY ) > 0, MAX( QUANTITY ), 0))
- Indicates how to create groups.
- When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified.
For example, we can find Maximum Salary for every Department.
- In Aggregator Transformation, Open Ports tab and select Group By as needed.
- Use to improve session performance.
- To use sorted input, we must pass data to the Aggregator transformation sorted by group by port, in ascending or descending order.
- When we use this option, we tell Aggregator that data coming to it is already sorted.
- We check the Sorted Input Option in Properties Tab of the transformation.
- If the option is checked but we are not passing sorted data to the transformation, then the session fails.
- The Power Center Server stores data in the aggregate cache until it completes Aggregate calculations.
- It stores group values in an index cache and row data in the data cache. If the Power Center Server requires more space, it stores overflow values in cache files.
No comments:
Post a Comment