Code for an Aggregate Function

9. Writing Aggregate Functions : Code for an Aggregate Function

Share this page

The code generated into a query plan by Ingres for the evaluation of an aggregate function consists of three parts.

The first part is executed for each new set of GROUP BY column values. In the current implementation, Ingres builds a work field that contains either the “empty” value (as generated by the “getempty” method of the type definition) or if the function is max or min, the minimum or maximum value for the type as generated by the “minmaxdv” method. Each successive set of GROUP BY column values calls this code again to reset the work field to the initialization value.

The second part of code is executed for each row of a particular set of GROUP BY column values and invokes the function variable defined in the aggregate function instance definition. This function variable is passed the parameters defined for the function instance. The first parameter defines the result of the function execution and the second-through-nth-parameters describe the parameters of the aggregate function invocation syntax. Each parameter to the function variable is a pointer to a II_DATA_VALUE structure instance that describes and addresses the corresponding value. Since the function variable is called with each row to be aggregated, it is assumed that it will perform the aggregation into the result parameter. The result parameter is the same work field whose initialization is described in the preceding paragraph.

So, for example, an implementation of the function variable for the “max” aggregate might simply compare the current value of the aggregate parameter with the value in the work field, replacing it if the new value is “larger” (remember, that the work field will be initialized to the minimum value for the data type for each new group of rows). Likewise, an implementation of the function variable for the “sum” aggregate might add the current parameter value to the value in the work field, accumulating the sum in the work field.

The last part of code is executed after each group of rows (defined by a distinct set of GROUP BY column values) is processed. Ingres simply copies the current contents of the work field to the result location (based on the assumption that the aggregate is accumulated in the work field).

Example--Function to perform the “sum” operation on the ORD_PAIR type

** Name: usop_sum() - sum a set of ord_pair's (just sums each element).

** Description:

** Inputs:

** scb Pointer to a session control block.

** rdv Pointer to II_DATA_VALUE to hold

** resulting summed result.

** dv1 Pointer to II_DATA_VALUE of the first

** operand, which is a ORD_PAIR datatype.

** Outputs:

** rdv

** .db_data Pointer to resulting currency value.

** Returns:

** II_STATUS

## History:

## 19-oct-05 (inkdo01)

## Written as proof of concept for UDT aggregation.

II_STATUS

usop_sum(

II_SCB *scb,

II_DATA_VALUE *rdv,

II_DATA_VALUE *dv1)

{

ORD_PAIR *ival, *rval;

ival = (ORD_PAIR *)dv1->db_data;

rval = (ORD_PAIR *)rdv->db_data;

/* Simply accumulate the sums of the x & y values

in the result work field. */

rval->op_x += ival->op_x;

rval->op_y += ival->op_y;

return( II_OK );

}

Note: Ingres currently does not support the AVG operator for user-defined types because Ingres assumes a division operator is not generally available for user-defined types. (AVG is compiled as a SUM divided by a COUNT.) Users, however, can implement a SUM operator and explicitly code “sum(abc) / count(abc) as “avg(abc)” in a query. Also, users can code type-specific functions to perform AVG (for example, avg_op, to compute the average of a set of ordered pairs) using an algorithm appropriate to the type.