Aggregate Functions Using the Where and By Clauses
Aggregate functions typically evaluate a column and return a single value (for example, avg(e.age) returns the average of all values in the "age" column of table "e"). This section describes how you can use the where and by clauses to modify the results returned by aggregate functions.
The where clause enables you to qualify (filter) the set of values used to determine the result of the aggregate function. For example,
sum(j.salary where j.salary > 1500
returns the sum of all salaries from table j that exceed 1500.
The by clause causes the function to return a set of results, as opposed to a single result. One result is returned for each grouping specified by the by clause. Think of by as meaning "for each." For example,
avg(e.age by e.dept)
returns an average age for each department in table e.
You can combine the by and where clauses:
avg(e.age by e.dept where e.job=1023)
returns the average age, by department, for employees who have a job code of 1023.
You can use the only where format to skip zero results. For example,
count(emp.salary by emp.dept where emp.salary > 10000)
returns a value for every department, but
count(emp.salary by emp.dept
only where emp.salary > 10000)
returns a value only when there are departments containing employees earning more than 10000.
If you use a by clause on a column that contains nulls, the DBMS Server returns a single result for the rows that contain null in the column specified in the by clause–in other words, nulls are grouped.
The result of the only where clause is affected by the set aggregate project|noproject statement. For more information, see
Set Statement--Set Session Options.
When an aggregate is applied to a nullable column, any nulls are disregarded in computing the aggregate. For example, for the following table "temp":
The statement
retrieve (c = countu(temp.x))
yields
Several variables can appear within a single aggregate function. For example,
avg(j.salary by e.dept where e.job=j.jid)