Optimizing maximum/minimum queries (yet again) - Mailing list pgsql-hackers

We've been talking about this long enough ... let's try to actually do it ...

In the last go-round, the thread starting here,
http://archives.postgresql.org/pgsql-hackers/2004-11/msg00371.php
we spent a lot of time agonizing over GROUP BY and whether the
optimization is usable for anything besides MAX/MIN.  However, no one
presented any concrete reasons for thinking we could use it for any
other aggregates.  And using it with GROUP BY would require some
fundamental restructuring of the way we do grouping --- right now we
have to look at every row anyway to compute the groups, so
there's no possibility of gain from optimizing the aggregates.

But the real issue is that no one seems willing to tackle the complete
problem.  So, let's forget those things and see if there is something
within reach if we just focus on MAX/MIN without grouping.

I realized today that this may not be as hard as I thought.
Specifically, I'm imagining that we could convert
SELECT min(x), max(y) FROM tab WHERE ...

into sub-selects in a one-row outer query:
SELECT (SELECT x FROM tab WHERE ... ORDER BY x LIMIT 1),
       (SELECT y FROM tab WHERE ... ORDER BY y DESC LIMIT 1);
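As a throwaway sanity check of the rewrite (not Postgres code), here it is
scripted in Python against SQLite, which accepts the same ORDER BY ... LIMIT
form; the table name and data are made up, and the WHERE clause from the
example is dropped for brevity:

```python
# Sanity check: the aggregate query and its sub-select rewrite agree.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?)",
                 [(5, 10), (3, 40), (8, 20), (1, 30)])

# Original aggregate form:
agg = conn.execute("SELECT min(x), max(y) FROM tab").fetchone()

# Rewritten as independent one-row sub-selects:
sub = conn.execute("""
    SELECT (SELECT x FROM tab ORDER BY x LIMIT 1),
           (SELECT y FROM tab ORDER BY y DESC LIMIT 1)
""").fetchone()

print(agg, sub)  # (1, 40) (1, 40)
```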

Breaking it down like that means we can consider each aggregate
independently, which definitely simplifies matters.  This still allows
us to handle complex combinations like
SELECT min(x), max(y) - min(y), max(z) FROM tab WHERE ...

as well as aggregates in the HAVING clause --- the restriction is just
that each aggregate in the query has to be an optimizable MIN or MAX.
(If any are not, we're going to end up scanning the table anyway, and I
see no point in doing separate subqueries for some of the aggregates in
that case.)  We replace each aggregate call with a sub-SELECT, drop the
outer FROM and WHERE clauses, and there we are.
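The same kind of check works for the complex combination above (again
Python/SQLite with made-up data); note that after replacing each aggregate
call with its own sub-select, the rewritten query has no outer FROM at all:

```python
# Each aggregate call is rewritten independently; the outer query
# becomes a one-row SELECT with no FROM clause.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)",
                 [(5, 10, 7), (3, 40, 2), (8, 20, 9)])

agg = conn.execute(
    "SELECT min(x), max(y) - min(y), max(z) FROM tab").fetchone()

sub = conn.execute("""
    SELECT (SELECT x FROM tab ORDER BY x LIMIT 1),
           (SELECT y FROM tab ORDER BY y DESC LIMIT 1)
         - (SELECT y FROM tab ORDER BY y LIMIT 1),
           (SELECT z FROM tab ORDER BY z DESC LIMIT 1)
""").fetchone()

print(agg, sub)  # (3, 30, 9) (3, 30, 9)
```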

So I propose the following restrictions:

1. We will consider only aggregate queries with no GROUP BY clause.

2. The query can only reference one table.  (There doesn't seem to be
any way to handle join conditions reasonably.  We could possibly handle
Cartesian-product cases, but I don't see the value.)

When we have such a query, we'll scan the targetlist (and HAVING clause
if any) to examine each aggregate function.  We can only optimize if every
aggregate is convertible into an index scan, which requires the following
conditions:

1. The aggregate has an associated sort operator according to pg_aggregate
(see below).

2. The aggregate's argument matches a column of some btree index of the
table, and the sort operator matches either the "<" or ">" member of the
column's opclass.

3. If the column is not the first one of its index, then we must have an
indexable equality constraint (x = something) in WHERE for each earlier
column.  We can also make use of an inequality on the target column
itself, if it runs in the opposite direction from the aggregate's
operator (for example, if the aggregate operator matches the opclass
"<", we can use x > something or x >= something).  Such an inequality
forms the starting boundary for the index scan.

This means that we can for example optimize cases like
SELECT MIN(x) FROM tab WHERE x > 42

which turns into a forward indexscan starting at the x > 42 boundary, or
SELECT MIN(x) FROM tab WHERE y = 42

which needs an index on (y,x) and starts at the first y = 42 entry.
(Note: this doesn't work with a constraint like y >= 42 ...)
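Both examples can be checked the same way (Python/SQLite once more, made-up
data; the (y, x) index is created as described, though of course it is
SQLite's planner, not ours, deciding whether to use it here):

```python
# Checking the boundary case (x > 42) and the prefix-equality case
# (y = 42 with an index on (y, x)).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (x INTEGER, y INTEGER)")
conn.execute("CREATE INDEX tab_y_x ON tab (y, x)")
conn.executemany("INSERT INTO tab VALUES (?, ?)",
                 [(50, 42), (44, 42), (10, 7), (43, 8), (99, 42)])

# MIN(x) WHERE x > 42: a forward scan starting at the x > 42 boundary.
a1 = conn.execute("SELECT min(x) FROM tab WHERE x > 42").fetchone()[0]
b1 = conn.execute(
    "SELECT x FROM tab WHERE x > 42 ORDER BY x LIMIT 1").fetchone()[0]

# MIN(x) WHERE y = 42: the first (y, x) index entry with y = 42 is
# already the answer.
a2 = conn.execute("SELECT min(x) FROM tab WHERE y = 42").fetchone()[0]
b2 = conn.execute(
    "SELECT x FROM tab WHERE y = 42 ORDER BY x LIMIT 1").fetchone()[0]

print(a1, b1, a2, b2)  # 43 43 44 44
```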

The fact that an indexscan conversion is possible doesn't necessarily
make it a good idea.  For example consider the identical query
SELECT MIN(x) FROM tab WHERE y = 42

when we have separate indexes on x and y.  If the first row with y = 42
is far into the x index, the "optimized" version could be far slower than
our normal implementation (which would probably do an indexscan on y,
instead).  In general, any constraints that don't fit into our indexscan
plan have to be looked at with suspicion.  So AFAICS we have to plan the
query both ways and estimate which is cheaper.  But we are only going to
do this for single-table queries, so the extra planner overhead won't be
large.

Catalog representation:

We must have a way to identify an aggregate as being related to an
index's sort order.  I am envisioning adding a column to pg_aggregate
that is either 0 (for a non-max/min aggregate) or the OID of an operator
that represents the sort order associated with the aggregate.
The rule is that these two queries must give the same answer:
SELECT agg(col) FROM tab;
SELECT col FROM tab ORDER BY col USING operator LIMIT 1;

that is, the aggregate value is equivalent to the first value in the sort
order induced by the operator.  For example, we associate min(int) with
int4lt and max(int) with int4gt.  (We also assume the aggregate must return
NULL for no input.)
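SQLite has no ORDER BY ... USING operator, so in the sketch below ASC and
DESC stand in for the "<" and ">" operators; the last part checks the
NULL-for-no-input assumption, which the rewrite preserves because an empty
scalar sub-select yields NULL:

```python
# min(col) must equal the first value in "<" (ascending) order,
# max(col) the first value in ">" (descending) order, and both must
# be NULL for empty input -- as a scalar sub-select is.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?)", [(7,), (2,), (9,)])

mn = conn.execute("SELECT min(col) FROM tab").fetchone()[0]
first_asc = conn.execute(
    "SELECT col FROM tab ORDER BY col LIMIT 1").fetchone()[0]

mx = conn.execute("SELECT max(col) FROM tab").fetchone()[0]
first_desc = conn.execute(
    "SELECT col FROM tab ORDER BY col DESC LIMIT 1").fetchone()[0]

# Empty input: the aggregate gives NULL, and so does the rewrite in
# its scalar sub-select form.
conn.execute("DELETE FROM tab")
empty_agg = conn.execute("SELECT min(col) FROM tab").fetchone()[0]
empty_sub = conn.execute(
    "SELECT (SELECT col FROM tab ORDER BY col LIMIT 1)").fetchone()[0]

print(mn, first_asc, mx, first_desc, empty_agg, empty_sub)
# 2 2 9 9 None None
```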

The reason for associating with operators, and not directly to opclasses,
is that this will successfully handle both regular and reverse-sort
opclasses.

Adding such a column is just a small matter of programming.  We'd need to
add a clause to CREATE AGGREGATE to let user-defined min/max aggregates
in on the fun, but that's no big deal.

Comments?  Anyone see anything I missed?
        regards, tom lane

