Thread: Re: [PERFORM] not using index for select min(...)

Re: [PERFORM] not using index for select min(...)

From

Josh Berkus

Date:

02 February 2003, 15:25:36

Tom,

> In the end, the only reasonable way to handle this kind of thing is
> to teach the query planner about it. Considering the small number
> of cases that are usefully optimizable (basically only MIN and MAX
> on a single table without any WHERE or GROUP clauses), and the ready
> availability of a SQL-level workaround, it strikes me as a very
> low-priority TODO item.

Low priority for you, Tom. For some of us, it's one of the three most
high-priority "bugs" in PostgreSQL.

I constantly try to sell my clients, and potential clients, on PostgreSQL.
And the two things that trip me up the most frequently are lack of
replication and our dog-slow aggregates. I can usually sell Postgres on our
strong points, but the aggregate issue is *always* a problem. And the "slow
aggregate" problem comes up about twice a week on Performance and three times
a week on SQL.

Regardless of the technical reason, among MSSQL, Oracle, MySQL and PostgreSQL,
we have the slowest performing simple aggregates. It's very well to explain
this is due to our system of extensible aggregates, but if a potential
Postgres developer doesn't want to create custom aggregates, but does want to
use MIN() in a correlated subquery, then they will go to a different RDBMS.

As I said before, I'm absolutely thrilled that you came up with a solution for
COUNT(*) ... GROUP BY queries through Hash Aggregates. That's half the
picture, now we need a way to speed up MIN() and MAX() for simple one-column
expressions. While there is a "workaround" using ORDER BY & LIMIT, this
doesn't work for correlated subqueries or if one wants to evaluate the result
of MAX() in the query. For example, the following query is not possible to
"workaround" in PostgreSQL:

select teams_desc.team_id, team_name, team_code, notes,
min(teams_tree.treeno) as lnode, max(teams_tree.treeno) as rnode,
parent.team_id as parent_id, count(*)/2 as tlevel
from teams_desc JOIN teams_tree USING (team_id)
join teams_tree parent ON parent.treeno < teams_tree.treeno
join teams_tree parents on parents.treeno < teams_tree.treeno
WHERE parent.treeno = (SELECT max(p1.treeno) from teams_tree p1 where p1.treeno < teams_tree.treeno and exists
(selecttreeno from teams_tree p2 where p2.treeno > teams_tree.treeno and p2.team_id = p1.team_id))
AND EXISTS (select parents2.team_id from teams_tree parents2where parents2.treeno > teams_tree.treenoAND
parents2.team_id= parents.team_id)
group by teams_desc.team_id, team_name, team_code, notes, parent.team_id;

While one would hardly expect the above query to be fast, it is dissapointing
that it takes about 8-10 times as long to execute on PostgreSQL as on MSSQL,
since MSSQL seems to be able to use indexes to evaluate all three MIN() and
MAX() expressions.

Further, assigning such a common query function to a Postgres-specific
workaround hardly upholds our project's dedication to standards. The fact
that we are telling new users to use non-SQL-compliant code to do a query
type present in 90% of databases bothers me every single time I give a newbie
that advice.

It still seems to me that if a query's WHERE expression can be evaluated using
an index, then any related MIN() or MAX() expression should be evaluable
using an index. That is, if you are selecting:
SELECT MAX(team_id) FROM teams WHERE team_id BETWEEN 100 and 200;
... with an index on team_id then this entire query should be able to return
trough an index scan. We've discussed the particular planner problems this
presents for PostgreSQL, but I still believe that these are solvable ... and
moreover, that we *need* to solve them if we're going to be competitive with
other SQL RDBMSes.

I do realize that it's my job to find something to do about this issue since
I'm the one so worked up about it. What I'm concerned about is the
possibility of having any idea or fix I come up with dismissed out of hand
because it's a "low-priority todo". Please add up the questions and
complaints of the users on SQL, NOVICE, and PERFORMANCE ... I know you read
them.

Thanks for reading, Tom.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] not using index for select min(...)

From

Tom Lane

Date:

02 February 2003, 15:36:17

Josh Berkus <josh@agliodbs.com> writes:
> For example, the following query is not possible to 
> "workaround" in PostgreSQL:

> select teams_desc.team_id, team_name, team_code, notes,
> min(teams_tree.treeno) as lnode, max(teams_tree.treeno) as rnode,
> parent.team_id as parent_id, count(*)/2 as tlevel
> from teams_desc JOIN teams_tree USING (team_id)
> join teams_tree parent ON parent.treeno < teams_tree.treeno
> join teams_tree parents on parents.treeno < teams_tree.treeno
> WHERE parent.treeno = (SELECT max(p1.treeno) from teams_tree p1
>         where p1.treeno < teams_tree.treeno
>         and exists (select treeno from teams_tree p2
>             where p2.treeno > teams_tree.treeno
>             and p2.team_id = p1.team_id))
> AND EXISTS (select parents2.team_id from teams_tree parents2
>     where parents2.treeno > teams_tree.treeno
>     AND parents2.team_id = parents.team_id)
> group by teams_desc.team_id, team_name, team_code, notes, parent.team_id;

> While one would hardly expect the above query to be fast, it is dissapointing
> that it takes about 8-10 times as long to execute on PostgreSQL as on MSSQL, 
> since MSSQL seems to be able to use indexes to evaluate all three MIN() and 
> MAX() expressions.

I think you are leaping to conclusions about why there's a speed
difference.  Or maybe I'm too dumb to see how an index could be used
to speed these min/max operations --- but I don't see that one would
be useful.  Certainly not an index on treeno alone.  Would you care to
explain exactly how it's done?
        regards, tom lane

Re: [PERFORM] not using index for select min(...)

From

Josh Berkus

Date:

02 February 2003, 16:25:51

Tom,

> I think you are leaping to conclusions about why there's a speed
> difference.  Or maybe I'm too dumb to see how an index could be used
> to speed these min/max operations --- but I don't see that one would
> be useful.  Certainly not an index on treeno alone.  Would you care to
> explain exactly how it's done?

If I knew that, I'd have proposed a patch already, yes?

I'm working on it.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] not using index for select min(...)

From

Kevin Brown

Date:

02 February 2003, 19:28:33

Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > For example, the following query is not possible to 
> > "workaround" in PostgreSQL:
> 
> > select teams_desc.team_id, team_name, team_code, notes,
> > min(teams_tree.treeno) as lnode, max(teams_tree.treeno) as rnode,
> > parent.team_id as parent_id, count(*)/2 as tlevel
> > from teams_desc JOIN teams_tree USING (team_id)
> > join teams_tree parent ON parent.treeno < teams_tree.treeno
> > join teams_tree parents on parents.treeno < teams_tree.treeno
> > WHERE parent.treeno = (SELECT max(p1.treeno) from teams_tree p1
> >         where p1.treeno < teams_tree.treeno
> >         and exists (select treeno from teams_tree p2
> >             where p2.treeno > teams_tree.treeno
> >             and p2.team_id = p1.team_id))
> > AND EXISTS (select parents2.team_id from teams_tree parents2
> >     where parents2.treeno > teams_tree.treeno
> >     AND parents2.team_id = parents.team_id)
> > group by teams_desc.team_id, team_name, team_code, notes, parent.team_id;
> 
> > While one would hardly expect the above query to be fast, it is dissapointing
> > that it takes about 8-10 times as long to execute on PostgreSQL as on MSSQL, 
> > since MSSQL seems to be able to use indexes to evaluate all three MIN() and 
> > MAX() expressions.
> 
> I think you are leaping to conclusions about why there's a speed
> difference.  Or maybe I'm too dumb to see how an index could be used
> to speed these min/max operations --- but I don't see that one would
> be useful.  Certainly not an index on treeno alone.  Would you care to
> explain exactly how it's done?

Intuitively, it seems that an index on treeno is exactly what would
make the difference -- but min() and max() have to be smart enough to
use them when necessary.

I have a strong suspicion that min() and max() in MSSQL and other
databases are integrated into the parser, planner, and executor
directly.  It's the only way I can think of that would make it
possible for those functions to make use of indexes and other
advantages to the fullest extent possible.  For instance, in the above
query, the max() operation in the subselect would tell the planner and
executor to use the index on p1.treeno for two comparisons
simultaneously: p1.treeno < teams_tree.treeno and max(p1.treeno).
That means that the executor would descend the tree of the (btree)
index and instead of just comparing whether a branch is less than
teams_tree.treeno and following *all* of the branches that qualify, it
would follow the *largest* branch that qualified and nothing else.
That's a very significant optimization of the search, because instead
of eliminating an average of 50% of the branches to follow at each
node, it eliminates all but one.  But it's not something that a naive
aggregate function would be able to do: the min() and max() aggregates
would (I expect) have to become first class objects in the parser,
planner, and executor just as the WHERE clause and its conditions are.

There may be other aggregate functions that can make use of indexes to
the same extent that min() and max() should be able to, but I don't
know what they are offhand, and I certainly doubt that they would be
used nearly as often as min() and max().

Even with our type system, I'd think that min() and max() would be
relatively straightforward as first class objects (well, as
straightforward as any first class object that gets implemented in all
three stages, at any rate!): they work efficiently (that is, can use
an index scan) when the column in question has a btree index on it,
and fall back to sequential scans (and use the appropriate operator,
">" or "<") when the column in question doesn't.  It might even be
reasonable to allow a type to "overload" these functions so that the
planner and executor use the type-provided functions when available
(with the limitation that such type-provided functions would always
require a sequential scan as they do now) and fall back to the builtin
ones when the type doesn't provide them.  I imagine this might
complicate the parser, planner, and executor quite a bit, however.

So the interesting question that arises from the above is: are there
any types that define a min() and max() but which *do not* define "<"
and ">"?  I can't think of such types myself but can imagine that some
esoteric data types might qualify.  For the purposes of optimizing the
common case, however, such esoteric types could easily be ignored, but
it's for their sake that it would be useful to be able to use a
type-defined function in place of min() or max().

-- 
Kevin Brown                          kevin@sysexperts.com