Home > mailing lists

Re: [PATCH] Lazy hashaggregate when no aggregation is needed - Mailing list pgsql-hackers

From	Ants Aasma
Subject	Re: [PATCH] Lazy hashaggregate when no aggregation is needed
Date	June 15, 2012 10:24:07
Msg-id	CA+CSw_vMYNKU9kAEO0w+TZBrFxLMQfz6gdO4hNq+pR96d4DS5w@mail.gmail.com Whole thread
In response to	Re: [PATCH] Lazy hashaggregate when no aggregation is needed (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: [PATCH] Lazy hashaggregate when no aggregation is needed
List	pgsql-hackers

Tree view

On Fri, Jun 15, 2012 at 3:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> However, as Ants points out, we could make it work better for the
> special case where we're not actually doing any aggregation, because
> in that case we can emit the row for each group when the group is
> created, rather than waiting until end-of-input.  This is only going
> to help when there is a LIMIT, though.  Moreover, if there happens to
> be an ORDER BY, then the data will have to be pre-sorted, in which
> case you may as well use a sorted aggregate.  So the use case for this
> optimization is basically DISTINCT plus LIMIT but not ORDER BY.

Exactly. I think the first question for this patch should be whether
this use-case is worth the complexity of the patch. I can't imagine
any really compelling use cases that need an arbitrary distinct subset
of results. The original complaint on -performance [1], didn't specify
a real world use case, but it seemed to be a case of an ORM generating
suboptimal queries. On the other hand, the patch itself is in my
opinion rather simple, so it might be worth it.

It has one outstanding issue, query_planner chooses the cheapest path
based on total cost. This can be suboptimal when that path happens to
have high startup cost. It seems to me that enabling the query_planner
to find the cheapest unsorted path returning a limited amount of
tuples would require some major surgery to the planner. To be clear,
this is only a case of missed optimization, not a regression.

It won't help set returning functions because the tuplestore for those
is fully materialized when the first row is fetched.

[1] http://archives.postgresql.org/message-id/16737833.463.1332881676120.JavaMail.geo-discussion-forums%40pbcpw7

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

pgsql-hackers by date:

From: Merlin Moncure
Date: 15 June 2012, 09:18:50
Subject: Re: libpq compression

From: Merlin Moncure
Date: 15 June 2012, 10:43:26
Subject: Re: measuring spinning

Re: [PATCH] Lazy hashaggregate when no aggregation is needed - Mailing list pgsql-hackers

Previous

Next