Home > mailing lists

Re: The Future of Aggregation - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: The Future of Aggregation
Date	June 10, 2015 05:51:10
Msg-id	CAKJS1f-cMeRGi5sw48svaPLzsady5zXrwCd6kWh9P3a=wHBDcA@mail.gmail.com Whole thread Raw
In response to	Re: The Future of Aggregation (Kevin Grittner <kgrittn@ymail.com>)
Responses	Re: The Future of Aggregation (Kevin Grittner <kgrittn@ymail.com>)
List	pgsql-hackers

Tree view

On 10 June 2015 at 02:52, Kevin Grittner <kgrittn@ymail.com> wrote:

David Rowley <david.rowley@2ndquadrant.com> wrote:

> The idea I discussed in the link in item 5 above gets around this
> problem, but it's a perhaps more surprise filled implementation
> as it will mean "select avg(x),sum(x),count(x) from t" is
> actually faster than "select sum(x),count(x) from t" as the agg
> state for avg() will satisfy sum and count too.

I'm skeptical that it will be noticeably faster. It's easy to see
why this optimization will make a query *with all three* faster,
but I would not expect the process of accumulating the sum and
count to be about the same speed whether performed by one
transition function or two. Of course I could be persuaded by a
benchmark showing otherwise.

Thanks for looking at this.

Assuming that if we reuse the avg(x) state for count(x) and sum(x) then it will perform almost exactly like a query containing just avg(x), the only additional overhead is the call to the final functions per group, so in the following case that's likely immeasurable:

/* setup */ create table millionrowtable as select
generate_series(1,1000000)::numeric as x;

/* test 1 */ SELECT sum(x) / count(x) from millionrowtable;
/* test 2 */ SELECT avg(x) from millionrowtable;

Test 1:

274.979 ms

272.104 ms

269.915 ms

Test 2:

229.619 ms

220.703 ms

234.743 ms

(About 19% slower)

The good news is that it's not slower than before, so should be acceptable, though hard to explain to people.

Regards

David Rowley

pgsql-hackers by date:

From: David Rowley
Date: 10 June 2015, 05:32:21
Subject: Re: Aggregate Supporting Functions

From: David Rowley
Date: 10 June 2015, 05:54:00
Subject: Re: The Future of Aggregation

Re: The Future of Aggregation - Mailing list pgsql-hackers

Previous

Next