Home > mailing lists

Re: Parallel Aggregate - Mailing list pgsql-hackers

From	Haribabu Kommi
Subject	Re: Parallel Aggregate
Date	February 8, 2016 09:44:28
Msg-id	CAJrrPGcY3ObCheQXorbd381Y6WFPA9L2YPWgZNpmQwgsCbybiQ@mail.gmail.com Whole thread Raw
In response to	Re: Parallel Aggregate (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Tree view

On Mon, Feb 8, 2016 at 9:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>  On Thu, Jan 21, 2016 at 11:25 PM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>>  [ new patch ]
>
> This patch contains a number of irrelevant hunks that really ought not
> to be here and make the patch harder to understand, like this:
>
> -                        * Generate appropriate target list for
> scan/join subplan; may be
> -                        * different from tlist if grouping or
> aggregation is needed.
> +                        * Generate appropriate target list for
> subplan; may be different from
> +                        * tlist if grouping or aggregation is needed.
>
> Please make a habit of getting rid of that sort of thing before submitting.

sure. I will take of such things in future.

> Generally, I'm not quite sure I understand the code here.  It seems to
> me that what we ought to be doing is that grouping_planner, right
> after considering using a presorted path (that is, just after the if
> (sorted_path) block between lines 1822-1850), ought to then consider
> using a partial path.  For the moment, it need not consider the
> possibility that there may be a presorted partial path, because we
> don't have any way to generate those yet.  (I have plans to fix that,
> but not in time for 9.6.)  So it can just consider doing a Partial
> Aggregate on the cheapest partial path using an explicit sort, or
> hashing; then, above the Gather, it can finalize either by hashing or
> by sorting and grouping.
>
> The trick is that there's no path representation of an aggregate, and
> there won't be until Tom finishes his upper planner path-ification
> work.  But it seems to me we can work around that.  Set best_path to
> the cheapest partial path, add a partial aggregate rather than a
> regular one around where it says "Insert AGG or GROUP node if needed,
> plus an explicit sort step if necessary", and then push a Gather node
> and a Finalize Aggregate onto the result.

Thanks, i will update the patch accordingly. Along with those changes,
I will try to calculate the cost involved in normal aggregate without
generating the plan and compare it against the parallel cost plan before
generating the actual plan. Because with less number of groups
normal aggregate is performing better than parallel aggregate in tests.


Regards,
Hari Babu
Fujitsu Australia

pgsql-hackers by date:

From: Vitaly Burovoy
Date: 08 February 2016, 09:40:49
Subject: Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

From: Haribabu Kommi
Date: 08 February 2016, 09:59:53
Subject: Re: backpatch for REL9_4_STABLE of commit 40482e606733675eb9e5b2f7221186cf81352da1

Re: Parallel Aggregate - Mailing list pgsql-hackers

Previous

Next