Re: Parallel Aggregate - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Parallel Aggregate
Date
Msg-id CAJrrPGcY3ObCheQXorbd381Y6WFPA9L2YPWgZNpmQwgsCbybiQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Aggregate  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Feb 8, 2016 at 9:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>  On Thu, Jan 21, 2016 at 11:25 PM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>>  [ new patch ]
>
> This patch contains a number of irrelevant hunks that really ought not
> to be here and make the patch harder to understand, like this:
>
> -                        * Generate appropriate target list for
> scan/join subplan; may be
> -                        * different from tlist if grouping or
> aggregation is needed.
> +                        * Generate appropriate target list for
> subplan; may be different from
> +                        * tlist if grouping or aggregation is needed.
>
> Please make a habit of getting rid of that sort of thing before submitting.

sure. I will take of such things in future.

> Generally, I'm not quite sure I understand the code here.  It seems to
> me that what we ought to be doing is that grouping_planner, right
> after considering using a presorted path (that is, just after the if
> (sorted_path) block between lines 1822-1850), ought to then consider
> using a partial path.  For the moment, it need not consider the
> possibility that there may be a presorted partial path, because we
> don't have any way to generate those yet.  (I have plans to fix that,
> but not in time for 9.6.)  So it can just consider doing a Partial
> Aggregate on the cheapest partial path using an explicit sort, or
> hashing; then, above the Gather, it can finalize either by hashing or
> by sorting and grouping.
>
> The trick is that there's no path representation of an aggregate, and
> there won't be until Tom finishes his upper planner path-ification
> work.  But it seems to me we can work around that.  Set best_path to
> the cheapest partial path, add a partial aggregate rather than a
> regular one around where it says "Insert AGG or GROUP node if needed,
> plus an explicit sort step if necessary", and then push a Gather node
> and a Finalize Aggregate onto the result.

Thanks, i will update the patch accordingly. Along with those changes,
I will try to calculate the cost involved in normal aggregate without
generating the plan and compare it against the parallel cost plan before
generating the actual plan. Because with less number of groups
normal aggregate is performing better than parallel aggregate in tests.


Regards,
Hari Babu
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Vitaly Burovoy
Date:
Subject: Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011
Next
From: Haribabu Kommi
Date:
Subject: Re: backpatch for REL9_4_STABLE of commit 40482e606733675eb9e5b2f7221186cf81352da1