Re: Parallel Aggregate - Mailing list pgsql-hackers

From David Rowley
Subject Re: Parallel Aggregate
Date
Msg-id CAKJS1f_vRdjpL-CDHvkPWtXE9GN5v8TcBxNpUAejRJ872RgWAA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Aggregate  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: Parallel Aggregate
Re: Parallel Aggregate
List pgsql-hackers
On 17 February 2016 at 17:50, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Here I attached a draft patch based on previous discussions. It still needs
> better comments and optimization.

Over in [1] Tom posted a large change to the grouping planner which
causes large conflict with the parallel aggregation patch. I've been
looking over Tom's patch and reading the related thread and I've
observed 3 things:

1. Parallel Aggregate will be much easier to write and less code to
base it up top of Tom's upper planner changes. The latest patch does
add a bit of cruft (e.g create_gather_plan_from_subplan()) which won't
be required after Tom pushes the changes to the upper planner.
2. If we apply parallel aggregate before Tom's upper planner changes
go in, then Tom needs to reinvent it again when rebasing his patch.
This seems senseless, so this is why I did this work.
3. Based on the thread, most people are leaning towards getting Tom's
changes in early to allow a bit more settle time before beta, and
perhaps also to allow other patches to go in after (e.g this)

So, I've done a bit of work and I've rewritten the parallel aggregate
code to base it on top of Tom's patch posted in [1]. There's a few
things that are left unsolved at this stage.

1. exprType() for Aggref still returns the aggtype, where it needs to
return the trans type for partial agg nodes, this need to return the
trans type rather than the aggtype. I had thought I might fix this by
adding a proxy node type that sits in the targetlist until setrefs.c
where it can be plucked out and replaced by the Aggref. I need to
investigate this further.
2. There's an outstanding bug relating to HAVING clause not seeing the
right state of aggregation and returning wrong results. I've not had
much time to look into this yet, but I suspect its an existing bug
that's already in master from my combine aggregate patch. I will
investigate this on Sunday.

In regards to the patch, there's a few things worth mentioning here:

1. I've had to add a parallel_degree parameter to create_group_path()
and create_agg_path(). I think Tom is going to make changes to his
patch so that the Path's parallel_degree is propagated to subnodes,
this should allow me to remove this parameter and just use
parallel_degree the one from the subpath.
2. I had to add a new parameter to pass an optional row estimate to
cost_gather() as I don't have a RelOptInfo available to get a row
estimate from which represents the state after partial aggregation. I
thought this change was ok, but I'll listen to anyone who thinks of a
better way to do it.
3. The code never attempts to mix and match Grouping Agg and Hash Agg
plans. e.g it could be an idea to perform Partial Hash Aggregate ->
Gather -> Sort -> Finalize Group Aggregate, or hash as in the Finalize
stage. I just thought doing this is more complex than what's really
needed, but if someone can think of a case where this would be a great
win then I'll listen, but you have to remember we don't have any
pre-sorted partial paths at this stage, so an explicit sort is
required *always*. This might change if someone invented partial btree
index scans... but until then...

Due to the existence of the outstanding issues above, I feel like I
might be posting the patch a little earlier, but wanted to do so since
this is quite a hot area in the code at the moment and I wanted to
post for transparency.

To apply the patch please apply [1] first.

[1] http://www.postgresql.org/message-id/3795.1456689808@sss.pgh.pa.us

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: pgbench stats per script & other stuff
Next
From: Amit Kapila
Date:
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive