Thread: Parallel Queries and PostGIS

Parallel Queries and PostGIS

From: Paul Ramsey
I spent some time over the weekend trying out the different modes of
parallel query (seq scan, aggregate, join) in combination with PostGIS
and have written up the results here:

http://blog.cleverelephant.ca/2016/03/parallel-postgis.html

The TL;DR is basically:

* With some adjustments to function COST, both parallel sequential scan
and parallel aggregation deliver very good parallel performance
results.
* The cost adjustments needed for sequential scan and aggregation are not
consistent in magnitude.
* Parallel join does not seem to work for PostGIS indexes yet, but
perhaps there is some magic to learn from PostgreSQL core on that.

The two findings at the end are ones that need input from parallel
query masters...

We recognize we'll have to adjust costs so that our particular use
case (a very CPU-intensive calculation per function call) is planned
better, but it seems like different query modes are interpreting costs
in order-of-magnitude different ways when building plans.

Parallel join would be a huge win, so some help/pointers on figuring
out why it's not coming into play when our gist operators are in
effect would be helpful.
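
For concreteness, the kind of cost adjustment I'm talking about looks
like this (real PostGIS function, but the cost number here is just
illustrative):

```sql
-- Tell the planner this function is expensive per call, so that
-- spreading the calls across parallel workers looks worthwhile.
ALTER FUNCTION ST_Area(geometry) COST 100;
```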

Happy Easter to you all,
P



Re: Parallel Queries and PostGIS

From: Stephen Frost
Paul,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> I spent some time over the weekend trying out the different modes of
> parallel query (seq scan, aggregate, join) in combination with PostGIS
> and have written up the results here:
>
> http://blog.cleverelephant.ca/2016/03/parallel-postgis.html

Neat!

Regarding aggregate parallelism and the cascaded union approach (though
I imagine this applies in other cases as well), it seems like having a
"final-per-worker" function for aggregates would be useful.

Without actually looking at the code at all, it seems like that wouldn't
be terribly difficult to add.

Would you agree that it'd be helpful to have for making the st_union()
work better in parallel?

Though I do wonder if you would end up wanting to have a different
final() function in that case..
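
To sketch the idea (this syntax is entirely hypothetical, and the
function names are made up):

```sql
CREATE AGGREGATE my_union(geometry) (
    sfunc     = my_union_transfn,
    stype     = internal,
    finalfunc = my_union_finalfn,
    -- hypothetical option: run once in each worker, before that
    -- worker's result is handed back to the leader for combining
    workerfinalfunc = my_union_worker_finalfn
);
```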

Thanks!

Stephen

Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Mon, Mar 28, 2016 at 9:45 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Paul,
>
> * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
>> I spent some time over the weekend trying out the different modes of
>> parallel query (seq scan, aggregate, join) in combination with PostGIS
>> and have written up the results here:
>>
>> http://blog.cleverelephant.ca/2016/03/parallel-postgis.html
>
> Neat!
>
> Regarding aggregate parallelism and the cascaded union approach, though
> I imagine in other cases as well, it seems like having a
> "final-per-worker" function for aggregates would be useful.
>
> Without actually looking at the code at all, it seems like that wouldn't
> be terribly difficult to add.
>
> Would you agree that it'd be helpful to have for making the st_union()
> work better in parallel?

For our particular situation w/ ST_Union, yes, it would be ideal to be
able to run a worker-side combine function as well as the master-side
one. Although the cascaded union would be less effective spread out
over N nodes, doing it only once per worker, rather than every N
records, would minimize the loss of effectiveness.

P



Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Mon, Mar 28, 2016 at 9:18 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:

> Parallel join would be a huge win, so some help/pointers on figuring
> out why it's not coming into play when our gist operators are in
> effect would be helpful.

Robert, do you have any pointers on what I should look for to figure
out why the parallel join code doesn't fire if I add a GIST operator
to my join condition?

Thanks,

P



Re: Parallel Queries and PostGIS

From: Robert Haas
On Mon, Mar 28, 2016 at 12:18 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> I spent some time over the weekend trying out the different modes of
> parallel query (seq scan, aggregate, join) in combination with PostGIS
> and have written up the results here:
>
> http://blog.cleverelephant.ca/2016/03/parallel-postgis.html
>
> The TL;DR is basically:
>
> * With some adjustments to function COST, both parallel sequential scan
> and parallel aggregation deliver very good parallel performance
> results.
> * The cost adjustments needed for sequential scan and aggregation are not
> consistent in magnitude.
> * Parallel join does not seem to work for PostGIS indexes yet, but
> perhaps there is some magic to learn from PostgreSQL core on that.
>
> The two findings at the end are ones that need input from parallel
> query masters...
>
> We recognize we'll have to adjust costs so that our particular use
> case (a very CPU-intensive calculation per function call) is planned
> better, but it seems like different query modes are interpreting costs
> in order-of-magnitude different ways when building plans.
>
> Parallel join would be a huge win, so some help/pointers on figuring
> out why it's not coming into play when our gist operators are in
> effect would be helpful.

First, I beg to differ with this statement: "Some of the execution
results output are wrong! They say that only 1844 rows were removed by
the filter, but in fact 7376 were (as we can confirm by running the
queries without the EXPLAIN ANALYZE). This is a known limitation,
reporting on the results of only one parallel worker, which (should)
maybe, hopefully be fixed before 9.6 comes out."  The point is that
line has loops=4, so as in any other case where loops>1, you're seeing
the number of rows divided by the number of loops.  It is the
*average* number of rows that were processed by each loop - one loop
per worker, in this case.

I am personally of the opinion that showing rowcounts divided by loops
instead of total rowcounts is rather stupid, and that we should change
it regardless.  But it's not parallel query's fault, and changing it
would affect the output of every EXPLAIN ANALYZE involving a nested
loop, probably confusing a lot of people until they figured out what
we'd changed, after which - I *think* they'd realize that they
actually liked the new way much better.

Now, on to your actual question:

I have no idea why the cost adjustments that you need are different
for the scan case and the aggregate case.  That does seem problematic,
but I just don't know why it's happening.

On the join case, I wonder if it's possible that _st_intersects is not
marked parallel-safe?  If that's not the problem, I don't have a
second guess, but the thing to do would be to figure out whether
consider_parallel is false for the RelOptInfo corresponding to either
of pd and pts, or whether it's true for both but false for the
joinrel's RelOptInfo, or whether it's true for all three of them but
you don't get the desired path anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Parallel Queries and PostGIS

From: Paul Ramsey
> First, I beg to differ with this statement: "Some of the execution
> results output are wrong! ...."  The point is that
> line has loops=4, so as in any other case where loops>1, you're seeing
> the number of rows divided by the number of loops.  It is the
> *average* number of rows that were processed by each loop - one loop
> per worker, in this case.

Thanks for the explanation, let my reaction be a guide to what the
other unwashed will think :)

> Now, on to your actual question:
>
> I have no idea why the cost adjustments that you need are different
> for the scan case and the aggregate case.  That does seem problematic,
> but I just don't know why it's happening.

What might be a good way to debug it? Is there a piece of code I can
look at to try and figure out the contribution of COST in either case?

> On the join case, I wonder if it's possible that _st_intersects is not
> marked parallel-safe?  If that's not the problem, I don't have a
> second guess, but the thing to do would be to figure out whether
> consider_parallel is false for the RelOptInfo corresponding to either
> of pd and pts, or whether it's true for both but false for the
> joinrel's RelOptInfo, or whether it's true for all three of them but
> you don't get the desired path anyway.

_st_intersects is definitely marked parallel safe, and in fact will
generate a parallel plan if used alone (without the operator, though,
it's impossibly slow). It's the && operator that is the issue... and I
just noticed that the PROCEDURE bound to the && operator
(geometry_overlaps) is *not* marked parallel safe: could that be the
problem?
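
For anyone following along, this can be checked from the catalogs with
something like:

```sql
-- proparallel: 's' = safe, 'r' = restricted, 'u' = unsafe
SELECT o.oprname, p.proname, p.proparallel
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE o.oprname = '&&';
```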

Thanks,

P



Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Tue, Mar 29, 2016 at 12:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:

>> On the join case, I wonder if it's possible that _st_intersects is not
>> marked parallel-safe?  If that's not the problem, I don't have a
>> second guess, but the thing to do would be to figure out whether
>> consider_parallel is false for the RelOptInfo corresponding to either
>> of pd and pts, or whether it's true for both but false for the
>> joinrel's RelOptInfo, or whether it's true for all three of them but
>> you don't get the desired path anyway.
>
> _st_intersects is definitely marked parallel safe, and in fact will
> generate a parallel plan if used alone (without the operator though,
> it's impossibly slow). It's the && operator that is the issue... and I
> just noticed that the PROCEDURE bound to the && operator
> (geometry_overlaps) is *not* marked parallel safe: could be the
> problem?

Asked and answered: marking the geometry_overlaps as parallel safe
gets me a parallel plan! Now to play with costs and see how it behaves
when force_parallel_mode is not set.
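
For the archives, the change was just:

```sql
ALTER FUNCTION geometry_overlaps(geometry, geometry) PARALLEL SAFE;
```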

P.

>
> Thanks,
>
> P



Re: Parallel Queries and PostGIS

From: Robert Haas
On Tue, Mar 29, 2016 at 3:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>> I have no idea why the cost adjustments that you need are different
>> for the scan case and the aggregate case.  That does seem problematic,
>> but I just don't know why it's happening.
>
> What might be a good way to debug it? Is there a piece of code I can
> look at to try and figure out the contribution of COST in either case?

Well, the cost calculations are mostly in costsize.c, but I dunno how
much that helps.  Maybe it would help if you posted some EXPLAIN
ANALYZE output for the different cases, with and without parallelism?

One thing I noticed about this output (from your blog)...

Finalize Aggregate
  (cost=16536.53..16536.79 rows=1 width=8)
  (actual time=2263.638..2263.639 rows=1 loops=1)
    ->  Gather
        (cost=16461.22..16461.53 rows=3 width=32)
        (actual time=754.309..757.204 rows=4 loops=1)
          Number of Workers: 3
          ->  Partial Aggregate
              (cost=15461.22..15461.23 rows=1 width=32)
              (actual time=676.738..676.739 rows=1 loops=4)
                ->  Parallel Seq Scan on pd
                    (cost=0.00..13856.38 rows=64 width=2311)
                    (actual time=3.009..27.321 rows=42 loops=4)
                      Filter: (fed_num = 47005)
                      Rows Removed by Filter: 17341
Planning time: 0.219 ms
Execution time: 2264.684 ms

...is that the finalize aggregate phase is estimated to be very cheap,
but it's actually wicked expensive.  We get the results from the
workers in only 750 ms, but it takes another second and a half to
aggregate those 4 rows???

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Tue, Mar 29, 2016 at 1:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 3:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>> I have no idea why the cost adjustments that you need are different
>>> for the scan case and the aggregate case.  That does seem problematic,
>>> but I just don't know why it's happening.
>>
>> What might be a good way to debug it? Is there a piece of code I can
>> look at to try and figure out the contribution of COST in either case?
>
> Well, the cost calculations are mostly in costsize.c, but I dunno how
> much that helps.  Maybe it would help if you posted some EXPLAIN
> ANALYZE output for the different cases, with and without parallelism?
>
> One thing I noticed about this output (from your blog)...
>
> Finalize Aggregate
>  (cost=16536.53..16536.79 rows=1 width=8)
>  (actual time=2263.638..2263.639 rows=1 loops=1)
>    ->  Gather
>    (cost=16461.22..16461.53 rows=3 width=32)
>    (actual time=754.309..757.204 rows=4 loops=1)
>          Number of Workers: 3
>          ->  Partial Aggregate
>          (cost=15461.22..15461.23 rows=1 width=32)
>          (actual time=676.738..676.739 rows=1 loops=4)
>                ->  Parallel Seq Scan on pd
>                (cost=0.00..13856.38 rows=64 width=2311)
>                (actual time=3.009..27.321 rows=42 loops=4)
>                      Filter: (fed_num = 47005)
>                      Rows Removed by Filter: 17341
>  Planning time: 0.219 ms
>  Execution time: 2264.684 ms
>
> ...is that the finalize aggregate phase is estimated to be very cheap,
> but it's actually wicked expensive.  We get the results from the
> workers in only 750 ms, but it takes another second and a half to
> aggregate those 4 rows???

This is probably a vivid example of the bad behaviour of the naive
union approach. If we have worker states 1,2,3,4 and we go

combine(combine(combine(1,2),3),4)

then we get kind of a worst-case complexity situation, where three
times we union an increasingly complex object on the left with a
simpler object on the right. Also, if the objects went into the
transition functions in relatively non-spatially-correlated order, the
polygons coming out of the transition functions could be quite complex,
and each merge would only add complexity to the output until the final
merge, which melts away all the remaining internal boundaries.

I'm surprised it's quite so awful at the end though, and less awful in
the worker stage... how do the workers end up getting rows to work on?
1,2,3,4,1,2,3,4,1,2,3,4? Or 1,1,1,2,2,2,3,3,3,4,4,4? The former could
result in optimally inefficient unions, given spatially correlated
input (surprisingly common in load-once GIS tables).
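
As a toy illustration of the ordering effect (a back-of-envelope model,
nothing to do with real GEOS costs: assume each union costs roughly the
sum of the two input vertex counts, and uncorrelated inputs don't
simplify, so a union's result keeps roughly all its inputs' vertices):

```python
def union_cost(a, b):
    """Toy cost of one union: proportional to total input vertices."""
    return a + b

def left_fold(sizes):
    """Naive accumulation: combine(combine(combine(1,2),3),4)..."""
    total, acc = 0, sizes[0]
    for s in sizes[1:]:
        total += union_cost(acc, s)
        acc += s  # the accumulated geometry keeps growing
    return total

def cascaded(sizes):
    """Balanced merge: union neighbours pairwise, level by level."""
    total = 0
    while len(sizes) > 1:
        nxt = []
        for i in range(0, len(sizes) - 1, 2):
            total += union_cost(sizes[i], sizes[i + 1])
            nxt.append(sizes[i] + sizes[i + 1])
        if len(sizes) % 2:  # odd one out carries to the next level
            nxt.append(sizes[-1])
        sizes = nxt
    return total

sizes = [100] * 1024          # 1024 inputs of 100 vertices each
print(left_fold(sizes))       # grows quadratically with input count
print(cascaded(sizes))        # grows ~ n log n
```

In this model the left-fold does about fifty times the work of the
cascaded merge for 1024 inputs, which is the shape of the effect above.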

P.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Tue, Mar 29, 2016 at 12:51 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> On Tue, Mar 29, 2016 at 12:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
>>> On the join case, I wonder if it's possible that _st_intersects is not
>>> marked parallel-safe?  If that's not the problem, I don't have a
>>> second guess, but the thing to do would be to figure out whether
>>> consider_parallel is false for the RelOptInfo corresponding to either
>>> of pd and pts, or whether it's true for both but false for the
>>> joinrel's RelOptInfo, or whether it's true for all three of them but
>>> you don't get the desired path anyway.
>>
>> _st_intersects is definitely marked parallel safe, and in fact will
>> generate a parallel plan if used alone (without the operator though,
>> it's impossibly slow). It's the && operator that is the issue... and I
>> just noticed that the PROCEDURE bound to the && operator
>> (geometry_overlaps) is *not* marked parallel safe: could be the
>> problem?
>
> Asked and answered: marking the geometry_overlaps as parallel safe
> gets me a parallel plan! Now to play with costs and see how it behaves
> when force_parallel_mode is not set.

For the record I can get a non-forced parallel join plan, *only* if I
reduce the parallel_join_cost by a factor of 10, from 0.1 to 0.01.

http://blog.cleverelephant.ca/2016/03/parallel-postgis-joins.html

This seems non-optimal. No amount of cranking up the underlying
function COST seems to change this, perhaps because the join cost is
entirely based on the number of expected tuples in the join relation?

In general it seems like function COST values have been considered a
relatively unimportant input to planning in the past, but with
parallel processing it seems like they are now much more
determinative of what makes a good plan.

P.



Re: Parallel Queries and PostGIS

From: Amit Kapila
On Fri, Apr 1, 2016 at 12:49 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
> On Tue, Mar 29, 2016 at 12:51 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> > On Tue, Mar 29, 2016 at 12:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> >
> >>> On the join case, I wonder if it's possible that _st_intersects is not
> >>> marked parallel-safe?  If that's not the problem, I don't have a
> >>> second guess, but the thing to do would be to figure out whether
> >>> consider_parallel is false for the RelOptInfo corresponding to either
> >>> of pd and pts, or whether it's true for both but false for the
> >>> joinrel's RelOptInfo, or whether it's true for all three of them but
> >>> you don't get the desired path anyway.
> >>
> >> _st_intersects is definitely marked parallel safe, and in fact will
> >> generate a parallel plan if used alone (without the operator though,
> >> it's impossibly slow). It's the && operator that is the issue... and I
> >> just noticed that the PROCEDURE bound to the && operator
> >> (geometry_overlaps) is *not* marked parallel safe: could be the
> >> problem?
> >
> > Asked and answered: marking the geometry_overlaps as parallel safe
> > gets me a parallel plan! Now to play with costs and see how it behaves
> > when force_parallel_mode is not set.
>
> For the record I can get a non-forced parallel join plan, *only* if I
> reduce the parallel_join_cost by a factor of 10, from 0.1 to 0.01.
>

I think here you mean parallel_tuple_cost.
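
That is, something like:

```sql
SET parallel_tuple_cost = 0.01;  -- default is 0.1
```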
 
>
> http://blog.cleverelephant.ca/2016/03/parallel-postgis-joins.html
>
> This seems non-optimal. No amount of cranking up the underlying
> function COST seems to change this, perhaps because the join cost is
> entirely based on the number of expected tuples in the join relation?
>

Is the function cost not being considered when it appears in a join
clause, or are you saying that it is not considered for any parallel
plan in general?  I think it should be considered when given as a
clause for a single-table scan.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Parallel Queries and PostGIS

From: David Rowley
On 30 March 2016 at 09:14, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 3:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>> I have no idea why the cost adjustments that you need are different
>>> for the scan case and the aggregate case.  That does seem problematic,
>>> but I just don't know why it's happening.
>>
>> What might be a good way to debug it? Is there a piece of code I can
>> look at to try and figure out the contribution of COST in either case?
>
> Well, the cost calculations are mostly in costsize.c, but I dunno how
> much that helps.  Maybe it would help if you posted some EXPLAIN
> ANALYZE output for the different cases, with and without parallelism?
>
> One thing I noticed about this output (from your blog)...
>
> Finalize Aggregate
>  (cost=16536.53..16536.79 rows=1 width=8)
>  (actual time=2263.638..2263.639 rows=1 loops=1)
>    ->  Gather
>    (cost=16461.22..16461.53 rows=3 width=32)
>    (actual time=754.309..757.204 rows=4 loops=1)
>          Number of Workers: 3
>          ->  Partial Aggregate
>          (cost=15461.22..15461.23 rows=1 width=32)
>          (actual time=676.738..676.739 rows=1 loops=4)
>                ->  Parallel Seq Scan on pd
>                (cost=0.00..13856.38 rows=64 width=2311)
>                (actual time=3.009..27.321 rows=42 loops=4)
>                      Filter: (fed_num = 47005)
>                      Rows Removed by Filter: 17341
>  Planning time: 0.219 ms
>  Execution time: 2264.684 ms
>
> ...is that the finalize aggregate phase is estimated to be very cheap,
> but it's actually wicked expensive.  We get the results from the
> workers in only 750 ms, but it takes another second and a half to
> aggregate those 4 rows???

hmm, actually I've just realised that create_grouping_paths() should
be accounting for agg_costs differently depending on whether it's
partial aggregation, finalize aggregation, or just normal aggregation.
count_agg_clauses() needs to be passed the aggregate type information
to allow the walker function to cost the correct portions of the
aggregate, based on what type of aggregation the costs will be used
for.  In short, please don't spend too much time tuning your costs
until I fix this.

As of now the Partial Aggregate is including the cost of the final
function... that's certainly broken, as it does not call that
function.

I will try to get something together over the weekend to fix this, but
I have other work to do until then.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Parallel Queries and PostGIS

From: David Rowley
On 1 April 2016 at 17:12, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 30 March 2016 at 09:14, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Mar 29, 2016 at 3:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>>> I have no idea why the cost adjustments that you need are different
>>>> for the scan case and the aggregate case.  That does seem problematic,
>>>> but I just don't know why it's happening.
>>>
>>> What might be a good way to debug it? Is there a piece of code I can
>>> look at to try and figure out the contribution of COST in either case?
>>
>> Well, the cost calculations are mostly in costsize.c, but I dunno how
>> much that helps.  Maybe it would help if you posted some EXPLAIN
>> ANALYZE output for the different cases, with and without parallelism?
>>
>> One thing I noticed about this output (from your blog)...
>>
>> Finalize Aggregate
>>  (cost=16536.53..16536.79 rows=1 width=8)
>>  (actual time=2263.638..2263.639 rows=1 loops=1)
>>    ->  Gather
>>    (cost=16461.22..16461.53 rows=3 width=32)
>>    (actual time=754.309..757.204 rows=4 loops=1)
>>          Number of Workers: 3
>>          ->  Partial Aggregate
>>          (cost=15461.22..15461.23 rows=1 width=32)
>>          (actual time=676.738..676.739 rows=1 loops=4)
>>                ->  Parallel Seq Scan on pd
>>                (cost=0.00..13856.38 rows=64 width=2311)
>>                (actual time=3.009..27.321 rows=42 loops=4)
>>                      Filter: (fed_num = 47005)
>>                      Rows Removed by Filter: 17341
>>  Planning time: 0.219 ms
>>  Execution time: 2264.684 ms
>>
>> ...is that the finalize aggregate phase is estimated to be very cheap,
>> but it's actually wicked expensive.  We get the results from the
>> workers in only 750 ms, but it takes another second and a half to
>> aggregate those 4 rows???
>
> hmm, actually I've just realised that create_grouping_paths() should
> be accounting agg_costs differently depending if it's partial
> aggregation, finalize aggregation, or just normal. count_agg_clauses()
> needs to be passed the aggregate type information to allow the walker
> function to cost the correct portions of the aggregate correctly based
> on what type of aggregation the costs will be used for.  In short,
> please don't bother to spend too much time tuning your costs until I
> fix this.
>
> As of now the Partial Aggregate is including the cost of the final
> function... that's certainly broken, as it does not call that
> function.
>
> I will try to get something together over the weekend to fix this, but
> I have other work to do until then.

Hi Paul,

As of deb71fa, committed by Robert today, you should have a bit more
control over parallel aggregate costings. You can now raise the
transfn cost, or drop the combinefn cost, to encourage parallel
aggregation. Keep in mind that the serialfn and deserialfn costs are
now accounted for as well.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Parallel Queries and PostGIS

From: Stephen Frost
Paul,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> On Mon, Mar 28, 2016 at 9:45 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Would you agree that it'd be helpful to have for making the st_union()
> > work better in parallel?
>
> For our particular situation w/ ST_Union, yes, it would be ideal to be
> able to run a worker-side combine function as well as the master-side
> one. Although the cascaded union would be less effective spread out
> over N nodes, doing it only once per worker, rather than every N
> records would minimize the loss of effectiveness.

I chatted with Robert a bit about this and he had an interesting
suggestion.  I'm not sure that it would work for you, but the
serialize/deserialize functions are used to transfer the results from
the worker process to the main process.  You could possibly do the
per-worker finalize work in the serialize function to get the benefit of
running that in parallel.

You'll need to mark the aggtranstype as 'internal' to have the
serialize/deserialize code called.  Hopefully that's not too much of an
issue.
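
Roughly like this (the aggregate and support function names here are
made up, but the option names are the real 9.6 CREATE AGGREGATE ones):

```sql
CREATE AGGREGATE my_union(geometry) (
    sfunc        = my_union_transfn,
    stype        = internal,           -- required for serial/deserial
    combinefunc  = my_union_combinefn,
    serialfunc   = my_union_serialfn,  -- do per-worker finalize work here
    deserialfunc = my_union_deserialfn,
    finalfunc    = my_union_finalfn,
    parallel     = safe
);
```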

Thanks!

Stephen

Re: Parallel Queries and PostGIS

From: Paul Ramsey
On Fri, Apr 22, 2016 at 11:44 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Paul,
>
> * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
>> On Mon, Mar 28, 2016 at 9:45 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> > Would you agree that it'd be helpful to have for making the st_union()
>> > work better in parallel?
>>
>> For our particular situation w/ ST_Union, yes, it would be ideal to be
>> able to run a worker-side combine function as well as the master-side
>> one. Although the cascaded union would be less effective spread out
>> over N nodes, doing it only once per worker, rather than every N
>> records would minimize the loss of effectiveness.
>
> I chatted with Robert a bit about this and he had an interesting
> suggestion.  I'm not sure that it would work for you, but the
> serialize/deserialize functions are used to transfer the results from
> the worker process to the main process.  You could possibly do the
> per-worker finalize work in the serialize function to get the benefit of
> running that in parallel.
>
> You'll need to mark the aggtranstype as 'internal' to have the
> serialize/deserialize code called.  Hopefully that's not too much of an
> issue.

Thanks Stephen. We were actually thinking that it might make more
sense to just do the parallel processing in our own threads in the
finalfunc. Not as elegant and magical as bolting into the PgSQL infra,
but if we're doing something hacky anyway, it might as well be our own
hacky.

ATB,
P

>
> Thanks!
>
> Stephen