Re: Parallel Queries and PostGIS - Mailing list pgsql-hackers

From Paul Ramsey
Subject Re: Parallel Queries and PostGIS
Msg-id CACowWR2xM-f3V1k-4pCB8GgqTykbZYMTL6YhEw4SNWPaoQzqSg@mail.gmail.com
In response to Re: Parallel Queries and PostGIS  (Paul Ramsey <pramsey@cleverelephant.ca>)
Responses Re: Parallel Queries and PostGIS  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Mar 29, 2016 at 12:51 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> On Tue, Mar 29, 2016 at 12:48 PM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
>>> On the join case, I wonder if it's possible that _st_intersects is not
>>> marked parallel-safe?  If that's not the problem, I don't have a
>>> second guess, but the thing to do would be to figure out whether
>>> consider_parallel is false for the RelOptInfo corresponding to either
>>> of pd and pts, or whether it's true for both but false for the
>>> joinrel's RelOptInfo, or whether it's true for all three of them but
>>> you don't get the desired path anyway.
>>
>> _st_intersects is definitely marked parallel safe, and in fact will
>> generate a parallel plan if used alone (without the operator though,
>> it's impossibly slow). It's the && operator that is the issue... and I
>> just noticed that the PROCEDURE bound to the && operator
>> (geometry_overlaps) is *not* marked parallel safe: could be the
>> problem?
>
> Asked and answered: marking the geometry_overlaps as parallel safe
> gets me a parallel plan! Now to play with costs and see how it behaves
> when force_parallel_mode is not set.

For the record, I can get a non-forced parallel join plan *only* if I
reduce parallel_tuple_cost by a factor of 10, from 0.1 to 0.01.

http://blog.cleverelephant.ca/2016/03/parallel-postgis-joins.html
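
For anyone wanting to reproduce this, the session below is a rough sketch of the steps described in this thread, not the exact commands that were run. The table names pd and pts come from the quoted discussion; the column name geom, the count(*) query shape, and the worker count of 4 are placeholders. It also assumes that the setting with a default of 0.1 is parallel_tuple_cost, and that geometry_overlaps takes two geometry arguments.

```sql
-- Mark the function behind the && operator as parallel safe
-- (signature assumed to be geometry_overlaps(geometry, geometry)).
ALTER FUNCTION geometry_overlaps(geometry, geometry) PARALLEL SAFE;

-- Allow parallel workers; this is the setting name in released 9.6 and later.
SET max_parallel_workers_per_gather = 4;

-- Lower the per-tuple worker-to-leader transfer cost from its default of 0.1.
SET parallel_tuple_cost = 0.01;

-- Hypothetical join; "geom" is a placeholder column name.
EXPLAIN
SELECT count(*)
FROM pd
JOIN pts ON ST_Intersects(pd.geom, pts.geom);
```

If the plan contains a Gather node above the join without force_parallel_mode being set, the planner chose the parallel path on cost alone.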

This seems non-optimal. No amount of cranking up the underlying
function COST seems to change this, perhaps because the join cost is
entirely based on the number of expected tuples in the join relation?
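
The COST experiment can be sketched as follows. The value 1000 is an arbitrary illustration, not a recommendation, and the two-geometry signature of _st_intersects is an assumption.

```sql
-- Raise the planner's per-call cost estimate (in units of cpu_operator_cost).
ALTER FUNCTION _st_intersects(geometry, geometry) COST 1000;

-- Check what the catalog now records for cost and parallel safety.
SELECT proname, procost, proparallel
FROM pg_proc
WHERE proname IN ('_st_intersects', 'geometry_overlaps');
```

Re-running EXPLAIN on the join after each change shows whether the function cost moves the plan choice at all.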

In general, it seems like function COST values have been considered a
relatively unimportant input to planning in the past, but with
parallel processing they now seem to have much more influence on
which plan is chosen.

P.


