Re: Choosing parallel_degree - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Choosing parallel_degree
Msg-id CANP8+jJThmrUTu6nAKbyrzeFjG86Nj0zYdL8n07Yc3Qr_cg3jQ@mail.gmail.com
In response to Re: Choosing parallel_degree  (Paul Ramsey <pramsey@cleverelephant.ca>)
List pgsql-hackers
On 8 April 2016 at 17:27, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
 
PostGIS is just one voice...

We're listening.
 
>> Functions have very unequal CPU costs, and we're talking here about
>> using CPUs more effectively, why are costs being given the see-no-evil
>> treatment? This is as true in core as it is in PostGIS, even if our
>> case is a couple orders of magnitude more extreme: a filter based on a
>> complex combination of regex queries will use an order of magnitude
>> more CPU than one that does a little math, why plan and execute them
>> like they are the same?
>
> Functions have user assignable costs.

We have done a relatively bad job of globally costing our functions
thus far, because it mostly didn't make any difference. In my testing
[1], I found that costing could push better plans for parallel
sequential scans and parallel aggregates, though at very extreme cost
values (1000 for sequential scans and 10000 for aggregates).

Obviously, if costs can make a difference for 9.6 and parallelism
we'll rigorously ensure we have good, useful costs. I've already
costed many functions in my parallel postgis test branch [2]. Perhaps
the avoidance of cost so far is based on the relatively nebulous
definition it has: about the only thing in the docs is "If the cost is
not specified, 1 unit is assumed for C-language and internal
functions, and 100 units for functions in all other languages. Larger
values cause the planner to try to avoid evaluating the function more
often than necessary."
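
To make the quoted default concrete, here is a rough sketch in Python
rather than anything taken from the PostgreSQL source. It assumes what the
CREATE FUNCTION documentation states elsewhere, that a function's cost is
expressed in units of cpu_operator_cost (default 0.0025). The helper names
are hypothetical, and the value 1000 is the extreme setting from the
testing mentioned above.

```python
# Illustrative model only; these names are hypothetical, not PostgreSQL APIs.
CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default for cpu_operator_cost


def default_function_cost(language: str) -> int:
    """Default COST when none is specified, per the documentation quoted above."""
    return 1 if language in ("c", "internal") else 100


def per_row_filter_cost(function_cost: float,
                        cpu_operator_cost: float = CPU_OPERATOR_COST) -> float:
    """Planner cost units charged per row for one function call (assumed model)."""
    return function_cost * cpu_operator_cost


# An uncosted C function versus the same function costed at 1000.
cheap = per_row_filter_cost(default_function_cost("c"))
costed = per_row_filter_cost(1000)
ratio = costed / cheap
```

Under this model an uncosted C function and a multiplication look
identical to the planner, which is the gap the questions below are about.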

So what about C functions then? Should a string comparison be 5 and a
multiplication 1? An image histogram 1000?

We don't have a clear methodology for how to do this.

It's a single parameter to allow you to achieve the plans that work optimally. Hopefully that is simple enough for everyone to use and yet flexible enough to make a difference.

If it's not what you need, show us and it may make the case for change.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
