Re: increased max_parallel_workers_per_gather results in fewer workers? - Mailing list pgsql-performance

From Philip Semanchuk
Subject Re: increased max_parallel_workers_per_gather results in fewer workers?
Date
Msg-id 89423FD3-0F13-447D-8D9E-EB1722150F94@americanefficient.com
In response to Re: increased max_parallel_workers_per_gather results in fewer workers?  (Sebastian Dressler <sebastian@swarm64.com>)
Responses Re: increased max_parallel_workers_per_gather results in fewer workers?  (Sebastian Dressler <sebastian@swarm64.com>)
List pgsql-performance

> On Jun 4, 2020, at 2:28 AM, Sebastian Dressler <sebastian@swarm64.com> wrote:
>
> Hi Philip,
>
>> On 4. Jun 2020, at 00:23, Philip Semanchuk <philip@americanefficient.com> wrote:
>>
>>> I guess you should show an explain analyze, specifically "Workers
>>> Planned/Launched", maybe by linking to explain.depesz.com
>>
>> Out of an abundance of caution, our company has a policy of not pasting our plans to public servers. However, I can
>> confirm that when I set max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s in the EXPLAIN
>> ANALYZE output:
>>
>>        Workers Planned: 1
>>        Workers Launched: 1
>
> Can you please verify the values of max_parallel_workers and max_worker_processes? It should be roughly
> max_worker_processes > max_parallel_workers > max_parallel_workers_per_gather, for instance:
>
> max_worker_processes = 24
> max_parallel_workers = 18
> max_parallel_workers_per_gather = 6


I changed my settings to these exact values and can still recreate the situation where I unexpectedly get a single-worker query.
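
For anyone who wants to double-check what the server is actually running with, pg_settings shows the live values; something like this works (nothing here is specific to our setup):

    -- show the live values of the three worker-related settings
    SELECT name, setting
    FROM pg_settings
    WHERE name IN ('max_worker_processes',
                   'max_parallel_workers',
                   'max_parallel_workers_per_gather');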


> Also, there are more configuration settings related to parallel queries you might want to look into. Most notably:
>
> parallel_setup_cost
> parallel_tuple_cost
> min_parallel_table_scan_size
>
> Especially the last one is a typical dealbreaker; you can try setting it to 0 to begin with. Good starting values for
> the others are 500 and 0.1, respectively.

Aha! By setting min_parallel_table_scan_size=0, Postgres uses the 6 workers I expect, and the execution time decreases nicely.
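
In case it's useful to others, I tried it at the session level first, along these lines (0 disables the size threshold entirely, so I'm treating it as a diagnostic rather than a permanent setting; the query itself is elided):

    -- session-level only; does not persist beyond this connection
    SET min_parallel_table_scan_size = 0;
    EXPLAIN ANALYZE SELECT ... ;  -- then check Workers Planned/Launched in the output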

I posted a clumsily anonymized plan for the “bad” scenario here --
https://gist.github.com/osvenskan/ea00aa71abaa9697ade0ab7c1f3b705b

There are 3 sort nodes in the plan. When I get the “bad” behavior, the sorts have one worker; when I get the good behavior, they have multiple workers (e.g. 6).

This brings up a couple of questions —
1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —

   max_workers = log3(table size / min_parallel_table_scan_size)

Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?
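
If that formula is right, then with the default min_parallel_table_scan_size of 8MB, the biggest table in this query (17GB, see question 2) would come out to roughly:

   log3(17 GB / 8 MB) = log3(~2176) ≈ 7

i.e. about 7 workers considered before the max_parallel_workers_per_gather cap applies, while anything smaller than 8MB wouldn't be considered for a parallel scan at all.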

2) There are 9 tables in this query ranging in size from 72kB to 17GB. Does Postgres decide on a table-by-table basis to allocate multiple workers, or…?
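
Possibly related: I noticed the heuristic can be overridden with a per-table storage parameter, which at least hints that the decision is made table by table. For illustration (the table name is just a stand-in):

    -- override the planner's worker estimate for scans of this one table
    ALTER TABLE some_big_table SET (parallel_workers = 6);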

Thank you so much for the suggestion; I feel un-stuck now that I have an idea to experiment with.

Cheers
Philip



