Thread: increased max_parallel_workers_per_gather results in fewer workers?
Hi all, I’ve been experimenting with some performance tuning on a particular query, and I observed a result that I don’t understand. I’ve been setting max_parallel_workers_per_gather to values in the range 1-6 and then running EXPLAIN ANALYZE to see how much benefit we get from more parallelization. My data is organized by year, so the year is a parameter in the query’s WHERE clause.

For my 2018 data, Postgres launches as many workers as max_parallel_workers_per_gather permits, and the execution time decreases nicely, from 280 seconds with 1 worker all the way down to 141s with 6 workers. So far, so good.

When I run the same query for our 2022 data, I get the same behavior (improvement) for max_parallel_workers_per_gather values of 1-4. But with max_parallel_workers_per_gather set to 5 or 6, Postgres only uses 1 worker, and the execution time increases dramatically, even worse than when I deliberately limit the number of workers to 1 —

- max_parallel_workers_per_gather=1, runtime = 1061s
- max_parallel_workers_per_gather=2, runtime = 770s
- max_parallel_workers_per_gather=3, runtime = 637s
- max_parallel_workers_per_gather=4, runtime = 573s
- max_parallel_workers_per_gather=5, runtime = 1468s
- max_parallel_workers_per_gather=6, runtime = 1469s

Our 2022 data set is several times larger than our 2018 data, so I suspect some resource is getting exhausted, but I’m not sure what. So far, this result has been 100% re-creatable. I’m on a dedicated test server with 16 virtual CPUs and 128GB RAM; no one else is competing with me for Postgres processes. max_worker_processes and max_parallel_workers are both set to 12.

Can anyone help me understand why this happens, or where I might look for clues?

Thanks,
Philip
On Wed, Jun 03, 2020 at 04:04:13PM -0400, Philip Semanchuk wrote:
> Can anyone help me understand why this happens, or where I might look for clues?

What version postgres ?

Can you reproduce if you do:
ALTER SYSTEM SET max_parallel_workers_per_gather=0; SELECT pg_reload_conf();
.. and then within the session do:
SET max_parallel_workers_per_gather=12;

I guess you should show an explain analyze, specifically "Workers
Planned/Launched", maybe by linking to explain.depesz.com

--
Justin
> On Jun 3, 2020, at 5:15 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Wed, Jun 03, 2020 at 04:04:13PM -0400, Philip Semanchuk wrote:
>> Can anyone help me understand why this happens, or where I might look for clues?
>
> What version postgres ?

Sorry, I should have posted that in my initial email.

select version();
+-----------------------------------------------------------------------------+
| version                                                                     |
|-----------------------------------------------------------------------------|
| PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit |
+-----------------------------------------------------------------------------+

This is AWS’ version of Postgres 11.6 (“Aurora”), which of course might make a difference.

> Can you reproduce if you do:
> ALTER SYSTEM SET max_parallel_workers_per_gather=0; SELECT pg_reload_conf();
> .. and then within the session do:
> SET max_parallel_workers_per_gather=12;

Unfortunately under Aurora I’m not superuser so I can’t run ALTER SYSTEM, but I can change the config via AWS’ config interface, so I set max_parallel_workers_per_gather=0 there.

show max_parallel_workers_per_gather
+---------------------------------+
| max_parallel_workers_per_gather |
|---------------------------------|
| 0                               |
+---------------------------------+
SHOW
Time: 0.034s
postgres@philip-2020-05-19-cluster:wylan> SET max_parallel_workers_per_gather=12
SET
Time: 0.028s
postgres@philip-2020-05-19-cluster:wylan> show max_parallel_workers_per_gather
+---------------------------------+
| max_parallel_workers_per_gather |
|---------------------------------|
| 12                              |
+---------------------------------+
SHOW

I then ran the EXPLAIN ANALYZE and got the same slow runtime (1473s) and 1 worker in the EXPLAIN ANALYZE output.
> I guess you should show an explain analyze, specifically "Workers
> Planned/Launched", maybe by linking to explain.depesz.com

Out of an abundance of caution, our company has a policy of not pasting our plans to public servers. However, I can confirm that when I set max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s in the EXPLAIN ANALYZE output:

  Workers Planned: 1
  Workers Launched: 1

FWIW, the Planning Time reported in EXPLAIN ANALYZE output doesn’t vary significantly, only from 411-443ms, and the variation within that range correlates only very weakly with max_parallel_workers_per_gather.

thank you
Philip
On Wed, Jun 03, 2020 at 06:23:57PM -0400, Philip Semanchuk wrote:
> > On Jun 3, 2020, at 5:15 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > What version postgres ?
>
> This is AWS’ version of Postgres 11.6 (“Aurora”) which of course might make a difference.
>
> > I guess you should show an explain analyze, specifically "Workers
> > Planned/Launched", maybe by linking to explain.depesz.com
>
> Out of an abundance of caution, our company has a policy of not pasting our plans to public servers. However, I can confirm that when I set max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s in the EXPLAIN ANALYZE output:
>
>  Workers Planned: 1
>  Workers Launched: 1

Are you referring to a parallel scan/aggregate/hash/??

Are you able to show a plan for a toy query like SELECT count(col) FROM tbl, preferably including a CREATE TABLE tbl AS ...; VACUUM ANALYZE tbl;

Are you able to reproduce with an unpatched postgres ?

--
Justin
Re: increased max_parallel_workers_per_gather results in fewer workers?
From: Sebastian Dressler
Hi Philip,

> On 4. Jun 2020, at 00:23, Philip Semanchuk <philip@americanefficient.com> wrote:
>
>> I guess you should show an explain analyze, specifically "Workers
>> Planned/Launched", maybe by linking to explain.depesz.com
>
> Out of an abundance of caution, our company has a policy of not pasting our plans to public servers. However, I can confirm that when I set max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s in the EXPLAIN ANALYZE output:
>
> Workers Planned: 1
> Workers Launched: 1

Can you please verify the amount of max_parallel_workers and max_worker_processes? It should be roughly max_worker_processes > max_parallel_workers > max_parallel_workers_per_gather, for instance:

max_worker_processes = 24
max_parallel_workers = 18
max_parallel_workers_per_gather = 6

Also, there are more configuration settings related to parallel queries you might want to look into. Most notably:

parallel_setup_cost
parallel_tuple_cost
min_parallel_table_scan_size

Especially the last one is a typical dealbreaker; you can try setting it to 0 for a start. Good starting points for the others are 500 and 0.1 respectively.

> FWIW, the Planning Time reported in EXPLAIN ANALYZE output doesn’t vary significantly, only from 411-443ms, and the variation within that range correlates only very weakly with max_parallel_workers_per_gather.

It can happen that more parallelism does not help the query but instead slows it down beyond a certain number of parallel workers. You can see this in EXPLAIN when there is, for instance, a BITMAP HEAP INDEX SCAN or similar involved.

Cheers,
Sebastian
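The way parallel_setup_cost and parallel_tuple_cost push the planner away from parallel plans can be illustrated with a toy cost sketch (Python, with made-up unit numbers; this is NOT the planner's actual arithmetic, just the shape of the trade-off those two settings encode):

```python
def scan_cost(pages, rows_to_leader, workers,
              seq_page_cost=1.0,
              parallel_setup_cost=1000.0,
              parallel_tuple_cost=0.1):
    """Toy plan-cost model. With workers > 1, page reads are split across
    workers, but the plan pays a one-time worker startup cost plus a
    per-row cost to ship each result row from the workers to the leader."""
    if workers <= 1:
        return seq_page_cost * pages
    return (parallel_setup_cost
            + seq_page_cost * pages / workers
            + parallel_tuple_cost * rows_to_leader)

# Big scan, few result rows: parallelism wins easily.
big_parallel = scan_cost(500_000, 1_000, workers=4)   # 1000 + 125000 + 100
# Small scan shipping many rows: the serial plan is cheaper.
small_serial = scan_cost(2_000, 100_000, workers=1)   # 2000
small_parallel = scan_cost(2_000, 100_000, workers=4) # 1000 + 500 + 10000
```

Lowering parallel_setup_cost and parallel_tuple_cost in this sketch (as Sebastian suggests, to 500 and 0.1) shifts the break-even point so that smaller scans still qualify for a parallel plan.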
Hi,
on top of the settings that Sebastian suggested, you can also try disabling the participation of the leader (i.e. the main backend process for your connection) in the distribution of the parallel workload:
SET parallel_leader_participation TO false
Depending on your workload the leader could be saturated if it has to do a share of the workload and aggregate the results of all the workers.
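That saturation effect can be sketched with a back-of-the-envelope model (Python, hypothetical unit numbers; the real tuple distribution is dynamic, not an even split):

```python
def elapsed(total_work, workers, leader_overhead, leader_participates=True):
    """Toy runtime model for a Gather. total_work units are split evenly
    across participating processes; the leader additionally spends
    leader_overhead units collecting and combining results. The slowest
    process bounds the elapsed time."""
    if leader_participates:
        # The leader takes an even share AND still has to gather results,
        # so its lane is always the critical path here.
        return total_work / (workers + 1) + leader_overhead
    # Leader only gathers; the workers split the scan among themselves.
    return max(total_work / workers, leader_overhead)

# Expensive gathering: excluding the leader shortens the critical path.
# elapsed(600, 5, 50, True)  -> 600/6 + 50 = 150
# elapsed(600, 5, 50, False) -> max(600/5, 50) = 120
```

When gathering is cheap relative to the scan, leader participation wins instead, which is why this is a workload-dependent setting rather than a default.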
Cheers
Luis
On Wed, Jun 03, 2020 at 06:23:57PM -0400, Philip Semanchuk wrote:
> ...
> I then ran the EXPLAIN ANALYZE and got the same slow runtime (1473s) and 1 worker in the EXPLAIN ANALYZE output.
>
>> I guess you should show an explain analyze, specifically "Workers
>> Planned/Launched", maybe by linking to explain.depesz.com
>
> Out of an abundance of caution, our company has a policy of not pasting our plans to public servers. However, I can confirm that when I set max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s in the EXPLAIN ANALYZE output:
>
>  Workers Planned: 1
>  Workers Launched: 1
>
> FWIW, the Planning Time reported in EXPLAIN ANALYZE output doesn’t vary significantly, only from 411-443ms, and the variation within that range correlates only very weakly with max_parallel_workers_per_gather.

Well, that policy is stupid and it makes it unnecessarily harder to answer your questions. We really need to see the plans; it's much harder to give you any advice without them. We can only speculate about what's going on.

It's understandable there may be sensitive information in the plan (parameter values, ...) but that can be sanitized before posting. We need to see plans for the good and bad case, so that we can compare them, look at the plan general shapes, costs, etc.

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jun 4, 2020 at 12:24 AM Philip Semanchuk <philip@americanefficient.com> wrote:
> On Jun 3, 2020, at 5:15 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Wed, Jun 03, 2020 at 04:04:13PM -0400, Philip Semanchuk wrote:
>> Can anyone help me understand why this happens, or where I might look for clues?
>
> What version postgres ?
Sorry, I should have posted that in my initial email.
select version();
+-----------------------------------------------------------------------------+
| version |
|-----------------------------------------------------------------------------|
| PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit |
+-----------------------------------------------------------------------------+
This is AWS’ version of Postgres 11.6 (“Aurora”) which of course might make a difference.
Yes, it definitely makes a difference. For Aurora questions you are more likely to get good answers in the AWS forums rather than the PostgreSQL ones. It's different from PostgreSQL in too many ways, and those differences are not fully known outside of AWS.
> On Jun 4, 2020, at 2:28 AM, Sebastian Dressler <sebastian@swarm64.com> wrote:
>
> Hi Philip,
>
> Can you please verify the amount of max_parallel_workers and max_worker_processes? It should be roughly max_worker_processes > max_parallel_workers > max_parallel_workers_per_gather, for instance:
>
> max_worker_processes = 24
> max_parallel_workers = 18
> max_parallel_workers_per_gather = 6

I changed my settings to these exact values and can still recreate the situation where I unexpectedly get a single worker query.

> Also, there are more configuration settings related to parallel queries you might want to look into. Most notably:
>
> parallel_setup_cost
> parallel_tuple_cost
> min_parallel_table_scan_size
>
> Especially the last one is a typical dealbreaker, you can try to set it to 0 for the beginning. Good starters for the others are 500 and 0.1 respectively.

Aha! By setting min_parallel_table_scan_size=0, Postgres uses the 6 workers I expect, and the execution time decreases nicely.

I posted a clumsily-anonymized plan for the “bad” scenario here --
https://gist.github.com/osvenskan/ea00aa71abaa9697ade0ab7c1f3b705b

There are 3 sort nodes in the plan. When I get the “bad” behavior, the sorts have one worker; when I get the good behavior, they have multiple workers (e.g. 6).
This brings up a couple of questions —

1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —

   max_workers = log3(table size / min_parallel_table_scan_size)

Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?

2) There are 9 tables in this query ranging in size from 72Kb to 17Gb. Does Postgres decide on a table-by-table basis to allocate multiple workers, or…?

Thank you so much for the suggestion, I feel un-stuck now that I have an idea to experiment with.

Cheers
Philip
Re: increased max_parallel_workers_per_gather results in fewer workers?
From: Sebastian Dressler
Hi Philip,
AFAIK, it will do it per-table initially but then the final result depends on the chosen gather node.
On 4. Jun 2020, at 18:41, Philip Semanchuk <philip@americanefficient.com> wrote:
[...]
Also, there are more configuration settings related to parallel queries you might want to look into. Most notably:
parallel_setup_cost
parallel_tuple_cost
min_parallel_table_scan_size
Especially the last one is a typical dealbreaker, you can try to set it to 0 for the beginning. Good starters for the others are 500 and 0.1 respectively.
Aha! By setting min_parallel_table_scan_size=0, Postgres uses the 6 workers I expect, and the execution time decreases nicely.
I posted a clumsily-anonymized plan for the “bad” scenario here --
https://gist.github.com/osvenskan/ea00aa71abaa9697ade0ab7c1f3b705b
There are 3 sort nodes in the plan. When I get the “bad” behavior, the sorts have one worker, when I get the good behavior, they have multiple workers (e.g. 6).
I also think, what Luis pointed out earlier might be a good option for you, i.e. setting
parallel_leader_participation = off;
And by the way, with leader participation enabled, this 1 worker actually turns into 2 contributing processes in total (the worker plus the leader).
This brings up a couple of questions —
1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —
max_workers = log3(table size / min_parallel_table_scan_size)
Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?
"table size" is the number of PostgreSQL pages, i.e. relation size / 8 kB. This comes from statistics.
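That heuristic can be sketched as follows (Python; a simplified reading of compute_parallel_worker() in src/backend/optimizer/path/allpaths.c, not the exact source): the planner considers one worker once the table reaches min_parallel_table_scan_size, adds one more for every tripling in size beyond that, and caps the result at max_parallel_workers_per_gather.

```python
def planned_workers(heap_pages,
                    min_parallel_table_scan_size=1024,   # default 8MB = 1024 x 8kB pages
                    max_parallel_workers_per_gather=2):  # PostgreSQL default
    """Approximate per-table worker count: 0 below the scan-size threshold,
    then one extra worker each time the table triples in size."""
    if heap_pages < min_parallel_table_scan_size:
        return 0
    workers = 1
    threshold = min_parallel_table_scan_size * 3
    while heap_pages >= threshold:
        workers += 1
        threshold *= 3
    return min(workers, max_parallel_workers_per_gather)

# A 17GB table is ~2.2M 8kB pages, which would justify ~7 workers
# before the per-gather cap kicks in; a 72kB table gets none.
```

This also shows why setting min_parallel_table_scan_size=0 is such a blunt but effective lever: with a zero threshold, even small tables clear the eligibility check.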
2) There are 9 tables in this query ranging in size from 72Kb to 17Gb. Does Postgres decide on a table-by-table basis to allocate multiple workers, or…?
Thank you so much for the suggestion, I feel un-stuck now that I have an idea to experiment with.
You are welcome; we are actually about to publish a blog post with some more suggestions for parallelism.
Cheers,
Sebastian
> On Jun 4, 2020, at 1:45 PM, Sebastian Dressler <sebastian@swarm64.com> wrote:
>
> Hi Philip,
>
> I also think what Luis pointed out earlier might be a good option for you, i.e. setting
>
> parallel_leader_participation = off;
>
> And by the way, this 1 worker turns actually into 2 workers in total with leader participation enabled.

I’ll try that out, thanks.

>> This brings up a couple of questions —
>> 1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —
>>
>> max_workers = log3(table size / min_parallel_table_scan_size)
>>
>> Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?
>
> "table size" is the number of PostgreSQL pages, i.e. relation size / 8 kB. This comes from statistics.

OK, so it sounds like the planner does *not* use the values in pg_stats when planning workers, true?

I’m still trying to understand one thing I’ve observed. I can run the query that produced the plan in the gist I linked to above with max_parallel_workers_per_gather=6 and the year param = 2018, and I get 6 workers. When I set the year param = 2022 I get only one worker. Same tables, same query, different parameter. That suggests to me that the planner is using pg_stats when allocating workers, but I can imagine there might be other things going on that I don’t understand. (I haven’t ruled out that this might be an AWS-specific quirk, either.)

Cheers
Philip
Re: increased max_parallel_workers_per_gather results in fewer workers?
From: Sebastian Dressler
Hi Philip,
On 4. Jun 2020, at 20:37, Philip Semanchuk <philip@americanefficient.com> wrote:
[...]
This brings up a couple of questions —
1) I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —
max_workers = log3(table size / min_parallel_table_scan_size)
Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?
"table size" is the number of PostgreSQL pages, i.e. relation size / 8 kB. This comes from statistics.
OK, so it sounds like the planner does *not* use the values in pg_stats when planning workers, true?
Full disclosure: I am not too deep into these internals; likely others on the list know much more about it. But with respect to the relation size, I think this is tracked elsewhere. It might be affected by other parameters, though, such as vacuuming, and probably the estimated fraction of the table that is scanned also plays a role.
I’m still trying to understand one thing I’ve observed. I can run the query that produced the plan in the gist I linked to above with max_parallel_workers_per_gather=6 and the year param = 2018, and I get 6 workers. When I set the year param=2022 I get only one worker. Same tables, same query, different parameter. That suggests to me that the planner is using pg_stats when allocating workers, but I can imagine there might be other things going on that I don’t understand. (I haven’t ruled out that this might be an AWS-specific quirk, either.)
I think it would be helpful if you could post both plans again. The ideal would be to use https://explain.dalibo.com/ and share the links. You will have to generate them in JSON format, but you can still anonymize them.

Obviously, the plan changes when changing these two parameters; comparing both plans very likely unveils why that is the case. My guess would be that something in the estimated number of rows changes, causing PG to prefer a different plan with lower cost.

Also, maybe on that occasion, check the default_statistics_target parameter, which defaults to 100; for analytical cases like (I assume) yours, higher values tend to improve the planning. You can try, for instance, 1000 or 2500. In contrast to changing this parameter globally, you can also adjust it per column (ALTER TABLE ... ALTER COLUMN ... SET STATISTICS).
Cheers,
Sebastian
--
Sebastian Dressler, Solution Architect
+49 30 994 0496 72 | sebastian@swarm64.com
Swarm64 AS
Parkveien 41 B | 0258 Oslo | Norway
Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender (Styrets Leder): Dr. Sverre Munck
Swarm64 AS Zweigstelle Hive
Ullsteinstr. 120 | 12109 Berlin | Germany
Registered at Amtsgericht Charlottenburg - HRB 154382 B
> On Jun 4, 2020, at 3:03 PM, Sebastian Dressler <sebastian@swarm64.com> wrote:
>
> Hi Philip,
>
>> On 4. Jun 2020, at 20:37, Philip Semanchuk <philip@americanefficient.com> wrote:
>> [...]
>> OK, so it sounds like the planner does *not* use the values in pg_stats when planning workers, true?
>
> Full disclosure: I am not too deep into these internals, likely others on the list know much more about it. But with respect to the relation size, I think this is tracked elsewhere, it might be affected by other parameters though like vacuuming and probably, the estimated amount of how much of the table is scanned also plays a role.

I’m not too familiar with the internals either, but if I interpret this line of code correctly, it seems that pg_stats is not involved, and the worker allocation is based strictly on pages in the relation --
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/allpaths.c#L800

That means I still don’t have a reason for why this query gets a different number of workers depending on the WHERE clause, but I can experiment with that more on my own.

>> I’m still trying to understand one thing I’ve observed. I can run the query that produced the plan in the gist I linked to above with max_parallel_workers_per_gather=6 and the year param = 2018, and I get 6 workers. When I set the year param = 2022 I get only one worker. Same tables, same query, different parameter. That suggests to me that the planner is using pg_stats when allocating workers, but I can imagine there might be other things going on that I don’t understand. (I haven’t ruled out that this might be an AWS-specific quirk, either.)
>
> I think it would be helpful if you could post both plans again. The ideal would be to use https://explain.dalibo.com/ and share the links. You will have to generate them in JSON format, but can still anonymize them.

I really appreciate all the help you and others have already given. I think I’m good for now.

Thank you so much,
Philip