Thread: Slow statement using parallelism after 9.6>11 upgrade

Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Hi list

We have upgraded our database from 9.6 to 11 (and updated PostGIS from 
2.3 to 2.5 as well).
We are using it among other things to store an OpenStreetMap database, 
populated by the osm2pgsql program and updated on a daily basis.

The process used to take ~1h30 minutes before the upgrade, but after the 
upgrade it looks like it will never run to completion in a 24h time frame.

The problem is apparently that the planner choses to run some statements 
in parallel, but these take a lot longer than when run without it.
Here are the to explain analyze of the same statement, with parallelism 
on and off :

Statement :
PREPARE mark_ways_by_node(int8) AS select id from planet_osm_ways WHERE 
nodes && ARRAY[$1];
EXPLAIN EXECUTE mark_ways_by_node(1);


max_parallel_workers_per_gather = 2 :

Gather  (cost=12545.61..2357352.25 rows=1420982 width=8) (actual 
time=2.577..64.028 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Bitmap Heap Scan on planet_osm_ways 
(cost=11545.61..2214254.05 rows=592076 width=8) (actual 
time=0.213..0.213 rows=0 loops=3)
         Recheck Cond: (nodes && '{1}'::bigint[])
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
(cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.173..0.174 
rows=1 loops=1)
               Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 8.596 ms
Execution Time: 64.135 ms


max_parallel_workers_per_gather = 0 :

Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..3462154.46 
rows=1420982 width=8) (actual time=0.677..0.679 rows=1 loops=1)
   Recheck Cond: (nodes && '{1}'::bigint[])
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
(cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 
rows=1 loops=1)
         Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 0.193 ms
Execution Time: 0.773 ms


So this Parallel Bitmap Heap Scan seems to be quite problematic here.
What runtime setting should I change to fix this, without losing the 
benefit of parallelism ?

Thanks a lot !

Cheers
--
Arnaud



Re: Slow statement using parallelism after 9.6>11 upgrade

From
Tom Lane
Date:
"Arnaud L." <arnaud.listes@codata.eu> writes:
> We have upgraded our database from 9.6 to 11 (and updated PostGIS from 
> 2.3 to 2.5 as well).
> ...

Have you re-ANALYZEd the database?  The problem with this query
seems to be the spectacularly awful rowcount estimate here:

>    ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
> (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 
> rows=1 loops=1)
>          Index Cond: (nodes && '{1}'::bigint[])

The planner should be able to do better than that, given up-to-date
statistics on the "nodes" column.

            regards, tom lane



Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 03/09/2019 à 15:43, Tom Lane a écrit :
> "Arnaud L." <arnaud.listes@codata.eu> writes:
>> We have upgraded our database from 9.6 to 11 (and updated PostGIS from 
>> 2.3 to 2.5 as well).
>> ...
> 
> Have you re-ANALYZEd the database?  The problem with this query
> seems to be the spectacularly awful rowcount estimate here:

You mean after the upgrade process ? Yes I have.
I've juste re-run "ANALYZE table" to rule this out, estimates are the same.

Maybe some statistic target problem ? Estimated number of rows is 
284.196.352
Also, this is a GIN index on a bigint[] column.

I've setup parallel_tuple_cost to 1.0 parallel_setup_cost to 5000.0 for 
the time being which solves this specific problem. These value don't 
look very sensible though, they are very high compared to the default ones.

Cheers
--
Arnaud



Re: Slow statement using parallelism after 9.6>11 upgrade

From
Paul Ramsey
Date:

> On Sep 3, 2019, at 7:06 AM, Arnaud L. <arnaud.listes@codata.eu> wrote:
>
> Le 03/09/2019 à 15:43, Tom Lane a écrit :
>> "Arnaud L." <arnaud.listes@codata.eu> writes:
>>> We have upgraded our database from 9.6 to 11 (and updated PostGIS from 2.3 to 2.5 as well).
>>> ...
>> Have you re-ANALYZEd the database?  The problem with this query
>> seems to be the spectacularly awful rowcount estimate here:
>
> You mean after the upgrade process ? Yes I have.
> I've juste re-run "ANALYZE table" to rule this out, estimates are the same.
>
> Maybe some statistic target problem ? Estimated number of rows is 284.196.352
> Also, this is a GIN index on a bigint[] column.
>
> I've setup parallel_tuple_cost to 1.0 parallel_setup_cost to 5000.0 for the time being which solves this specific
problem.These value don't look very sensible though, they are very high compared to the default ones. 

You can also leave that setting unchanged and just change the behaviour on your one table:

ALTER TABLE nodes SET ( parallel_workers = 0);

P.

>
> Cheers
> --
> Arnaud
>
>




Re: Slow statement using parallelism after 9.6>11 upgrade

From
Imre Samu
Date:
> We have upgraded our database from 9.6 to 11 

This is now the latest PG ?   PG11.5?
( for example - in PG11.5 fixed:  "Fix failure to resolve deadlocks involving multiple parallel worker processes"
> populated by the osm2pgsql program and updated on a daily basis.
> What runtime setting should I change to fix this, without losing the  benefit of parallelism ?

- the osm2pgsql has an own parallelizations ...  ( osm2pgsql  --number-processes .. ) 
   so be careful to add more parallelisation to the PG side with the same osm2pgsql parameters !   ( check the memory usages! )
- check the benchmarks and the tunnings:  https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks
- you can ask help on the  : "osm dev mail list" (="dev OpenStreetMap developer discusssion" https://lists.openstreetmap.org/listinfo/dev )

Imre




Arnaud L. <arnaud.listes@codata.eu> ezt írta (időpont: 2019. szept. 3., K, 14:11):
Hi list

We have upgraded our database from 9.6 to 11 (and updated PostGIS from
2.3 to 2.5 as well).
We are using it among other things to store an OpenStreetMap database,
populated by the osm2pgsql program and updated on a daily basis.

The process used to take ~1h30 minutes before the upgrade, but after the
upgrade it looks like it will never run to completion in a 24h time frame.

The problem is apparently that the planner choses to run some statements
in parallel, but these take a lot longer than when run without it.
Here are the to explain analyze of the same statement, with parallelism
on and off :

Statement :
PREPARE mark_ways_by_node(int8) AS select id from planet_osm_ways WHERE
nodes && ARRAY[$1];
EXPLAIN EXECUTE mark_ways_by_node(1);


max_parallel_workers_per_gather = 2 :

Gather  (cost=12545.61..2357352.25 rows=1420982 width=8) (actual
time=2.577..64.028 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Bitmap Heap Scan on planet_osm_ways
(cost=11545.61..2214254.05 rows=592076 width=8) (actual
time=0.213..0.213 rows=0 loops=3)
         Recheck Cond: (nodes && '{1}'::bigint[])
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on planet_osm_ways_nodes_idx
(cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.173..0.174
rows=1 loops=1)
               Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 8.596 ms
Execution Time: 64.135 ms


max_parallel_workers_per_gather = 0 :

Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..3462154.46
rows=1420982 width=8) (actual time=0.677..0.679 rows=1 loops=1)
   Recheck Cond: (nodes && '{1}'::bigint[])
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on planet_osm_ways_nodes_idx
(cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268
rows=1 loops=1)
         Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 0.193 ms
Execution Time: 0.773 ms


So this Parallel Bitmap Heap Scan seems to be quite problematic here.
What runtime setting should I change to fix this, without losing the
benefit of parallelism ?

Thanks a lot !

Cheers
--
Arnaud


Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 03/09/2019 à 16:39, Paul Ramsey a écrit :
>> On Sep 3, 2019, at 7:06 AM, Arnaud L. <arnaud.listes@codata.eu> wrote:
>> I've setup parallel_tuple_cost to 1.0 parallel_setup_cost to 5000.0 for the time being which solves this specific
problem.These value don't look very sensible though, they are very high compared to the default ones.
 
> 
> You can also leave that setting unchanged and just change the behaviour on your one table:
> 
> ALTER TABLE nodes SET ( parallel_workers = 0);

Nice, this will fix this precise query and leave parallelism in place 
for the other ones. Thanks !

--
Arnaud



Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 03/09/2019 à 17:02, Imre Samu a écrit :
>  > We have upgraded our database from 9.6 to 11
> 
> This is now the latest PG ?   PG11.5?

Yes, PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.


> - the osm2pgsql has an own parallelizations ...  ( osm2pgsql  
> --number-processes .. )
>     so be careful to add more parallelisation to the PG side with the 
> same osm2pgsql parameters !   ( check the memory usages! )
> - check the benchmarks and the tunnings: 
> https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks
> - you can ask help on the  : "osm dev mail list" (="dev OpenStreetMap 
> developer discusssion" https://lists.openstreetmap.org/listinfo/dev )


Yes, I know. It as been fine tuned over some time now to fit our 
server's capacity.
Disabling parallelisme has allowed the update to run in ~5000s, which is 
consistent with what we could witness before the upgrade.

I had started writing an issue on the osm2pgsql github before realizing 
it was probably a PG-only issue.


Cheers
--
Arnaud




Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 03/09/2019 à 15:43, Tom Lane a écrit :
> "Arnaud L." <arnaud.listes@codata.eu> writes:
>> We have upgraded our database from 9.6 to 11 (and updated PostGIS from 
>> 2.3 to 2.5 as well).
>> ...
> 
> Have you re-ANALYZEd the database?  The problem with this query
> seems to be the spectacularly awful rowcount estimate here:
> 
>>    ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
>> (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 
>> rows=1 loops=1)
>>          Index Cond: (nodes && '{1}'::bigint[])
> 
> The planner should be able to do better than that, given up-to-date
> statistics on the "nodes" column.


Tom, I can confirm that with up to date statistics the planner is still 
lost.
I did a REINDEX to rule out a broken index and the estimate is still in 
the 100k+ range.


Regards
--
Arnaud



Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 04/09/2019 à 09:04, Arnaud L. a écrit :
> Tom, I can confirm that with up to date statistics the planner is still
> lost.
> I did a REINDEX to rule out a broken index and the estimate is still in
> the 100k+ range.

Sorry, I meant 1M+ range.

EXPLAIN ANALYZE select id from planet_osm_ways WHERE nodes && 
ARRAY[123456789::bigint];

(parallel_workers = 0 on the table as per Paul's recommandation) :

Bitmap Heap Scan on planet_osm_ways  (cost=11582.45..3535447.30 
rows=1419000 width=8) (actual time=0.198..0.199 rows=1 loops=1)
   Recheck Cond: (nodes && '{123456789}'::bigint[])
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
(cost=0.00..11227.70 rows=1419000 width=0) (actual time=0.151..0.151 
rows=1 loops=1)
         Index Cond: (nodes && '{123456789}'::bigint[])
Planning Time: 0.260 ms
Execution Time: 0.249 ms


Regards
--
Arnaud



Re: Slow statement using parallelism after 9.6>11 upgrade

From
"Arnaud L."
Date:
Le 04/09/2019 à 09:04, Arnaud L. a écrit :
> Le 03/09/2019 à 15:43, Tom Lane a écrit :
>> The planner should be able to do better than that, given up-to-date
>> statistics on the "nodes" column.
> 
> Tom, I can confirm that with up to date statistics the planner is still
> lost.
> I did a REINDEX to rule out a broken index and the estimate is still in
> the 100k+ range.


Setting STATISTICS target to 10.000 on the nodes column and re-analyzing 
did not help either. Estimates are still ~1.4M rows when result is 1 row.

Regards
--
Arnaud



Bad estimates on GIN bigint[] index

From
"Arnaud L."
Date:
Le 03/09/2019 à 15:43, Tom Lane a écrit :
> "Arnaud L." <arnaud.listes@codata.eu> writes:
>>    ->  Bitmap Index Scan on planet_osm_ways_nodes_idx 
>> (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 
>> rows=1 loops=1)
>>          Index Cond: (nodes && '{1}'::bigint[])
> 
> The planner should be able to do better than that, given up-to-date
> statistics on the "nodes" column.


Sorry to up this thread, but is there anything I can do to help the 
planner in this particular case ?
REINDEXing did not help, nor did ANALYZEing with different STATISTICS 
target for this specific column.


Regards
--
Arnaud