Re: pg12 partitions show bad performance vs pg96 - Mailing list pgsql-performance

From Mariel Cherkassky
Subject Re: pg12 partitions show bad performance vs pg96
Date
Msg-id CA+t6e1nrXp6HW8iaAmKCY=xn8hCw6yS32nW8WJGb+4FEX07zjQ@mail.gmail.com
Whole thread Raw
In response to Re: pg12 partitions show bad performance vs pg96  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: pg12 partitions show bad performance vs pg96  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-performance
OK so I found the problem but other problem appeared.
I found out that the pg12 machine had some limits on the vm settings in aspect of cpu and memory. Now both machines are exactly the same in aspect of all hardware and dont have any limit.
CPU - 8
RAM - 32GB.

I tested it with cold cache :
service postgresql stop;
echo 1 > /proc/sys/vm/drop_caches;
service postgresql start;
psql -d postgres -U postgres;

I used two simples queries, one that implicitly comparing the partition key with a const value and another one that joins other table by the partition column(and in this query the problem).

The first query : results are better with pg12 : 
explain analyze select * from iot_data where device=51;

PG96
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Seq Scan on iot_data  (cost=0.00..2083334.60 rows=976667 width=37) (actual time=1.560..67144.164 rows=1010315 loops=1)
   Filter: (device = 51)
   Rows Removed by Filter: 98989685
 Planning time: 9.219 ms
 Execution time: 67,228.431 ms
(5 rows)



PG12 - 3 PARTITIONS
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..514086.40 rows=1027284 width=37) (actual time=3.871..15022.118 rows=1010315 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on iot_data_0  (cost=0.00..410358.00 rows=428035 width=37) (actual time=1.670..14815.480 rows=336772 loops=3)
         Filter: (device = 51)
         Rows Removed by Filter: 9764341
 Planning Time: 27.292 ms
 Execution Time: 15085.317 ms
(8 rows)

The second query with pg12 : 
QUERY : explain analyze select * from iot_data da,iot_device de where de.name in ('50a') and de.id = da.device and da.metadata=50 and lower(da.data) like '50';  

PG96

postgres=# explain analyze select * from iot_data da,iot_device de where de.name in ('50a') and de.id = da.device and da.metadata=50 and lower(da.data) like '50';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2583361.51 rows=20 width=44) (actual time=44894.312..44894.312 rows=0 loops=1)
   Join Filter: (da.device = de.id)
   ->  Seq Scan on iot_device de  (cost=0.00..2.25 rows=1 width=7) (actual time=0.018..0.028 rows=1 loops=1)
         Filter: (name = '50a'::text)
         Rows Removed by Filter: 99
   ->  Seq Scan on iot_data da  (cost=0.00..2583334.84 rows=1954 width=37) (actual time=44894.279..44894.279 rows=0 loops=1)
         Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
         Rows Removed by Filter: 100000000
 Planning time: 11.313 ms
 Execution time: 44894.357 ms
(10 rows)



PG12 - 3 PARTITIONS

                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1002.26..1563512.35 rows=10 width=44) (actual time=22306.091..22309.209 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=2.26..1562511.35 rows=4 width=44) (actual time=22299.412..22299.413 rows=0 loops=3)
         Hash Cond: (da_2.device = de.id)
         ->  Parallel Append  (cost=0.00..1562506.90 rows=814 width=37) (actual time=22299.411..22299.411 rows=0 loops=3)
               ->  Parallel Seq Scan on iot_data_2 da_2  (cost=0.00..584064.14 rows=305 width=37) (actual time=9076.535..9076.535 rows=0 loops=3)
                     Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
                     Rows Removed by Filter: 12460009
               ->  Parallel Seq Scan on iot_data_1 da_1  (cost=0.00..504948.69 rows=262 width=36) (actual time=10296.751..10296.751 rows=0 loops=2)
                     Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
                     Rows Removed by Filter: 16158316
               ->  Parallel Seq Scan on iot_data_0 da  (cost=0.00..473490.00 rows=247 width=37) (actual time=19075.081..19075.081 rows=0 loops=1)
                     Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
                     Rows Removed by Filter: 30303339
         ->  Hash  (cost=2.25..2.25 rows=1 width=7) (never executed)
               ->  Seq Scan on iot_device de  (cost=0.00..2.25 rows=1 width=7) (never executed)
                     Filter: (name = '50a'::text)
 Planning Time: 30.429 ms
 Execution Time: 22309.364 ms
(20 rows)

I tried disabling max_parallel_workers_gathers but It just decreased the db`s performance.
Now regarding the main issue here - as u can see when I used the second query, I didnt mentioned the partition column specificly but I joined another table based on it( where de.name in ('50a') and de.id = da.device)
This condition should be enough for the optimizer to understand that it needs to scan a specific partition but it scans all the partitions. The "never executed" tag isnt added to the partitions scans but it is added to the joined table.

Justin - Regarding adding index on the parittion column - I dont understand why ? the value in that column is the same for all rows in the partition, when exactly the index will be used ?

pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: pg12 partitions show bad performance vs pg96
Next
From: Mariel Cherkassky
Date:
Subject: Re: pg12 partitions show bad performance vs pg96