Home > mailing lists

Re: pg12 partitions show bad performance vs pg96 - Mailing list pgsql-performance

From	Mariel Cherkassky
Subject	Re: pg12 partitions show bad performance vs pg96
Date	March 9, 2020 13:15:50
Msg-id	CA+t6e1mx_KiR735zbPFmkG-HBmLOOGRX3-ZMWNok7F-ieTdB3A@mail.gmail.com Whole thread Raw
In response to	Re: pg12 partitions show bad performance vs pg96 (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: pg12 partitions show bad performance vs pg96 (Mariel Cherkassky <mariel.cherkassky@gmail.com>)
List	pgsql-performance

Tree view

8 ms seems pretty slow to planning that query. Does the planning time
drop if you execute this multiple times in the same session? Does the
time change if you try again without any foreign keys?

No one is using the system besides me, therefore after running the query one time

most of the data is in the cache... If I run it multiple times the query time is reduced :

Planning Time: 0.361 ms
Execution Time: 0.110 ms

Can you share the results of that?

Sure. I did the same procedure but this time I inserted 100m records instead of 10m. This time the results were by far worse in pg12 :

PG12 :

postgres=# explain analyze select * from iot_data da,iot_device de where de.name in ('50a') and de.id = da.device and da.metadata=50 and lower(da.data) like '50';

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1002.26..1563512.35 rows=10 width=44) (actual time=95161.056..95218.764 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=2.26..1562511.35 rows=4 width=44) (actual time=94765.860..94765.861 rows=0 loops=3)
Hash Cond: (da_2.device = de.id)
-> Parallel Append (cost=0.00..1562506.90 rows=814 width=37) (actual time=94765.120..94765.120 rows=0 loops=3)
-> Parallel Seq Scan on iot_data_2 da_2 (cost=0.00..584064.14 rows=305 width=37) (actual time=36638.829..36638.829 rows=0 loops=3)
Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
Rows Removed by Filter: 12460009
-> Parallel Seq Scan on iot_data_1 da_1 (cost=0.00..504948.69 rows=262 width=36) (actual time=43990.427..43990.427 rows=0 loops=2)
Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
Rows Removed by Filter: 16158316
-> Parallel Seq Scan on iot_data_0 da (cost=0.00..473490.00 rows=247 width=37) (actual time=86396.665..86396.665 rows=0 loops=1)
Filter: ((metadata = 50) AND (lower(data) ~~ '50'::text))
Rows Removed by Filter: 30303339
-> Hash (cost=2.25..2.25 rows=1 width=7) (never executed)
-> Seq Scan on iot_device de (cost=0.00..2.25 rows=1 width=7) (never executed)
Filter: (name = '50a'::text)
Planning Time: 45.724 ms
Execution Time: 95252.712 ms
(20 rows)

PG96 :

The most basic guidelines for table partitioning are, don't partition
your tables unless it's a net win. If partitioning was always
faster, we'd just have designed Postgres to implicitly partition all
of your tables for you. There are some other guidelines in [1].

Isnt creating partition should increase the execution time ? I mean, instead of running on a table with 10m records, I can run over a partition with 3m records. isnt less data means better performance for simple queries like the one I used ?

I read the best practice for the docs, and I think that I met most of them - I chose the right partition key(in this case it was device),

Regarding the amount of partitions - I choose 3 just to test the results. I didnt create a lot of partitions, and my logic tells me that querying a table with 3m records should be faster than 10m records.. Am I missing something ?

pgsql-performance by date:

From: Justin Pryzby
Date: 09 March 2020, 06:10:05
Subject: Re: pg12 partitions show bad performance vs pg96

From: Mariel Cherkassky
Date: 09 March 2020, 13:31:15
Subject: Re: pg12 partitions show bad performance vs pg96

Re: pg12 partitions show bad performance vs pg96 - Mailing list pgsql-performance

Previous

Next