Re: partitioning performance tests after recent patches - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: partitioning performance tests after recent patches |
Date | |
Msg-id | 1c190271-526b-d4df-ca98-6f9011ca89f6@lab.ntt.co.jp Whole thread Raw |
In response to | partitioning performance tests after recent patches (Floris Van Nee <florisvannee@Optiver.com>) |
Responses |
Re: partitioning performance tests after recent patches
|
List | pgsql-hackers |
Hi, Thanks a lot for very exhaustive testing. David already replied to some points, but let me comment on a couple of points. Please be advised that these may not be the numbers (or scalability pattern of these numbers) you'll see when PG 12 is actually released, because we may end up changing something that makes performance suffer a bit. In particular, we are contemplating some changes around the safety of planner's handling of cached partitioning metadata (in light of reduced lock levels for adding/removing partitions) that might reduce the TPS figure, the impact of which would worsen as the number of partitions increases. Although, nothing is final yet; if interested, you can follow that discussion at [1]. On 2019/04/15 4:19, Floris Van Nee wrote: > The test cases were (see benchmark.sql for the SQL commands for setup and test cases): > 1. Insert batches of 1000 rows per transaction > 2. Simple SELECT query pruning on a static timestamp > 3. The same SELECT query with static timestamp but with an added 'ORDER BY a, updated_at DESC LIMIT 1', which matches theindex defined on the table > 4. The same simple SELECT query, but now it's wrapped inside an inlineable SQL function, called with a static timestamp > 5. The same simple SELECT query, but now it's wrapped inside a non-inlineable SQL function, called with a static timestamp > 6. The same simple SELECT query, but now it's wrapped inside a plpgsql function, called with a static timestamp > 7. Simple SELECT query pruning on a timestamp now() > 8. The same SELECT query with dynamic timestamp but with an added 'ORDER BY a, updated_at DESC LIMIT 1', which matchesthe index defined on the table > 9. The same simple SELECT query, but now it's wrapped inside an inlineable SQL function, called with a dynamic timestamp > 10. The same simple SELECT query, but now it's wrapped inside a non-inlineable SQL function, called with a dynamic timestamp > 11. The same simple SELECT query, but now it's wrapped inside a plpgsql function, called with a dynamic timestamp > 12. The same query as 2) but then in an inlineable function > 13. The same query as 3) but then in an inlineable function > 14. A SELECT with nested loop (10 iterations) with opportunities for run-time pruning - some rows from a table are selectedand the timestamp from rows in that table is used to join on another partitioned table > > The full results can be found in the attached file (results.txt). I also produced graphs of the results, which can be foundon TimescaleDb's Github page [1]. Please take a look at these figures for an easy overview of the results. In generalperformance of HEAD looks really good. > > While looking at these results, there were a few questions that I couldn't answer. > 1) It seems like the queries inside plpgsql functions (case 6 and 11) perform relatively well in PG11 compared to a non-inlineableSQL function (case 5 and 10), when a table consists of many partitions. As far as I know, both plpgsql andnon-inlineable SQL functions are executed with generic plans. What can explain this difference? Are non-inlineable SQLfunction plans not reused between transactions, while plpgsql plans are? > 2) Is running non-inlined SQL functions with a generic plan even the best option all the time? Wouldn't it be better toadopt a similar approach to what plpgsql does, where it tries to test if using a generic plan is beneficial? The non-inlineableSQL functions suffer a big performance hit for a large number of partitions, because they cannot rely on staticplanning-time pruning. I'd never noticed this before. It indeed seems to be the case that SQL functions and plpgsql functions are handled using completely different code paths, of which only for the latter it's possible to use static planning-time pruning. > There was one other thing I noticed, and I believe it was raised by Tom in a separate thread as well: the setup code itselfis really slow. Creating of partitions is taking a long time (it's taking several minutes to create the 4096 partitiontable). Yeah that's rather bad. Thinking of doing something about that for PG 13. Thanks, Amit [1] https://www.postgresql.org/message-id/27380.1555270166%40sss.pgh.pa.us
pgsql-hackers by date: