Re: [HACKERS] Runtime Partition Pruning - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [HACKERS] Runtime Partition Pruning |
Date | |
Msg-id | CAKJS1f8dq1BcvNEMTitp3O7Gc_cN1Dmn9vxxacb-kiX+mmyGSg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Runtime Partition Pruning (Jesper Pedersen <jesper.pedersen@redhat.com>) |
Responses |
Re: [HACKERS] Runtime Partition Pruning
Re: [HACKERS] Runtime Partition Pruning Re: [HACKERS] Runtime Partition Pruning |
List | pgsql-hackers |
On 10 March 2018 at 07:50, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > On 03/01/2018 08:29 PM, David Rowley wrote: >> I'll delay releasing a new patch as there's some discussion over on >> the faster partition pruning thread which affects this too [1] >> >> [1] >> https://www.postgresql.org/message-id/CA+Tgmoa4D1c4roj7L8cx8gkkeBWAZD=MTcXKxTwBnsLRHD3rig@mail.gmail.com >> > > Sure, 0003-0005 depends on that thread. 0002 is refactoring so that one is > ready. Okay, back to this again. The faster partition pruning patch has undergone a large refactor which cause me to delay basing this patch on top of it until that patch has stabilised again. It's now getting close, so this seems like a good time to send a new verison of this. > In 0004 should we add a HASH based test case, I don't think we really need to test each partition type in too much detail in the run-time pruning patch. I'm more interested in testing multiple partition levels there's important new code in this patch which does need lots of additional testing for multi-level partitions. I think adding a HASH partition case would be more aiming to test the planner's pruning code, which this does not make any behavioural changes to. > Also, should 0004 consider the "Parallel Append" case, aka There is a parallel Append test case in there already, see the queries below the "-- parallel append" comment. I've attached a new version of the patch. I'm now at v18 after having some versions of the patch that I didn't release which were based on various versions of Amit's faster partition pruning patch. The attached patchset is based on Amit's v45 faster partition pruning [1]. I've made a few changes since the v14 version. Since Amit's v45 patch now creates the partition pruning details in a data structure that can be copied from the planner over to the executor, it means this patch is now able to do much of the setup work for run-time pruning in the planner. Doing this now allows us to determine if run-time pruning is even possible at plan time, rather than the executor attempting it and sometimes wasting effort when it failed to find Params matching the partition key. Another change from the last version is that I've separated out the handling of exec Params and external Params. The new patch now will perform a pruning step at executor startup if some external params match the partition key. This is very beneficial to a PREPAREd OLTP type query against a partitioned table as it means we can skip sub-plan initialisation for all non-matching partitions. Initialising Append/MergeAppend/ModifyTable nodes with fewer subnodes than were originally planned did require a small change in explain.c in some code that was assuming the subnode arrays were the same length as the subplan list. I also ended up adding a Int property to EXPLAIN to show the number of subnodes that have been removed during execution. Unfortunately, there is a small corner case with this in the case where all subnodes are removed it leaves EXPLAIN with no subnodes to search for outer side Vars in. I didn't really see any sane way to handle this in EXPLAIN, so I think the best fix for this is what I've done, and that's just to always initalise at least 1 subnode, even when none match the external Params. Cost-wise, I don't think this is such a big deal as the cost savings here are coming from saving on initalising ten's or hundreds of subnodes, not 1. To put the new patch to the test, I tried pgbench -S -M prepared -s 100 with and without having modified pgbench_accounts to separate into 10 RANGE partitions of equal size. A non-partitioned table was getting 12503 TPS. With partitioned tables, the old version of this patch was getting: 5470 TPS. With partitioned tables, the attached version gets 11247 TPS. For perspective, today's master with a partitioned table gets 4719 TPS. So you can see it's a pretty good performance boost by skipping initialisation of the 9 non-matching subplans. It's not hard to imagine the gains getting more significant with a larger number of partitions. Ideally, the performance of a partitioned table would be faster than the non-partitioned case, but please remember this query is a single row PK lookup SELECT, so is a very fast query in both cases. Partitioning cases should improve more as the table grows and indexes struggle to fit in RAM, or when many rows are being taken from the partition and being aggregated. Unfortunately, the story is not quite as good for the non -M prepared version of the benchmark. This shows that planning time of partitioned table queries could still use some improvements. Amit's v45 patch certainly makes a dent in the planner slow performance here, but it's still nothing like as fast as the non-partitioned case. More work is required there in the future. This patchset also contains a patch to improve the performance of inheritance planning of UPDATE/DELETE queries. This patch also implements run-time pruning for UPDATE/DELETE too. This also has a significant performance improvement for planning of UPDATE/DELETE operations on partitioned tables with a large number of partitions, performance is as follows: Values in transactions per second. Partitions = 1 Unpatched: 7323.3 Patched: 6573.2 (-10.24%) Partitions = 2 Unpatched: 6784.8 Patched: 6377.1 (-6.01%) Partitions = 4 Unpatched: 5903.0 Patched: 6106.8 (3.45%) Partitions = 8 Unpatched: 4582.0 Patched: 5579.9 (21.78%) Partitions = 16 Unpatched: 3131.5 Patched: 4521.2 (44.38%) Partitions = 32 Unpatched: 1779.8 Patched: 3387.8 (90.35%) Partitions = 64 Unpatched: 821.9 Patched: 2245.4 (173.18%) Partitions = 128 Unpatched: 322.2 Patched: 1319.6 (309.56%) Partitions = 256 Unpatched: 84.3 Patched: 731.7 (768.27%) Partitions = 512 Unpatched: 22.5 Patched: 382.8 (1597.74%) Partitions = 1024 Unpatched: 5.5 Patched: 150.1 (2607.83%) This shows a small regression in planner performance of 10% for a table with 1 partition, but performance starts improving again by just 4 partitions. By the time we get to 1024 partitions the patched planner is 26 times faster than unpatched. It's likely not a no-brainer since some people may get a small performance regression, but it certainly makes having larger numbers of partitions much more usable than it was previously. There are still a few bugs lingering in v45 of Amit's faster partition pruning patch. One of which is causing a regression test to fail in the attached patch set. I've also attached a patch which must be applied after Amit's v45 patchset to fixup a couple of things missing. Hopefully, this won't be needed once v46 is out. To test this apply all patches in [1], then apply faster_part_prune_v45_fixups.patch, then the attached 0001-0005 patches. [1] https://www.postgresql.org/message-id/fc73cef4-6879-26c3-6859-2f910640234a@lab.ntt.co.jp -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- faster_part_prune_v45_fixups.patch
- v18-0001-Provide-infrastructure-to-allow-partition-prunin.patch
- v18-0002-Add-bms_prev_member-function.patch
- v18-0003-Allow-Append-subnodes-to-be-pruned-during-execut.patch
- v18-0004-Allow-MergeAppend-s-subnodes-to-be-pruned-during.patch
- v18-0005-Improve-planning-speed-of-partitioned-table-UPDA.patch
pgsql-hackers by date: