Re: [HACKERS] path toward faster partition pruning - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [HACKERS] path toward faster partition pruning |
Date | |
Msg-id | 0df782e5-e7a3-d19d-c172-1e73f9cd9424@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] path toward faster partition pruning (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: [HACKERS] path toward faster partition pruning
Re: [HACKERS] path toward faster partition pruning |
List | pgsql-hackers |
Hi David. On 2017/11/08 13:44, David Rowley wrote: > On 7 November 2017 at 01:52, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Thanks. I'll look over it all again starting my Tuesday morning. (UTC+13) > > Hi Amit, > > I had another look over this today. Apologies if any of the review seems petty. Thanks a lot for the review. > Here goes: > > 1. If test seems to be testing for a child that's a partitioned table, > but is testing for a non-NULL part_scheme. > > /* > * If childrel is itself partitioned, add it and its partitioned > * children to the list being propagated up to the root rel. > */ > if (childrel->part_scheme && rel->part_scheme) > > Should this code use IS_PARTITIONED_REL() instead? Seems a bit strange > to test for a NULL part_scheme I guess that makes sense, done. > > 2. There's a couple of mistakes in my bms_add_range() code. I've > attached bms_add_range_fix.patch. Can you apply this to your tree? Thanks. I have used your bms_add_range_v2.patch that you sent earlier today and listed both your and Horiguchi-san's names as author. > 3. This assert seems to be Asserting the same thing twice: > > Assert(rel->live_partitioned_rels != NIL && > list_length(rel->live_partitioned_rels) > 0); > > A List with length == 0 is always NIL. You're right. I changed it to: Assert(list_length(rel->live_partitioned_rels) >= 1); > 4. get_partitions_from_clauses(), can you comment why you perform the > list_concat() there. > > I believe this is there so that the partition bound from the parent is > passed down to the child so that we can properly eliminate all child > partitions when the 2nd level of partitioning is using the same > partition key as the 1st level. I think this deserves a paragraph of > comment to explain this. Yes, that's the intent. I implemented it as a solution to fix a problem that was reported upthread, whereby the default partition pruning didn't work as desired. I tried to explain it in the following email: https://www.postgresql.org/message-id/8499324c-8a33-4be7-9d23-7e6a95e60ddf%40lab.ntt.co.jp But, since I devised it as a solution to get the desired behavior for the default partition, I modified the code to include partition constraint clauses to do it only when the table has a default partition in the first place. Doing it always is an overkill. Please see the comment added nearby if it now helps make sense of what's going on. > 5. Please add a comment to explain what's going on here in > classify_partition_bounding_keys() > > if (partattno == 0) > { > partexpr = lfirst(partexprs_item); > partexprs_item = lnext(partexprs_item); > } > > Looks like, similar to index expressions, that partition expressions > are attno 0 to mark to consume the next expression from the list. Right. > Does this need validation that there are enough partexprs_item items > like what is done in get_range_key_properties()? Or is this validated > somewhere else? Yeah, I added the check as follows: + /* Set partexpr if needed. */ if (partattno == 0) { + if (partexprs_item == NULL) + elog(ERROR, "wrong number of partition key expressions"); partexpr = lfirst(partexprs_item); partexprs_item = lnext(partexprs_item); > > 6. Comment claims the if test will test something which it does not > seem to test for: > > /* > * Redundant key elimination using btree-semantics based tricks. > * > * Only list and range partitioning use btree operator semantics, so > * skip otherwise. Also, if there are expressions whose value is yet > * unknown, skip this step, because we need to compare actual values > * below. > */ > memset(keyclauses, 0, PARTITION_MAX_KEYS * sizeof(List *)); > if (partkey->strategy == PARTITION_STRATEGY_LIST || > partkey->strategy == PARTITION_STRATEGY_RANGE) > > I was expecting this to be skipped when the clauses contained a > non-const, but it does not seem to. Fixed the comment. Actually we might end up with non-consts here if executor invokes it, so the downstream code is in position to handle them, skipping any optimizations that depend on constant values being available. There are actually even cases when the planner wouldn't mind calling here even if the matched clauses contained non-const operands as long as there are at least some constants available. > 7. Should be "compare them" > > /* > * If the leftarg and rightarg clauses' constants are both of the type > * expected by "op" clause's operator, then compare then using the > * latter's comparison function. > */ > > But if I look at the code "compare then using the latter's comparison > function." is not true, it seems to use op's comparison function not > rightarg's. With most of the calls op and rightarg are the same, but > not all of them. The function shouldn't make that assumption even if > the args op was always the same as rightarg. Rearranged the code in partition_cmp_args() a bit and added more clarifying comments. > 8. remove_redundant_clauses() needs an overview comment of what the > function does. Done. > 9. The comment should explain what we would do in the case of key < 3 > AND key <= 2 using some examples. > > /* try to keep only one of <, <= */ Done. > 10. Wondering why this loop runs backward? > > for (s = BTMaxStrategyNumber; --s >= 0;) > > Why not just: > > for (s = 0; s < BTMaxStrategyNumber; s++) > > I can't see a special reason for it to run backward. It seems unusual, > but if there's a good reason that I've failed to realise then it's > maybe worth a comment. Hmm, no special reason. So, done the other way. I actually brought this redundant key logic elimination logic over from nbtutils.c: _bt_preprocess_keys() and the loop runs that way over there. > 11. Pleae comment on why *constfalse = true is set here: > > if (!chk || s == (BTEqualStrategyNumber - 1)) > continue; > > if (partition_cmp_args(partopfamily, partopcintype, chk, eq, chk, > &test_result)) > { > if (!test_result) > { > *constfalse = true; > return; > } > /* discard the redundant key. */ > xform[s] = NULL; > } > > Looks like we'd hit this in a case such as: WHERE key = 1 AND key > 1. Right. Added a comment with example. > Also please add a comment when discarding the redundant key maybe > explain that equality is more useful than the other strategies when > there's an overlap. Done, too. Please find attached updated patch set. There are significant changes in this version as described below, including the support for hash partitioned tables. Earlier today, I reported [1] what looks to me like a bug in how default partition's constraint gets generated and how that sometimes makes constraint exclusion mistakenly prune a default partition. I have included the patches I posted there in this series too. They are ahead in the list. So attached patches are now as follows: 0001-Add-default-partition-case-in-inheritance-testing.patch 0002-Tweak-default-range-partition-s-constraint-a-little.patch Patches at [1]. 0003-Add-new-tests-for-partition-pruning.patch Tests. Mostly unchanged from the last version. 0004-Add-a-bms_add_range.patch Uses the bms_add_range_v2 patch. 0005-Planner-side-changes-for-partition-pruning.patch Fixed some issues with how OR clauses were matched and the code for checking if the operator is partitioning compatible is no longer in the planner code, instead it's now only in partition.c. Other cosmetic improvements including those that resulted from the review comments. 0006-Implement-get_partitions_from_clauses.patch Several bug fixes and many cosmetic improvements including improved commentary. Per Amul's comment upthread, the patch no longer depends on using value -1 to denote an invalid value of NulltestType enum. In fact, it doesn't use NulltestType type variables at all. 0007-Some-interface-changes-for-partition_bound_-cmp-bsea.patch No changes. 0008-Implement-get_partitions_for_keys.patch Support for hash partitioning and tests for the same. Also, since update/delete on partitioned tables still depend on constraint exclusion for pruning, fix things such that get_relation_constraints includes partition constraints in its result only for non-select queries (for selects we have the new pruning code). Other bug fixes. Thanks, Amit [1] https://www.postgresql.org/message-id/ba7aaeb1-4399-220e-70b4-62eade1522d0%40lab.ntt.co.jp
Attachment
- 0001-Add-default-partition-case-in-inheritance-testing.patch
- 0002-Tweak-default-range-partition-s-constraint-a-little.patch
- 0003-Add-new-tests-for-partition-pruning-v12.patch
- 0004-Add-a-bms_add_range-v12.patch
- 0005-Planner-side-changes-for-partition-pruning-v12.patch
- 0006-Implement-get_partitions_from_clauses-v12.patch
- 0007-Some-interface-changes-for-partition_bound_-cmp-bsea-v12.patch
- 0008-Implement-get_partitions_for_keys-v12.patch
pgsql-hackers by date: