Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: Ordered Partitioned Table Scans |
Date | |
Msg-id | 29278.1540998105@localhost Whole thread Raw |
In response to | Re: Ordered Partitioned Table Scans (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: Ordered Partitioned Table Scans
|
List | pgsql-hackers |
David Rowley <david.rowley@2ndquadrant.com> wrote: > On 31 October 2018 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote: > >>> On 28 October 2018 at 03:49, Julien Rouhaud <rjuju123@gmail.com> wrote: > >> I've registered as a reviewer. I still didn't have a deep look at > >> the patch yet, but thanks a lot for working on it! > > > > Thanks for signing up to review. I need to send another revision of > > the patch to add a missing call to truncate_useless_pathkeys(). Will > > try to do that today. > > I've attached a patch that ... I've picked this one when looking around what I could review. * As for the logic, I found generate_mergeappend_paths() to be the most interesting part: Imagine table partitioned by "i", so "partition_pathkeys" is {i}. partition 1: i | j --+-- 0 | 0 1 | 1 0 | 1 1 | 0 partition 2: i | j --+-- 3 | 0 2 | 0 2 | 1 3 | 1 Even if "pathkeys" is {i, j}, i.e. not contained in "partition_pathkeys", the ordering of the subpaths should not change the way tuples are split into partitions. Obviously a problem is if "partition_pathkeys" and "pathkeys" lists start with different items. To propose more generic rule, I used this example of range-partitioned table, where "i" and "j" are the partitioning keys: partition 1: i | j | k ---+---+--- 0 | 0 | 1 0 | 0 | 0 partition 2: i | j | k ---+---+--- 0 | 1 | 0 0 | 1 | 1 If the output "pathkey" is {i, k}, then the Append path makes rows of both partitions interleave: i | j | k ---+---+--- 0 | 0 | 0 0 | 1 | 0 0 | 0 | 1 0 | 1 | 1 So in general I think the restriction is that no valid position of "pathkeys" and "partition_pathkeys" may differ. Or in other words: the shorter of the 2 pathkey lists must be contained in the longer one. Does it make sense to you? Another problem I see is that build_partition_pathkeys() continues even if it fails to create a pathkey for some partitioning column. In the example above it would mean that the table can have "partition_pathkeys" equal to {j} instead of {i, j} on some EC-related conditions. However such a key does not correspond to reality - this is easier to imagine if another partition is considered. partition 3: i | j | k ---+---+--- 1 | 0 | 1 1 | 0 | 0 So I think no "partition_pathkeys" should be generated in that case. On the other hand, if the function returned the part of the list it could construct so far, it'd be wrong because such incomplete pathkeys could pass the checks I proposed above for reasons unrelated to the partitioning scheme. The following comments are mostly on coding: * Both qsort_partition_list_value_cmp() and qsort_partition_rbound_cmp() have this sentence in the header comment: Note: If changing this, see build_partition_pathkeys() However I could not find other use besides that in RelationBuildPartitionDesc(). * create_append_path(): /* * Apply query-wide LIMIT if known and path is for sole base relation. * (Handling this at this low level is a bit klugy.) */ if (root != NULL && pathkeys != NULL && bms_equal(rel->relids, root->all_baserels)) pathnode->limit_tuples = root->limit_tuples; else pathnode->limit_tuples = -1.0; I think this optimization is not specific to AppendPath / MergeAppendPath, so it could be moved elsewhere (as a separate patch of course). But specifically for AppendPath, why do we have to test pathkeys? The pathkeys of the AppendPath do not necessarily describe the order of the set to which LIMIT is applied, so their existence should not be important here. * If pathkeys is passed, shouldn't create_append_path() include the cost_sort() of subpaths just like create_merge_append_path() does? And if so, then create_append_path() and create_merge_append_path() might eventually have some common code (at least for the subpath processing) to be put into a separate function. * Likewise, create_append_plan() / create_merge_append_plan() are going to be more similar then before, so some refactoring could also make sense. Although it's not too much code, I admit the patch is not trivial, so I'm curious about your opinion. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
pgsql-hackers by date: