Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: Ordered Partitioned Table Scans |
Date | |
Msg-id | 4591.1541063138@localhost Whole thread Raw |
In response to | Re: Ordered Partitioned Table Scans (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: Ordered Partitioned Table Scans
|
List | pgsql-hackers |
David Rowley <david.rowley@2ndquadrant.com> wrote: > On 1 November 2018 at 04:01, Antonin Houska <ah@cybertec.at> wrote: > > * As for the logic, I found generate_mergeappend_paths() to be the most > > interesting part: > > > > Imagine table partitioned by "i", so "partition_pathkeys" is {i}. > > > > partition 1: > > > > i | j > > --+-- > > 0 | 0 > > 1 | 1 > > 0 | 1 > > 1 | 0 > > > > partition 2: > > > > i | j > > --+-- > > 3 | 0 > > 2 | 0 > > 2 | 1 > > 3 | 1 > > > > Even if "pathkeys" is {i, j}, i.e. not contained in "partition_pathkeys", the > > ordering of the subpaths should not change the way tuples are split into > > partitions. > > > > ... > > I understand what you're saying. I just don't understand what you > think is wrong with the patch in this area. I think these conditions are too restrictive: /* * Determine if these pathkeys match the partition order, or reverse * partition order. It can't match both, so only go to the trouble of * checking the reverse order when it's not in ascending partition * order. */ partition_order = pathkeys_contained_in(pathkeys, partition_pathkeys); partition_order_desc = !partition_order && pathkeys_contained_in(pathkeys, partition_pathkeys_desc); In the example above I wanted to show that your new feature should work even if "pathkeys" is not contained in "partition_pathkeys". > > Another problem I see is that build_partition_pathkeys() continues even if it > > fails to create a pathkey for some partitioning column. In the example above > > it would mean that the table can have "partition_pathkeys" equal to {j} > > instead of {i, j} on some EC-related conditions. However such a key does not > > correspond to reality - this is easier to imagine if another partition is > > considered. > > > > partition 3: > > > > i | j | k > > ---+---+--- > > 1 | 0 | 1 > > 1 | 0 | 0 > > > > So I think no "partition_pathkeys" should be generated in that case. On the > > other hand, if the function returned the part of the list it could construct > > so far, it'd be wrong because such incomplete pathkeys could pass the checks I > > proposed above for reasons unrelated to the partitioning scheme. > > Oops. That's a mistake. We should return what we have so far if we > can't make one of the pathkeys. Will fix. I think no "partition_pathkeys" should be created in this case, but before we can discuss this in detail there needs to be an agreement on the evaluation of the relationship between "pathkeys" and "partition_pathkeys", see above. > > The following comments are mostly on coding: > > > > * Both qsort_partition_list_value_cmp() and qsort_partition_rbound_cmp() have > > this sentence in the header comment: > > > > Note: If changing this, see build_partition_pathkeys() > > > > However I could not find other use besides that in > > RelationBuildPartitionDesc(). > > While the new code does not call those directly, the new code does > depend on the sort order of the partitions inside the PartitionDesc, > which those functions are responsible for. Perhaps there's a better > way to communicate that. I pointed this out because I suspect that changes of these functions would affect more features, not only the one you're trying to implement. > > * If pathkeys is passed, shouldn't create_append_path() include the > > cost_sort() of subpaths just like create_merge_append_path() does? And if > > so, then create_append_path() and create_merge_append_path() might > > eventually have some common code (at least for the subpath processing) to be > > put into a separate function. > > It does. It's just done via the call to cost_append(). ok, I missed that. > > * Likewise, create_append_plan() / create_merge_append_plan() are going to be > > more similar then before, so some refactoring could also make sense. > > > > Although it's not too much code, I admit the patch is not trivial, so I'm > > curious about your opinion. > > I think the costing code is sufficiently different to warant not > sharing more. For example, the startup costing is completely > different. Append can start on the startup cost of the first subpath, > but MergeAppend takes the sum of the startup cost of all subpaths. ok -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
pgsql-hackers by date: