Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [HACKERS] [POC] hash partitioning |
Date | |
Msg-id | CAFjFpRfgVF98W5KuOFGWnJNfBuDNa=TYLpZRdrX4riM8u+QE5A@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] [POC] hash partitioning (amul sul <sulamul@gmail.com>) |
Responses |
Re: [HACKERS] [POC] hash partitioning
Re: [HACKERS] [POC] hash partitioning |
List | pgsql-hackers |
On Wed, May 3, 2017 at 6:39 PM, amul sul <sulamul@gmail.com> wrote: > On Thu, Apr 27, 2017 at 1:42 AM, Robert Haas <robertmhaas@gmail.com> wrote: > >> >> This is not yet a detailed review - I may be missing things, and >> review and commentary from others is welcome. If there is no major >> disagreement with the idea of moving forward using Amul's patch as a >> base, then I will do a more detailed review of that patch (or, >> hopefully, an updated version that addresses the above comments). > I agree that Amul's approach makes dump/restore feasible whereas Nagata-san's approach makes that difficult. That is a major plus point about Amul's patch. Also, it makes it possible to implement Nagata-san's syntax, which is more user-friendly in future. Here are some review comments after my initial reading of Amul's patch: Hash partitioning will partition the data based on the hash value of the partition key. Does that require collation? Should we throw an error/warning if collation is specified in PARTITION BY clause? + int *indexes; /* Partition indexes; in case of hash + * partitioned table array length will be + * value of largest modulus, and for others + * one entry per member of the datums array + * (plus one if range partitioned table) */ This may be rewritten as "Partition indexes: For hash partitioned table the number of indexes will be same as the largest modulus. For list partitioned table the number of indexes will be same as the number of datums. For range partitioned table the number of indexes will be number of datums plus one.". You may be able to reword it to a shorter version, but essentially we will have separate description for each strategy. I guess, we need to change the comments for the other members too. For example "datums" does not contain tuples with key->partnatts attributes for hash partitions. It contains a tuple with two attributes, modulus and remainder. We may not want to track null_index separately since rows with NULL partition key will fit in the partition corresponding to the hash value of NULL. OR may be we want to set null_index to partition which contains NULL values, if there is a partition created for corresponding remainder, modulus pair and set has_null accordingly. Accordingly we will need to update the comments. cal_hash_value() may be renamed as calc_has_value() or compute_hash_value()? Should we change the if .. else if .. construct in RelationBuildPartitionDesc() to a switch case? There's very less chance that we will support a fourth partitioning strategy, so if .. else if .. may be fine. + int mod = hbounds[i]->modulus, + place = hbounds[i]->remainder; Although there are places in the code where we separate variable declaration with same type by comma, most of the code declares each variable with the data type on separate line. Should variable "place" be renamed as "remainder" since that's what it is ultimately? RelationBuildPartitionDesc() fills up mapping array but never uses it. In this code the index into mapping array itself is the mapping so it doesn't need to be maintained separately like list partiioning case. Similary next_index usage looks unnecessary, although that probably improves readability, so may be fine. + * for p_p1: satisfies_hash_partition(2, 1, pkey, value) + * for p_p2: satisfies_hash_partition(4, 2, pkey, value) + * for p_p3: satisfies_hash_partition(8, 0, pkey, value) + * for p_p4: satisfies_hash_partition(8, 4, pkey, value) What the function builds is satisfies_hash_partition(2, 1, pkey). I don't see code to add value as an argument to the function. Is that correct? + int modulus = DatumGetInt32(datum); May be you want to rename this variable to greatest_modulus like in the other places. + Assert(spec->modulus > 0 && spec->remainder >= 0); I liked this assertion. Do you want to add spec->modulus > spec->reminder also here? + char *strategy; /* partitioning strategy + ('hash', 'list' or 'range') */ We need the second line to start with '*' +-- check validation when attaching list partitions Do you want to say "hash" instead of "list" here? I think we need to explain the reasoning behind this syntax somewhere as a README or in the documentation or in the comments. Otherwise it's difficult to understand how various pieces of code are related. This is not full review. I am still trying to understand how the hash partitioning implementation fits with list and range partitioning. I am going to continue to review this patch further. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
pgsql-hackers by date: