Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] [POC] hash partitioning
Date
Msg-id CAFjFpRdeESKFkVGgmOdYvmD3d56-58c5VCBK0zDRjHrkq_VcNg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [POC] hash partitioning  (amul sul <sulamul@gmail.com>)
Responses Re: [HACKERS] [POC] hash partitioning  (amul sul <sulamul@gmail.com>)
Re: [HACKERS] [POC] hash partitioning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, May 17, 2017 at 12:04 AM, amul sul <sulamul@gmail.com> wrote:
> On Tue, May 16, 2017 at 10:00 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> On Tue, May 16, 2017 at 4:22 PM, amul sul <sulamul@gmail.com> wrote:
>>> v6 patch has bug in partition oid mapping and indexing, fixed in the
>>> attached version.
>>>
>>> Now partition oids will be arranged in the ascending order of hash
>>> partition bound  (i.e. modulus and remainder sorting order)
>>
>> Thanks for the update patch. I have some more comments.
>>
>> ------------
>> + if (spec->remainder < 0)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
>> +  errmsg("hash partition remainder must be less than modulus")));
>>
>> I think this error message is not correct, you might want to change it
>> to "hash partition remainder must be non-negative integer"
>>
>
> Fixed in the attached version;  used "hash partition remainder must be
> greater than or equal to 0" instead.

I would suggest "non-zero positive", since that's what we are using in
the documentation.

>
>> -------
>>
>> +         The table is partitioned by specifying remainder and modulus for each
>> +         partition. Each partition holds rows for which the hash value of
>>
>> Wouldn't it be better to say "modulus and remainder" instead of
>> "remainder and modulus" then it will be consistent?
>>
>
> You are correct, fixed in the attached version.
>
>> -------
>> +       An <command>UPDATE</> that causes a row to move from one partition to
>> +       another fails, because
>>
>> fails, because -> fails because
>>
>
> This hunk is no longer exists in the attached patch, that was mistaken
> copied, sorry about that.
>
>> -------
>>
>> Wouldn't it be a good idea to document how to increase the number of
>> hash partitions, I think we can document it somewhere with an example,
>> something like Robert explained upthread?
>>
>> create table foo (a integer, b text) partition by hash (a);
>> create table foo1 partition of foo with (modulus 2, remainder 0);
>> create table foo2 partition of foo with (modulus 2, remainder 1);
>>
>> You can detach foo1, create two new partitions with modulus 4 and
>> remainders 0 and 2, and move the data over from the old partition
>>
>> I think it will be good information for a user to have? or it's
>> already documented and I missed it?
>>

This is already part of documentation contained in the patch.

Here are some more comments
@@ -3296,6 +3311,14 @@ ALTER TABLE measurement ATTACH PARTITION
measurement_y2008m02       not the partitioned table.      </para>     </listitem>
+
+     <listitem>
+      <para>
+       An <command>UPDATE</> that causes a row to move from one partition to
+       another fails, because the new value of the row fails to satisfy the
+       implicit partition constraint of the original partition.
+      </para>
+     </listitem>    </itemizedlist>    </para>    </sect3>
The description in this chunk is applicable to all the kinds of partitioning.
Why should it be part of a patch implementing hash partitioning?

+        Declarative partitioning only supports hash, list and range
+        partitioning, whereas table inheritance allows data to be
+        divided in a manner of the user's choosing.  (Note, however,
+        that if constraint exclusion is unable to prune partitions
+        effectively, query performance will be very poor.)
Looks like the line width is less than 80 characters.

In partition_bounds_equal(), please add comments explaining why is it safe to
check just the indexes? May be we should add code under assertion to make sure
that the datums are equal as well. The comment could be something
like, "If two partitioned tables have different greatest moduli, their
partition schemes don't match. If they have same greatest moduli, and
all remainders have different indexes, they all have same modulus
specified and the partitions are ordered by remainders, thus indexes
array will be an identity i.e. index[i] = i. If the partition
corresponding to a given remainder exists, it will have same index
entry for both partitioned tables or if it's missing it will be -1.
Thus if indexes array matches, corresponding datums array matches. If
there are multiple remainders corresponding to a given partition,
their partitions are ordered by the lowest of the remainders, thus if
indexes array matches, both of the tables have same indexes arrays, in
both the tables remainders corresponding to multiple partitions all
have same indexes and thus same modulus. Thus again if the indexes are
same, datums are same.".

In the same function   if (key->strategy == PARTITION_STRATEGY_HASH)   {       int            greatest_modulus;
       /*        * Compare greatest modulus of hash partition bound which        * is the last element of datums array.
      */       if (b1->datums[b1->ndatums - 1][0] != b2->datums[b2->ndatums - 1][0])           return false;
 
       /* Compare indexes */       greatest_modulus = DatumGetInt32(b1->datums[b1->ndatums - 1][0]);       for (i = 0;
i< greatest_modulus; i++)           if (b1->indexes[i] != b2->indexes[i])               return false;   }
 
if we return true from where this block ends, we will save one indenation level
for rest of the code and also FWIW extra diffs in this patch because of this
indentation change.

+        /*
+         * Hash operator classes provide only equality, not ordering.
+         * Collation, which is relevant for ordering and not equality is
+         * irrelevant for hash partitioning.
+         */
A comma is missing after "equality", and may be we need "for" before
"equality".        * Collation, which is relevant for ordering and not equality, is

+         * we use hash operator class. */
*/ should be on new line.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] COPY FROM STDIN behaviour on end-of-file
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw