Home > mailing lists

Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers

From	amul sul
Subject	Re: [HACKERS] [POC] hash partitioning
Date	October 12, 2017 16:08:37
Msg-id	CAAJ_b948MUW38gEOzkFAgO-mdty4K=iJ7orf13WPL2y_m5_y8w@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] [POC] hash partitioning (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: [HACKERS] [POC] hash partitioning Re: [HACKERS] [POC] hash partitioning
List	pgsql-hackers

Tree view

On Thu, Oct 12, 2017 at 6:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 10, 2017 at 7:07 AM, amul sul <sulamul@gmail.com> wrote:
>> How about the attached patch(0003)?
>> Also, the dim variable is renamed to natts.
>
> I'm not sure I believe this comment:
>
> +        /*
> +         * We arrange the partitions in the ascending order of their modulus
> +         * and remainders.  Also every modulus is factor of next larger
> +         * modulus.  This means that the index of a given partition is same as
> +         * the remainder of that partition.  Also entries at (remainder + N *
> +         * modulus) positions in indexes array are all same for (modulus,
> +         * remainder) specification for any partition.  Thus datums array from
> +         * both the given bounds are same, if and only if their indexes array
> +         * will be same.  So, it suffices to compare indexes array.
> +         */
>
> I am particularly not sure that I believe that the index of a
> partition must be the same as the remainder.  It doesn't seem like
> that would be true when there is more than one modulus or when some
> partitions are missing.
>

Looks like an explanation by the comment is not good enough, will think on this.

Here are the links for the previous discussion:
1] https://postgr.es/m/CAFjFpRfHqSGBjNgJV2p%2BC4Yr5Qxvwygdsg4G_VQ6q9NTB-i3MA%40mail.gmail.com
2] https://postgr.es/m/CAFjFpRdeESKFkVGgmOdYvmD3d56-58c5VCBK0zDRjHrkq_VcNg%40mail.gmail.com


> +                    if (offset < 0)
> +                    {
> +                        next_modulus = DatumGetInt32(datums[0][0]);
> +                        valid_modulus = (next_modulus % spec->modulus) == 0;
> +                    }
> +                    else
> +                    {
> +                        prev_modulus = DatumGetInt32(datums[offset][0]);
> +                        valid_modulus = (spec->modulus % prev_modulus) == 0;
> +
> +                        if (valid_modulus && (offset + 1) < ndatums)
> +                        {
> +                            next_modulus =
> DatumGetInt32(datums[offset + 1][0]);
> +                            valid_modulus = (next_modulus %
> spec->modulus) == 0;
> +                        }
> +                    }
>
> I don't think this is quite right.  It checks the new modulus against
> prev_modulus whenever prev_modulus is defined, which is correct, but
> it doesn't check the new modulus against the next_modulus except when
> offset < 0.  But actually that check needs to be performed, I think,
> whenever the new modulus is less than the greatest modulus seen so
> far.
>
It does. See the "if (valid_modulus && (offset + 1) < ndatums)"  block in the
else part of the snippet that you are referring.

For e.g new modulus 25 & 150 is not accepted for the hash partitioned bound with
modulus 10,50,200. Will cover this test as well.

> + * For a partitioned table defined as:
> + *    CREATE TABLE simple_hash (a int, b char(10)) PARTITION BY HASH (a, b);
> + *
> + * CREATE TABLE p_p1 PARTITION OF simple_hash
> + *    FOR VALUES WITH (MODULUS 2, REMAINDER 1);
> + * CREATE TABLE p_p2 PARTITION OF simple_hash
> + *    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
> + * CREATE TABLE p_p3 PARTITION OF simple_hash
> + *    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
> + * CREATE TABLE p_p4 PARTITION OF simple_hash
> + *    FOR VALUES WITH (MODULUS 8, REMAINDER 4);
> + *
> + * This function will return one of the following in the form of an
> + * expression:
> + *
> + * for p_p1: satisfies_hash_partition(2, 1, hash_fn_1_extended(a, HASH_SEED),
> + *                                             hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p2: satisfies_hash_partition(4, 2, hash_fn_1_extended(a, HASH_SEED),
> + *                                             hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p3: satisfies_hash_partition(8, 0, hash_fn_1_extended(a, HASH_SEED),
> + *                                             hash_fn_2_extended(b,
> HASH_SEED))
> + * for p_p4: satisfies_hash_partition(8, 4, hash_fn_1_extended(a, HASH_SEED),
> + *                                             hash_fn_2_extended(b,
> HASH_SEED))
>
> I think instead of this lengthy example you should try to explain the
> general rule.  Maybe something like: the partition constraint for a
> hash partition is always a call to the built-in function
> satisfies_hash_partition().  The first two arguments are the modulus
> and remainder for the partition; the remaining arguments are the hash
> values computed for each column of the partition key using the
> extended hash function from the appropriate opclass.
>
Okay will add this.

> +static uint64
> +mix_hash_value(int nkeys, Datum *hash_array, bool *isnull)
>
How about combining high 32 bits and the low 32 bits separately as shown below?

static inline uint64
hash_combine64(uint64 a, uint64 b)
{   return (((uint64) hash_combine((uint32) a >> 32, (uint32) b >> 32) << 32)           | hash_combine((unit32) a,
(unit32)b));
 
}

> It would be nice to use the hash_combine() facility Andres recently
> added for this rather than having a way to do it that is specific to
> hash partitioning, but that function only works for 32-bit hash
> values.  Maybe we can persuade Andres to add a hash_combine64...
>
> +         * a hash operator class
>
> Missing period at end.
>
Okay will fix this.

> +        if (strategy == PARTITION_STRATEGY_HASH)
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> +                     errmsg("default hash partition is not supported")));
>
> Maybe errmsg("a hash-partitioned table may not have a default partition")?
>
Okay will add this.

> +/* Seed for the extended hash function */
> +#define HASH_SEED UINT64CONST(0x7A5B22367996DCFF)
>
> I suggest HASH_PARTITION_SEED -- this is too generic.
>
Okay will add this.

> Have you checked how well the tests you've added cover the code you've
> added?  What code is not covered by the tests, and is there any way to
> cover it?
>
Will try to get gcov report for this patch.

Thanks for your review.

Regards,
Amul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

From: Tomas Vondra
Date: 12 October 2017, 16:07:30
Subject: Re: [HACKERS] Parallel Bitmap Heap Scans segfaults due to(tbm->dsa==NULL) on PostgreSQL 10

From: Dilip Kumar
Date: 12 October 2017, 16:14:10
Subject: Re: [HACKERS] Parallel Bitmap Heap Scans segfaults due to(tbm->dsa==NULL) on PostgreSQL 10

Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers

Previous

Next