Re: Postgres 11, partitioning with a custom hash function - Mailing list pgsql-general

From Harry B
Subject Re: Postgres 11, partitioning with a custom hash function
Date
Msg-id CAMG7=yUde-E+4Fd0w=VU7VsgiL0yqpVB6uCi5drs5KLDyOCzFQ@mail.gmail.com
Whole thread Raw
In response to Re: Postgres 11, partitioning with a custom hash function  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Postgres 11, partitioning with a custom hash function  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-general
Thanks for the quick response David! this has been really helpful.

Looking at the code, this step wasn't totally unnecessary - if I had multi-column hash you would have had to do this step anyways - because pg hashes each column separately and combines them. True, unnecessary for single column hashes. It would have been better for the custom function to handle all columns at the same time, but then entire API surface would have had to change. At least it makes sense to me why it is this way....

All hope is not lost, at least for my case... because.... the bitshifting you have was on 'a', which was zero. So the expression

a ^= b + UINT64CONST(0x49a0f4dd15e5a8e3) + (a << 54) + (a >> 7);

becomes

a = b + UINT64CONST(0x49a0f4dd15e5a8e3)

This also explains why I noticed a constant-offset from the desired value regardless of the actual key being hashed.

It also works for the 32 partition example I showed https://play.golang.org/p/kcD-JhyLYD6
(original session/data in http://dpaste.com/382NDBG )

Now the big question: How scared should I be relying on this? I don't mind it breaking on major version upgrades (which would mean I need to dump & restore my entire set), but how likely is it to change unannounced in a minor/security release? Unless of course, you break it in a way that makes custom-hash function impossible.

Thanks
--
Harry



On Thu, Oct 4, 2018 at 12:39 PM David Rowley <david.rowley@2ndquadrant.com> wrote:
On 5 October 2018 at 06:18, Harry B <harrysungod@gmail.com> wrote:
>
> Thank you David! These helped me create an operator class.
> However, there still seems to be a 'off-by-a-fixed-N' difference between the
> hash value returned and how PG selects the partition.

hmm, actually, this is probably due to the hash_combine64() call in
compute_partition_hash_value(). This seems to combine the hash value
with 0 regardless of if there's another partition key column to hash.
If you could somehow do the reverse of what hash_combine64() will do
to you hash before returning it to the function then you might get
somewhere, but that does not look possible since it appears to be
throwing away some bits.

It may not have been a great choice to decide to have
compute_partition_hash_value() do this unnecessary combine, but it's
likely a few months too late to change that now.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Harry

pgsql-general by date:

Previous
From: David Rowley
Date:
Subject: Re: Postgres 11, partitioning with a custom hash function
Next
From: David Rowley
Date:
Subject: Re: Postgres 11, partitioning with a custom hash function