Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists - Mailing list pgsql-bugs

From Michał Albrycht
Subject Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists
Date
Msg-id CACsoHGCxAJje+s6mZuDmHHmfVk+iUzgJk0+LNWvMMz8vnZS0og@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Thanks for quick bug confirmation and patch. I agree that the way code is presented here is unlikely to happen on production, but I was experimenting with a custom, dummy hash function which would guarantee that all rows with volume_id=1 would go to partition 1, volume_id=2 to partition 2 and so on. 

CREATE OR REPLACE FUNCTION partition_custom_bigint_hash(value INT8, seed INT8)
RETURNS INT8 AS $$
-- this number is UINT64CONST(0x49a0f4dd15e5a8e3) from https://github.com/postgres/postgres/blob/REL_13_STABLE/src/include/common/hashfn.h#L83
SELECT value - 5305509591434766563;
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR CLASS partition_custom_bigint_hash_op
FOR TYPE int8
USING hash AS
OPERATOR 1 =,
FUNCTION 2 partition_custom_bigint_hash(int8, int8);

Then adding that operator class to table definition:
CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id partition_custom_bigint_hash_op);
Now I'm able to create a partition only when it's needed and I know which partition should be created for a given volume_id (partition_number = volume_id % number_of_partitions).
Michał Albrycht

śr., 27 sty 2021 o 23:54 Tom Lane <tgl@sss.pgh.pa.us> napisał(a):
I wrote:
> Hmm, seems to be a case of faulty partition exclusion, because the
> plan isn't scanning anything:

Here's a proposed patch for this.  The core of the problem is confusion
around the number of entries in the PartitionBoundInfoData.indexes array.
Each of the three types of partitioning has a different rule for that,
despite which we were expecting assorted code to know what to do, and
some places got it wrong for hash --- even hash-specific code :-(

I propose here to solve that by explicitly storing the number of entries
in PartitionBoundInfoData, and thereby removing the need for partition-
strategy-independent code to know anything about the rules.  I think
we can get away with that in the back branches by adding "nindexes"
at the end of the struct.  This could break extensions that are
manufacturing their own PartitionBoundInfoData structs, but it seems
unlikely that there are any.

Most of the patch just straightforwardly sets or uses the new field.
Notably, partition_bounds_equal() and partition_bounds_copy() get
significantly simpler and safer.  The actual bug fix is in
get_matching_hash_bounds() and perform_pruning_combine_step(), where
"all partitions" needs to be 0 .. nindexes-1 not 0 .. ndatums-1.
(The reason your example fails is that the OR clause should produce
"all partitions potentially match", but because of this bug, it's
producing a bitmask that doesn't include the partition we need.)

                        regards, tom lane

pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit"
Next
From: Sergei Kornilov
Date:
Subject: Re: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit"