Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition - Mailing list pgsql-bugs

From Amit Langote
Subject Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition
Date
Msg-id CA+HiwqF+keTbzQqYPR=QtzNon+huzCRcb676eb66NTqPYDARtA@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Tue, Jun 29, 2021 at 7:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > I think the patch has got more problems than that too, as it's far
> > from clear why the next remainder would have anything to do with the
> > next larger modulus.  IOW I suspect that when it does manage to print
> > a partition name without crashing, it's very likely to print the
> > wrong partition.
>
> Concretely, this variant of the test case:
>
> drop table hash_parted;
> CREATE TABLE hash_parted (
>         a int
> ) PARTITION BY HASH (a);
> CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 50, REMAINDER 0);
> CREATE TABLE hpart_2 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 10, REMAINDER 1);
> CREATE TABLE hpart_3 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 200, REMAINDER 2);
>
> CREATE TABLE fail_part PARTITION OF hash_parted FOR VALUES WITH (MODULUS 25, REMAINDER 3);
>
> prints
>
> ERROR:  every hash partition modulus must be a factor of the next larger modulus
> DETAIL:  The new modulus 25 is not divisible by 10, the modulus of existing partition "hpart_1".
>
> which is obviously wrong.

Yes.  Though, I think the problem is that the code uses
boundinfo->indexes[<offset>] as an index into partdesc->oids[] array
to get the existing partition to show in the error message.  The
correct thing here would be to use <offset> directly.  That is because
the hash partition bounds as laid out in boudinfo->datums[] correspond
one-to-one with the partition OIDs laid out in partdesc->oids[].
boundinfo->indexes should only be consulted to get the partition OID
from a remainder value, as get_partition_for_tuple() does.

(OTOH, range and list partition bound offsets, because there may be
more than one bound assigned to each partition (considering
de-duplication of pairs of matching upper and next lower bound in the
range case), do not correspond one-to-one with those of partition
OIDs.)

> I think that rescuing this might require scanning through the list of
> partitions for one that has the desired modulus.  There could be more
> than one match, but I'd be content to take the first one.

Hmm, the bound offset that the code comes up with to show in the error
message seems fine to me.

I have attached a patch containing the above mentioned fix.  The patch
also adjusts the relevant test cases in create_table.sql to cover
these cases.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-bugs by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: BUG #17073: docs - "Improve signal handling reliability"
Next
From: Tom Lane
Date:
Subject: Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition