Thread: Fix to enum hashing for dump and restore
Hello, I have discovered a bug in one usage of enums. If a table with hash partitions uses an enum as a partitioning key, it can no longer be backed up and restored correctly. This is because enums are represented simply as oids, and the hash function for enums hashes that oid to determine partition distribution. Given the way oids are assigned, any dump+restore of a database with such a table may fail with the error "ERROR: new row for relation "TABLENAME" violates partition constraint". This can be reproduced with the following steps: ************************************************ create database test; \c test create type colors as enum ('red', 'green', 'blue', 'yellow'); create table part (color colors) partition by hash(color); create table prt_0 partition of part for values with (modulus 3, remainder 0); create table prt_1 partition of part for values with (modulus 3, remainder 1); create table prt_2 partition of part for values with (modulus 3, remainder 2); insert into part values ('red'); /usr/local/pgsql/bin/pg_dump -d test -f /tmp/dump.sql /usr/local/pgsql/bin/createdb test2 /usr/local/pgsql/bin/psql test2 -f /tmp/dump.sql ************************************************ I have written a patch to fix this bug (attached), by instead having the hashenum functions look up the enumsortorder ID of the value being hashed. These are deterministic across databases, and so allow for stable dump and restore. This admittedly comes at the performance cost of doing a catalog lookup, but there is precedent for this in e.g. hashrange and hashtext. I look forward to your feedback on this, thank you! Sincerely, Andrew J Repp (VMware)
Attachment
Andrew <pgsqlhackers@andrewrepp.com> writes: > I have discovered a bug in one usage of enums. If a table with hash > partitions uses an enum as a partitioning key, it can no longer be > backed up and restored correctly. This is because enums are represented > simply as oids, and the hash function for enums hashes that oid to > determine partition distribution. Given the way oids are assigned, any > dump+restore of a database with such a table may fail with the error > "ERROR: new row for relation "TABLENAME" violates partition constraint". Ugh, that was not well thought out :-(. I suppose this isn't a problem for pg_upgrade, which should preserve the enum value OIDs, but an ordinary dump/restore will indeed hit this. > I have written a patch to fix this bug (attached), by instead having the > hashenum functions look up the enumsortorder ID of the value being > hashed. These are deterministic across databases, and so allow for > stable dump and restore. Unfortunately, I'm not sure those are as deterministic as all that. They are floats, so there's a question of roundoff error, not to mention cross-platform variations in what a float looks like. (At the absolute minimum, I think we'd have to change your patch to force consistent byte ordering of the floats.) Actually though, roundoff error wouldn't be a problem for the normal exact-integer values of enumsortorder. Where it could come into play is with the fractional values used after you insert a value into the existing sort order. And then the whole idea fails, because a dump/restore won't duplicate those fractional values. Another problem with this approach is that we can't get from here to there without a guaranteed dump/reload failure, since it's quite unlikely that the partition assignment will be the same when based on enumsortorder as it was when based on OIDs. Worse, it also breaks the pg_upgrade case. I wonder if it'd work to make pg_dump force --load-via-partition-root mode when a hashed partition key includes an enum. regards, tom lane
Those are excellent points.
We will investigate adjusting pg_dump behavior,
as this is primarily a dump+restore issue.
Thank you!
-Andrew J Repp (VMware)
On Tue, Jan 24, 2023, at 9:56 PM, Tom Lane wrote:
Andrew <pgsqlhackers@andrewrepp.com> writes:> I have discovered a bug in one usage of enums. If a table with hash> partitions uses an enum as a partitioning key, it can no longer be> backed up and restored correctly. This is because enums are represented> simply as oids, and the hash function for enums hashes that oid to> determine partition distribution. Given the way oids are assigned, any> dump+restore of a database with such a table may fail with the error> "ERROR: new row for relation "TABLENAME" violates partition constraint".Ugh, that was not well thought out :-(. I suppose this isn't a problemfor pg_upgrade, which should preserve the enum value OIDs, but anordinary dump/restore will indeed hit this.> I have written a patch to fix this bug (attached), by instead having the> hashenum functions look up the enumsortorder ID of the value being> hashed. These are deterministic across databases, and so allow for> stable dump and restore.Unfortunately, I'm not sure those are as deterministic as all that.They are floats, so there's a question of roundoff error, not tomention cross-platform variations in what a float looks like. (At theabsolute minimum, I think we'd have to change your patch to forceconsistent byte ordering of the floats.) Actually though, roundofferror wouldn't be a problem for the normal exact-integer values ofenumsortorder. Where it could come into play is with the fractionalvalues used after you insert a value into the existing sort order.And then the whole idea fails, because a dump/restore won't duplicatethose fractional values.Another problem with this approach is that we can't get from here to therewithout a guaranteed dump/reload failure, since it's quite unlikely thatthe partition assignment will be the same when based on enumsortorderas it was when based on OIDs. Worse, it also breaks the pg_upgrade case.I wonder if it'd work to make pg_dump force --load-via-partition-rootmode when a hashed partition key includes an enum.regards, tom lane