Re: BUG #17746: Partitioning by hash of a text depends on icu version when text collation is not deterministic - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #17746: Partitioning by hash of a text depends on icu version when text collation is not deterministic
Msg-id 530248.1673458746@sss.pgh.pa.us
In response to BUG #17746: Partitioning by hash of a text depends on icu version when text collation is not deterministic  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> What bothers me is that partitioning depends on the hash that can be
> computed differently with the OS upgrade/migration.

There's basically no way to avoid such problems with a non-deterministic
collation.  The hash function is required to compute the same hash for
all values that compare equal, and that set can change if the collation
does.  Even if the collation hasn't changed in any user-visible way,
what we are hashing for such cases is the result of ucol_getSortKey(),
and the new collation version might well produce a different answer.
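
As a rough illustration of the point above (a sketch under stated assumptions, not PostgreSQL's actual code): the Python below uses str.casefold() as a stand-in for a non-deterministic collation's sort key, and SHA-256 as a stand-in for the hash function.  The names sort_key_v1, sort_key_v2 and partition_for are hypothetical.  Both "versions" agree on which strings are equal, yet they emit different key bytes, so the partition chosen for a given value can move.

```python
import hashlib

def sort_key_v1(s: str) -> bytes:
    # Hypothetical "collation version 1": a case-insensitive sort key.
    return s.casefold().encode("utf-8")

def sort_key_v2(s: str) -> bytes:
    # Hypothetical "collation version 2": the same equality classes as v1,
    # but the key bytes are encoded differently (an extra leading byte).
    return b"\x01" + s.casefold().encode("utf-8")

def partition_for(s: str, sort_key, n_partitions: int = 4) -> int:
    # Hash the sort key, not the raw string, so that all values that
    # compare equal under the collation land in the same partition.
    digest = hashlib.sha256(sort_key(s)).digest()
    return int.from_bytes(digest[:8], "big") % n_partitions

# Within one version, equal values always share a partition.
same_v1 = partition_for("Abc", sort_key_v1) == partition_for("abc", sort_key_v1)
same_v2 = partition_for("Abc", sort_key_v2) == partition_for("abc", sort_key_v2)

# Across versions, some values end up in a different partition even though
# no comparison result has changed.
moved = [
    w for w in (f"key{i}" for i in range(100))
    if partition_for(w, sort_key_v1) != partition_for(w, sort_key_v2)
]
print(same_v1, same_v2, len(moved) > 0)
```

In this toy model, rows routed under version 1 would sit in the wrong partition as far as version 2 is concerned, even though no comparison between the two strings gives a different answer.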

Personally, I think hash partitioning is an anti-pattern that ought
to come with bright red warning flags in the docs.  If you think you
want it, you're generally wrong, for a number of reasons beyond this.

(Admittedly, range partitioning can also get broken by collation
updates, but at least that doesn't happen without user-visible
behavioral changes in the collation.)

            regards, tom lane


