Re: Collation versioning - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Collation versioning
Date
Msg-id CA+hUKG+JrRdBVBvSZLvhAOnTcBF6CASj9BkUjyq9OqG2=BErrw@mail.gmail.com
Whole thread Raw
In response to Re: Collation versioning  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Collation versioning
List pgsql-hackers
On Fri, Oct 30, 2020 at 1:20 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> 4.  I didn't really like the use of '' for unknown.  I figured out how
> to use NULL for that.

Hrmph.  That logic didn't work for the pattern ops case.  I fixed
that, by partially reintroducing a special value, but this time
restricting the code that knows about that to just pg_dump, and I used
the value 'unknown', so really it's not special at all as far as the
server is concerned and there is only one kind of warning message.

Furthermore, I realised that I really don't like the policy of
assuming that all text-related indexes imported from older releases
need the "unknown" warning.  That'll just make this feature
unnecessarily noisy and unpopular when 14 comes out, essentially
crying wolf, even though it's technically true that the collations in
imported-from-before-14 indexes are of unknown version.  Even worse,
instructions might be shared around the internet to show how to shut
the warnings up without reindexing, and then later when there really
is a version change, someone might find those instructions and follow
them!  So I propose that the default should be to assume that indexes
are not corrupted, unless you opt into the more pessimistic policy
with --index-collation-versions-unknown.  Done like that.

I also realised that I don't like carrying a bunch of C code to
support binary upgrades, when it's really just a hand-coded trivial
UPDATE of pg_depend.  Is there any reason pg_dump --binary-upgrade
shouldn't just dump UPDATE statements, and make this whole feature a
lot less mysterious, and shorter?  Done like that.

While testing on another OS that will be encountered in the build farm
when I commit this, I realised that I needed to add --encoding=UTF8 to
tests under src/bin/pg_dump and src/test/locale, because they now do
things with ICU collations (if built with ICU support) and that only
works with UTF8.  Another option would be to find a way to skip those
tests if the encoding is not UTF8.  Hmm, I wonder if it's bad to
effectively remove the testing that comes for free from buildfarm
animals running this under non-UTF8 encodings; but if we actually
valued that, I suppose we'd do it explicitly as another test pass with
SQL_ASCII.

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: bulk typos
Next
From: Bharath Rupireddy
Date:
Subject: Re: Log message for GSS connection is missing once connection authorization is successful.