Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id CAOBaU_ZvdNzXZkYUNcm-rTWJnAdrTtTpZPfQEe1rjm_AQ66VLg@mail.gmail.com
Whole thread Raw
In response to Re: Collation versioning  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Collation versioning  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Wed, Jan 29, 2020 at 9:58 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 1:06 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > 0001-0003 seem pretty much OK.  Why is pg_depend.refobjversion of type
> > "name" whereas the previous pg_collation.collversion was type "text"?
> > Related to that, we previously used null to indicate an unknown
> > collation version, and now it's an empty string.
>
> That's what Thomas implemented when he started to work on it and I
> simply kept it that way until now.  I'm assuming that it was simply to
> avoid wasting time on the varlena stuff while working on the
> prototype, so barring any objections I'll change to nullable text
> column in the next revision.

Done

> > The documentation snippet for this patch talks about upgrades from PG12
> > to later.  But what about upgrades on platforms where we currently don't
> > have collation versioning but might introduce it later (FreeBSD,
> > Windows)?  This needs to be generalized.
>
> Good point, I'll try to improve that.

Done.

> > I think this whole thing needs more tests.  We are designing this
> > delicate mechanism but have no real tests for it.  We at least need to
> > come up with a framework for how to test this automatically, so that we
> > can add more test cases over time as people will invariably report
> > issues when this hits the real world.
>
> Indeed.  I have some unlikely index test cases I'm for now using to
> validate the behavior, but didn't start a real test infrastructure.
> I'll also work on that for the next revision, although I'll need some
> more thinking on how to properly test the pg_upgrade part.

So I added all tests I could think of to validate the correct behavior
of all the new stuff.  Mostly:

- tests to make sure that we properly track a collation version for
various case of collation hidden somewhere in index definitions
- tests for pg_dump in binary upgrade mode to make sure that the
collation version (or the lack of known version) is correctly
preserved.  I also modified pg_dump TAP tests to restore the binary
dump on an instance in binary mode, redump it and rerun the related
testsuite.  While doing that I also realized that the previous support
for unknown version for partly broken, as it's possible to end up with
a database where only part of the collation versions are none.  I
fixed it, adding a new "DEPEND ON COLLATION x UNKNOWN VERSION"
alternative for that case, with proper pg_dump support and required
tests
- new pg_dump TAP test mode with both --binary-upgrade and the new
--unknown-collations-binary-compatible switch, to make sure that we
don't dump the UNKNOWN VERSION orders when the
--collation-binary-compatible pg_upgrade option is used
- test for the ALTER INDEX name DEPENDS ON COLLATION name CURRENT VERSION

To avoid too many platform dependent behavior, I restricted the tests
for ICU collation only, with the required changes to ignore the tests
when postgres isn't compile with ICU support.  One exception is a
couple of test to validate that we correctly add dependencies for
default collation, as we for now only support libc default collation.
It means that as written, the collate.icu.utf8 test will need an
alternative output for glibc platforms (which I didn't added yet, as
I'm sure there'll be other changes required, so let's avoid the pain
of maintaining it for now), as I've been testing the expected file
against macos.

Note that I didn't change any syntax (or switched to native functions
for the binary pg_dump) as it's still not clear to me what exactly
should be implemented.

Attachment

pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Add %x to PROMPT1 and PROMPT2
Next
From: Fabien COELHO
Date:
Subject: Re: Add %x to PROMPT1 and PROMPT2