Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id CAOBaU_aL=4f67L+m2s28DmiaacZ=Dn75BZY-HGmEq1WquGa-Jg@mail.gmail.com
Whole thread Raw
In response to Re: Collation versioning  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Collation versioning
List pgsql-hackers
On Mon, Nov 4, 2019 at 11:13 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Mon, Nov 4, 2019 at 4:58 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> >
> > Unfortunately I haven't had time to work on it seriously, but here's a
> > quick rebase to get the proof-of-concept back into working shape.

Thanks!  I also removed the test for REFRESH VERSION command that was
forgotten in the patch set, and run a pgindent.

> > Here are some problems to think about:
> >
> > * We'd need to track dependencies on the default collation once we
> > have versioning for that (see
> > https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com).
> > That is how most people actually consume collations out there in real
> > life, and yet we don't normally track dependencies on the default
> > collation and I don't know if that's simply a matter of ripping out
> > all the code that looks like "xxx != DEFAULT_COLLATION_ID" in the
> > dependency analysis code or if there's more to it.
>
> This isn't enough.  What would remain is:
>
> - teach get_collation_version_for_oid() to return the  default
> collation name, which is simple
> - have recordDependencyOnVersion() actually records the dependency,
> which wouldn't happen as the default collation is pinned.
>
> An easy fix would be to teach isObjectPinned() to ignore
> CollationRelationId / DEFAULT_COLLATION_OID, but that's ugly and would
> allow too many dependencies to be stored.  Not pinning the default
> collation during initdb doesn't sound a good alternative either.
> Maybe adding a force flag or a new DependencyType that'd mean "normal
> but forced" would be ok?

Attached 4th patch handles default collation.  I went with an
ignore_systempin flag in recordMultipleDependencies.

>
> > * Andres mentioned off-list that pg_depend rows might get blown away
> > and recreated in some DDL circumstances.  We need to look into that.

I tried various flavour of DDL but I couldn't wipe out the pg_depend
rows without having an index rebuild triggered (like changing the
underlying column datatype).  Do you have any scenario where the index
rebuild wouldn't be triggered?

> > * Another is that pg_upgrade won't preserve pg_depend rows, so you'd
> > need some catalog manipulation (direct or via new DDL) to fix that.

Attached 5th patch add a new "ALTER INDEX idx_name DEPENDS ON
COLLATION coll_oid VERSION coll_version_text" that can only be
executed in binary upgrade mode, and teach pg_dump to generate such
commands (including for indexes created for constraints).  One issue
is that older versions don't have pg_depend information,  so pg_dump
can't find the dependencies to generate such commands and override the
version with anything else.  It means that the upgraded cluster will
show all indexes as depending on the current collation provider
version.  I'm not sure if that's the best thing to do, or if we should
change all pg_depend rows to mention "unknown" version or something
like that.  It would generate so much noise that it's probably not
worth it.

I didn't do anything for the spurious warning when running a reindex,
and kept original approach of pg_depend catalog.

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: alternative to PG_CATCH
Next
From: Eugen Konkov
Date:
Subject: Re: Does 'instead of delete' trigger support modification of OLD