Re: Q: documentation improvement re collation version mismatch - Mailing list pgsql-general

From Julien Rouhaud
Subject Re: Q: documentation improvement re collation version mismatch
Date
Msg-id 20221110083322.cqztec3ruvadzfpi@jrouhaud
Whole thread Raw
In response to Q: documentation improvement re collation version mismatch  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Responses Aw: Re: Q: documentation improvement re collation version mismatch  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Re: Q: documentation improvement re collation version mismatch  (Ron <ronljohnsonjr@gmail.com>)
List pgsql-general
Hi,

On Wed, Nov 09, 2022 at 12:45:17PM +0100, Karsten Hilbert wrote:
> Dear all,
>
> regarding changed collation versions this
>
>     https://www.postgresql.org/docs/devel/sql-altercollation.html
>
> says:
>
>     The following query can be used to identify all
>     collations in the current database that need to be
>     refreshed and the objects that depend on them:
>
>     SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS "Collation",
>            pg_describe_object(classid, objid, objsubid) AS "Object"
>       FROM pg_depend d JOIN pg_collation c
>            ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
>       WHERE c.collversion <> pg_collation_actual_version(c.oid)
>       ORDER BY 1, 2;
>
> I feel the result of that query can be slightly surprising
> because it does not return (to my testing) any objects
> depending on the database default collation, nor the database
> itself (as per a collation version mismatch in pg_database).

Indeed.  The default collation is "pinned", so we don't record any dependency
on it.

But also, getting the list of direct dependency to a collation is also almost
useless as there are so many other scenario where we wouldn't record an index
dependency on a collation.

> Now, there is a line
>
> Perhaps this query (taken from the net)
>
>     SELECT    -- get collation-change endangered indices
>         indrelid::regclass::text,
>         indexrelid::regclass::text,
>         collname,
>         pg_get_indexdef(indexrelid)
>     FROM (
>             SELECT
>                 indexrelid,
>                 indrelid,
>                 indcollation[i] coll
>             FROM
>                 pg_index, generate_subscripts(indcollation, 1) g(i)
>         ) s
>             JOIN pg_collation c ON coll=c.oid
>     WHERE
>         collprovider IN ('d', 'c')
>             AND
>         collname NOT IN ('C', 'POSIX');
>
> could be added to the paragraph (or it could be folded into
> the first query by a UNION or some such) ?

That query is a bit better, but unfortunately there are a lot of cases it won't
detect (like some use of collation in expressions or WHERE clauses), so if you
had a collation library upgrade that breaks your collations you can't use that
to reliably fix your indexes.

For now, the only safe way to go is either reindex everything, or everything
except some safe cases (non-partial indexes on plain-non-collatable datatypes
only).  Usually, those safe cases are usually enough to avoid most of useless
reindex activity.



pgsql-general by date:

Previous
From: Ian Lawrence Barwick
Date:
Subject: Re: List user databases
Next
From: 黄宁
Date:
Subject: change analyze function for a array type