Thread: Q: error on updating collation version information
Dear all, following an ICU upgrade, collations in a stock Debian PG 15.1 cluster now have divergent version information in pg_collations. Now gnumed_v22=> ALTER COLLATION pg_catalog."br_FR@euro" REFRESH VERSION; ERROR: collation "pg_catalog.br_FR@euro" for encoding "UTF8" does not exist despite gnumed_v22=> select *, pg_encoding_to_char(collencoding) from pg_collation where collname = 'br_FR@euro'; -[ RECORD 1 ]-------+----------- oid | 12413 collname | br_FR@euro collnamespace | 11 collowner | 10 collprovider | c collisdeterministic | t collencoding | 16 collcollate | br_FR@euro collctype | br_FR@euro colliculocale | collversion | 2.35 pg_encoding_to_char | LATIN9 However, note the UTF8 vs LATIN9. The manual sayeth: Some (less frequently used) encodings are not supported by ICU. When the database encoding is one of these, ICU collation entries in pg_collation are ignored. Attempting to use one will draw an error along the lines of “collation "de-x-icu" for encoding "WIN874" does not exist”. which sounds somewhat related. The database encoding is UTF8. That br_FR@euro.LATIN9 had _not_ been added manually. It is also not actively used in my database(s). What is the canonical advice on the way forward here ? Is the _suggested_ solution to delete the collation or am I missing to see the "proper" approach to fixing it ? Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert: > gnumed_v22=> ALTER COLLATION pg_catalog."br_FR@euro" REFRESH VERSION; > ERROR: collation "pg_catalog.br_FR@euro" for encoding "UTF8" does not exist The OS (libc) does seem to know that collation: @hermes:~$ locale -a | grep br_FR br_FR br_FR@euro br_FR.utf8 Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert: > following an ICU upgrade, collations in a stock Debian PG 15.1 > cluster now have divergent version information in pg_collations. Correction: this is following a libc upgrade 2.35 -> 2.36 Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On 12/4/22 04:35, Karsten Hilbert wrote: > Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert: > >> following an ICU upgrade, collations in a stock Debian PG 15.1 >> cluster now have divergent version information in pg_collations. > > Correction: this is following a libc upgrade 2.35 -> 2.36 So to be clear this database is not using ICU, but collations from libc? How was the database installed? In first post you had: gnumed_v22=> select *, pg_encoding_to_char(collencoding) from pg_collation where collname = 'br_FR@euro'; -[ RECORD 1 ]-------+----------- oid | 12413 collname | br_FR@euro collnamespace | 11 collowner | 10 collprovider | c collisdeterministic | t collencoding | 16 collcollate | br_FR@euro collctype | br_FR@euro colliculocale | collversion | 2.35 pg_encoding_to_char | LATIN9 where collprovider c means libc and collversion 2.35. Not exactly sure how that interacts with from here: https://www.postgresql.org/docs/current/catalog-pg-collation.html collversion text Provider-specific version of the collation. This is recorded when the collation is created and then checked when it is used, to detect changes in the collation definition that could lead to data corruption. > > Karsten > -- > GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B > > -- Adrian Klaver adrian.klaver@aklaver.com
Am Sun, Dec 04, 2022 at 10:09:47AM -0800 schrieb Adrian Klaver: > >>following an ICU upgrade, collations in a stock Debian PG 15.1 > >>cluster now have divergent version information in pg_collations. > > > >Correction: this is following a libc upgrade 2.35 -> 2.36 > > So to be clear this database is not using ICU, but collations from libc? Sorry for the confusion. This database carries collations from _both_ libc and ICU in pg_collations. The collation in question (br_FR@euro) is _not_ in use (as in being depended on by any in-database object). > How was the database installed? stock Debian apt-get install postgresql-15 (which gives 15.1) followed by CREATE DATABASE "gnumed_v22" with owner = "redacted :-)" template = "template1" encoding = 'unicode'; as "postgres". > In first post you had: > > gnumed_v22=> select *, pg_encoding_to_char(collencoding) from pg_collation where > collname = 'br_FR@euro'; > -[ RECORD 1 ]-------+----------- > oid | 12413 > collname | br_FR@euro > collnamespace | 11 > collowner | 10 > collprovider | c > collisdeterministic | t > collencoding | 16 > collcollate | br_FR@euro > collctype | br_FR@euro > colliculocale | > collversion | 2.35 > pg_encoding_to_char | LATIN9 > > where collprovider c means libc and collversion 2.35. Yeah, that's when I figured that I misspoke about the ICU upgrade. Yes, there was an ICU upgrade, and yes, it did affect collations. Those I was able to fix up (the "reindex / revalidate constraint / refresh collation version" dance). There also was a libc upgrade which also affected locales. Most of them were fixable by that dance but some popped up (such as br_FR@euro) to not be "correctable" showing the "does not exist for encoding" error. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Is this to be expected ? PG 15.1 on Debian: gnumed_v22=# select *, pg_collation_actual_version(oid), pg_encoding_to_char(collencoding) from pg_collation where collname= 'zh_TW'; -[ RECORD 1 ]---------------+------------ oid | 12985 collname | zh_TW collnamespace | 11 collowner | 10 collprovider | c collisdeterministic | t collencoding | 4 collcollate | zh_TW.euctw collctype | zh_TW.euctw colliculocale | collversion | 2.35 pg_collation_actual_version | 2.36 pg_encoding_to_char | EUC_TW -[ RECORD 2 ]---------------+------------ oid | 12986 collname | zh_TW collnamespace | 11 collowner | 10 collprovider | c collisdeterministic | t collencoding | 6 collcollate | zh_TW.utf8 collctype | zh_TW.utf8 colliculocale | collversion | 2.36 pg_collation_actual_version | 2.36 pg_encoding_to_char | UTF8 gnumed_v22=# begin; BEGIN gnumed_v22=*# alter collation pg_catalog."zh_TW" refresh version ; NOTICE: version has not changed ALTER COLLATION gnumed_v22=*# alter collation pg_catalog."zh_TW.utf8" refresh version ; NOTICE: version has not changed ALTER COLLATION gnumed_v22=*# alter collation pg_catalog."zh_TW.euctw" refresh version ; ERROR: collation "pg_catalog.zh_TW.euctw" for encoding "UTF8" does not exist gnumed_v22=!# As far as I can tell the documentation asserts that since the database encoding is UTF8 the pg_catalog."zh_TW.euctw" will be ignored by the server for all practical purposes. Does this mean it is impossible to "correct" its version information ? And if so, that is expected to be non-harmful and is not expected to trigger nag messages ? Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Karsten Hilbert wrote: > The database encoding is UTF8. That br_FR@euro.LATIN9 had > _not_ been added manually. It is also not actively used in my > database(s). br_FR@euro.LATIN9 cannot be used actively in an UTF-8 database because it's for a different encoding than the database. It was probably available in the OS at initdb time. Every available locale (see locale -a) gets imported into the template databases, and then into the other databases, since CREATE DATABASE copies a template without filtering out the locales that are incompatible with the target database. There's no need to upgrade anything in the OS to get the ALTER COLLATION... REFRESH error you mention. It's sufficient to have a mix of collations for different encodings and try to refresh collations whose encoding are not compatible with the current database. > What is the canonical advice on the way forward here ? Is > the _suggested_ solution to delete the collation or am I > missing to see the "proper" approach to fixing it ? ISTM that dropping that collation from any non-LATIN9 database is the more sensible action. Maybe it could be discussed as a possible improvement to have ALTER COLLATION... REFRESH ignore the database encoding and still refresh the version number. Best regards, -- Daniel Vérité https://postgresql.verite.pro/ Twitter: @DanielVerite
Dear all, I managed to drop the "special" collations default, C, and POSIX with OIDs 100, 950, 951. Is there a way to recreate them (short of restoring a backup) ? Naive attempts with create collation do not seem to work out. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Wed, 2022-12-21 at 15:24 +0100, Karsten Hilbert wrote: > I managed to drop the "special" collations default, C, and > POSIX with OIDs 100, 950, 951. > > Is there a way to recreate them (short of restoring a backup) > ? Naive attempts with create collation do not seem to work > out. I would definitely go for the backup, but here is how you can create these three rows in PostgreSQL v15: INSERT INTO pg_collation (oid, collname, collnamespace, collowner, collprovider, collisdeterministic, collencoding, collcollate, collctype) VALUES (100, 'default', 11, 10, 'd', TRUE, -1, NULL, NULL), (950, 'C', 11, 10, 'c', TRUE, -1, 'C', 'C'), (951, 'POSIX', 11, 10, 'c', TRUE, -1, 'POSIX', 'POSIX'); Yours, Laurenz Albe
Am Wed, Dec 21, 2022 at 03:41:24PM +0100 schrieb Laurenz Albe: > I would definitely go for the backup, but here is how you can > create these three rows in PostgreSQL v15: > > INSERT INTO pg_collation > (oid, collname, collnamespace, collowner, collprovider, > collisdeterministic, collencoding, collcollate, collctype) > VALUES > (100, 'default', 11, 10, 'd', TRUE, -1, NULL, NULL), > (950, 'C', 11, 10, 'c', TRUE, -1, 'C', 'C'), > (951, 'POSIX', 11, 10, 'c', TRUE, -1, 'POSIX', 'POSIX'); Many thanks ! I wasn't so sure whether inserting appropriate rows would be equivalent to create collation... (pg_collation might have been a view projecting inner workings of the server engine). Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Wed, Dec 21, 2022 at 03:46:48PM +0100 schrieb Karsten Hilbert: > I wasn't so sure whether inserting appropriate > rows would be equivalent to create collation... For that matter, is DELETE FROM pg_collation ... equivalent to DROP COLLATION ? Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes: > For that matter, is DELETE FROM pg_collation ... equivalent > to DROP COLLATION ? There's also entries in pg_depend and pg_shdepend to worry about. For these built-in collations, as of v15 there are no such entries, but prior versions had explicit "pin" entries. regards, tom lane