AW: BUG #17923: Excessive warnings of collation version mismatch in logs - Mailing list pgsql-bugs

From Hans Buschmann
Subject AW: BUG #17923: Excessive warnings of collation version mismatch in logs
Date
Msg-id 5d3fa6cf2dd7489dba0e087e7e6a164e@nidsa.net
Whole thread Raw
In response to Re: BUG #17923: Excessive warnings of collation version mismatch in logs  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: AW: BUG #17923: Excessive warnings of collation version mismatch in logs  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-bugs

Hello Peter,


Thank you for looking at.


I think, there is a little misunderstanding:


The problem is not at connection time (I only mentioned it for info).


The problem is the endless repeating of these log entries without user login (cut a little bit at line end):


17:09:08 CEST  01000  1:> WARNING:  database "template1" has a collation version mismatch
17:09:08 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:09:08 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:09:38 CEST  01000  1:> WARNING:  database "postgres" has a collation version mismatch
17:09:38 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:09:38 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:10:08 CEST  01000  1:> WARNING:  database "template1" has a collation version mismatch
17:10:08 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:10:08 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:10:38 CEST  01000  1:> WARNING:  database "postgres" has a collation version mismatch
17:10:38 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:10:38 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:11:08 CEST  01000  1:> WARNING:  database "template1" has a collation version mismatch
17:11:08 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:11:08 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:11:38 CEST  01000  1:> WARNING:  database "postgres" has a collation version mismatch
17:11:38 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:11:38 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:12:08 CEST  01000  1:> WARNING:  database "template1" has a collation version mismatch
17:12:08 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:12:08 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:12:38 CEST  01000  1:> WARNING:  database "postgres" has a collation version mismatch
17:12:38 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:12:38 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul
17:13:08 CEST  01000  1:> WARNING:  database "template1" has a collation version mismatch
17:13:08 CEST  01000  2:> DETAIL:  The database was created using collation version 2.36,
17:13:08 CEST  01000  3:> HINT:  Rebuild all objects in this database that use the defaul

This is repeated for about 48000 cases!! in a period of roughly 20 days!!!!!


Let me explain my suggestions (Please remind that I am not familiar with the source code, internal data structures or extensive C hackking, so I'll present it in a more conceptual way):

1. Current situation:
The server checks every minute for every database if there is a collation version mismatch.
If it doesn't match, it emits every minute a warning (without user login).

This hits performance and floods the logs.

2. My simple suggestion (as by question 1):
Check this mismatch only at server startup (the glic cannot be exchanged on a running server!)
and report the warnings only once per database.
It should be investigated, by which algorithm the server checks that every minute and this should be avoided if possible.

3. Further suggestion:
When the mismatch occurs at server startup, check if the database has any objects (more selective: if it has any indexes or materialized views)
This is a very short one-time operation only at server startup.
If there are no such objects in the mentioned database, change the collation version silently to the current version (do the ALTER DATABASE xxxxx REFRESH COLLATION VERSION automatically)

4. Elaborated premium solution with focus to easy administration:
With the ever growing databases easy recovery of collation problems is not easy for the user (see other discussions on ICU defaults etc. on hackers).

So for every collation a hash value could be computed and stored in the catalogs.

If there is now a mismatch, the warnings can inform the users of the changed collations (not every collation is subject to frequent changes!).
This circumvents the indeterminable changes of a glibc version change.

A query to the catalogs for such a database can then provide the objects where the altered collations are used and the user can take appropriate action (reindexing a big database is also very expensive!).

This certainly is much more work then only stopping the endless warnings but gives a fair amount of self-administration guide to the user.

PS: perhaps you may move the discussion to hackers to reach a broader audience.

Hans Buschmann







Von: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Gesendet: Mittwoch, 10. Mai 2023 07:30
An: Hans Buschmann; pgsql-bugs@lists.postgresql.org
Betreff: Re: BUG #17923: Excessive warnings of collation version mismatch in logs
 
On 07.05.23 12:13, PG Bug reporting form wrote:
> 1. Why isn't it sufficient to report this message only at server startup ?
> The messages are sent to psql too, but only when logging into one of these
> mostly unused databases, so it is not observed in a regular routine.

Each database is separate.  The server doesn't look at each database on
startup.  This is only done when you connect to a given database.

> 2. Why is this message reported for databases with no objects or similar
> cases?

When you connect to a database, the server doesn't scan all objects in
the database to see if they might be affected.  That would be too slow.

If you have no objects in the database, you can just run the provided
REFRESH command to make the warning go away.

> 3. Is there a real collation change in glibc 2.37 or is it only a version
> number mismatch with changes only in other parts of the library?

We don't know, because glibc isn't reliable at telling us about it.  So
we take the safe course.

> 4. Could the version information be shown in \l output of psql?

That's a valid idea.

> 5. Wouldn't it make sense to automatically update the collation version
> number of such databases (without any collation-related objects) at server
> startup?

See #2.  We are not going to start scanning each database on each
connection to check whether it might be empty.

pgsql-bugs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: BUG #17926: Segfault in SELECT
Next
From: Robert Haas
Date:
Subject: Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware)