Re: Database-level collation version tracking - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Database-level collation version tracking
Date
Msg-id 20220207102947.6npnjyilcy4cgkxt@jrouhaud
Whole thread Raw
In response to Database-level collation version tracking  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: Database-level collation version tracking  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On Tue, Feb 01, 2022 at 04:20:14PM +0100, Peter Eisentraut wrote:
> This patch adds to database objects the same version tracking that collation
> objects have.

This version conflicts with 87669de72c2 (Some cleanup for change of collate and
ctype fields to type text), so I'm attaching a simple rebase of your patch to
make the cfbot happy, no other changes.

> There is a new pg_database column datcollversion that stores
> the version, a new function pg_database_collation_actual_version() to get
> the version from the operating system, and a new subcommand ALTER DATABASE
> ... REFRESH COLLATION VERSION.
> 
> This was not originally added together with pg_collation.collversion, since
> originally version tracking was only supported for ICU, and ICU on a
> database-level is not currently supported.  But we now have version tracking
> for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so
> this is useful to have now.  And of course ICU on database-level is being
> worked on at the moment as well.
> This patch is pretty much complete AFAICT.

Agreed.  Here's a review of the patch:

- there should be a mention to the need for a catversion bump in the message
  comment
- there is no test
- it's missing some updates in create_database.sgml, and psql tab completion
  for CREATE DATABASE with the new collation_version defelem.

- that's not really something new with this patch, but should we output the
  collation version info or mismatch info in \l / \dO?

+       if (!actual_versionstr)
+           ereport(ERROR,
+                   (errmsg("database \"%s\" has no actual collation version, but a version was specified",
+                           name)));-

this means you can't connect on such a database anymore.  The level is probably
ok for collation version check, but for db isn't that too much?

+Oid
+AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt)
+{
+   Relation    rel;
+   Oid         dboid;
+   HeapTuple   tup;
+   Datum       datum;
+   bool        isnull;
+   char       *oldversion;
+   char       *newversion;
+
+   rel = table_open(DatabaseRelationId, RowExclusiveLock);
+   dboid = get_database_oid(stmt->dbname, false);
+
+   if (!pg_database_ownercheck(dboid, GetUserId()))
+       aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE,
+                      stmt->dbname);
+
+   tup = SearchSysCacheCopy1(DATABASEOID, ObjectIdGetDatum(dboid));
+   if (!HeapTupleIsValid(tup))
+       elog(ERROR, "cache lookup failed for database %u", dboid);

Is that ok to not obtain a lock on the database when refreshing the collation?
I guess it's not worth bothering as it can only lead to spurious messages for
connection done concurrently, but there should be a comment to clarify it.
Also, it means that someone can drop the database concurrently. So it's
should be a "database does not exist" rather than a cache lookup failed error
message.

+   /*
+    * Check collation version.  See similar code in
+    * pg_newlocale_from_collation().
+    */
+   datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datcollversion,
+                           &isnull);
+   if (!isnull)
+   {

This (and pg_newlocale_from_collation()) reports a problem if a recorded
collation version is found but there's no reported collation version.
Shouldn't it also complain if it's the opposite?  It's otherwise a backdoor to
make sure there won't be any check about the version anymore, and while it can
probably happen if you mess with the catalogs it still doesn't look great.

+       /*
+        * template0 shouldn't have any collation-dependent objects, so unset
+        * the collation version.  This avoids warnings when making a new
+        * database from it.
+        */
+       "UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n\n",

I'm not opposed, but shouldn't there indeed be a warning in case of discrepancy
in the source database (whether template or not)?

# update pg_database set datcollversion = 'meh' where datname in ('postgres', 'template1');
UPDATE 2

# create database test1 template postgres;
CREATE DATABASE

# create database test2 template template1;
CREATE DATABASE

# \c test2
WARNING:  database "test2" has a collation version mismatch
DETAIL:  The database was created using collation version meh, but the operating system provides version 2.34.
HINT:  Rebuild all objects affected by collation in this database and run ALTER DATABASE test2 REFRESH COLLATION
VERSION,or build PostgreSQL with the right library version.
 
You are now connected to database "test2" as user "rjuju".

The rest of the patch looks good to me.  There's notably pg_dump and pg_upgrade
support so it indeed looks complete.

> One bonus thing would be to add
> a query to the ALTER DATABASE ref page similar to the one on the ALTER
> COLLATION ref page that identifies the objects that are affected by outdated
> collations.  The database version of that might just show all
> collation-using objects or something like that.  Suggestions welcome.

That would be nice, but that's something quite hard to do and the resulting
query would be somewhat unreadable.
Also, you need to look at custom data types, expression and quals at least, so
I'm not even sure that you can actually do it in pure SQL without additional
infrastructure.

> Also, one curious behavior is that if you get to a situation where the
> collation version is mismatched, every time an autovacuum worker launches
> you get the collation version mismatch warning in the log. Maybe that's
> actually correct, but maybe we want to dial it down a bit for
> non-interactive sessions.

I'm +0.5 to keep it that way.  Index corruption is a real danger, so if you
have enough autovacuum worker launched to have a real problem with that, you
clearly should take care of the problem even faster.

Attachment

pgsql-hackers by date:

Previous
From: Frédéric Yhuel
Date:
Subject: Allow parallel plan for referential integrity checks?
Next
From: Fujii Masao
Date:
Subject: Re: Support escape sequence for cluster_name in postgres_fdw.application_name