Thread: Database-level collation version tracking
This patch adds to database objects the same version tracking that collation objects have. There is a new pg_database column datcollversion that stores the version, a new function pg_database_collation_actual_version() to get the version from the operating system, and a new subcommand ALTER DATABASE ... REFRESH COLLATION VERSION. This was not originally added together with pg_collation.collversion, since originally version tracking was only supported for ICU, and ICU on a database-level is not currently supported. But we now have version tracking for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so this is useful to have now. And of course ICU on database-level is being worked on at the moment as well. This patch is pretty much complete AFAICT. One bonus thing would be to add a query to the ALTER DATABASE ref page similar to the one on the ALTER COLLATION ref page that identifies the objects that are affected by outdated collations. The database version of that might just show all collation-using objects or something like that. Suggestions welcome. Also, one curious behavior is that if you get to a situation where the collation version is mismatched, every time an autovacuum worker launches you get the collation version mismatch warning in the log. Maybe that's actually correct, but maybe we want to dial it down a bit for non-interactive sessions.
Attachment
On Tue, Feb 01, 2022 at 04:20:14PM +0100, Peter Eisentraut wrote: > This patch adds to database objects the same version tracking that collation > objects have. This version conflicts with 87669de72c2 (Some cleanup for change of collate and ctype fields to type text), so I'm attaching a simple rebase of your patch to make the cfbot happy, no other changes. > There is a new pg_database column datcollversion that stores > the version, a new function pg_database_collation_actual_version() to get > the version from the operating system, and a new subcommand ALTER DATABASE > ... REFRESH COLLATION VERSION. > > This was not originally added together with pg_collation.collversion, since > originally version tracking was only supported for ICU, and ICU on a > database-level is not currently supported. But we now have version tracking > for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so > this is useful to have now. And of course ICU on database-level is being > worked on at the moment as well. > This patch is pretty much complete AFAICT. Agreed. Here's a review of the patch: - there should be a mention to the need for a catversion bump in the message comment - there is no test - it's missing some updates in create_database.sgml, and psql tab completion for CREATE DATABASE with the new collation_version defelem. - that's not really something new with this patch, but should we output the collation version info or mismatch info in \l / \dO? + if (!actual_versionstr) + ereport(ERROR, + (errmsg("database \"%s\" has no actual collation version, but a version was specified", + name)));- this means you can't connect on such a database anymore. The level is probably ok for collation version check, but for db isn't that too much? +Oid +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) +{ + Relation rel; + Oid dboid; + HeapTuple tup; + Datum datum; + bool isnull; + char *oldversion; + char *newversion; + + rel = table_open(DatabaseRelationId, RowExclusiveLock); + dboid = get_database_oid(stmt->dbname, false); + + if (!pg_database_ownercheck(dboid, GetUserId())) + aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, + stmt->dbname); + + tup = SearchSysCacheCopy1(DATABASEOID, ObjectIdGetDatum(dboid)); + if (!HeapTupleIsValid(tup)) + elog(ERROR, "cache lookup failed for database %u", dboid); Is that ok to not obtain a lock on the database when refreshing the collation? I guess it's not worth bothering as it can only lead to spurious messages for connection done concurrently, but there should be a comment to clarify it. Also, it means that someone can drop the database concurrently. So it's should be a "database does not exist" rather than a cache lookup failed error message. + /* + * Check collation version. See similar code in + * pg_newlocale_from_collation(). + */ + datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datcollversion, + &isnull); + if (!isnull) + { This (and pg_newlocale_from_collation()) reports a problem if a recorded collation version is found but there's no reported collation version. Shouldn't it also complain if it's the opposite? It's otherwise a backdoor to make sure there won't be any check about the version anymore, and while it can probably happen if you mess with the catalogs it still doesn't look great. + /* + * template0 shouldn't have any collation-dependent objects, so unset + * the collation version. This avoids warnings when making a new + * database from it. + */ + "UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n\n", I'm not opposed, but shouldn't there indeed be a warning in case of discrepancy in the source database (whether template or not)? # update pg_database set datcollversion = 'meh' where datname in ('postgres', 'template1'); UPDATE 2 # create database test1 template postgres; CREATE DATABASE # create database test2 template template1; CREATE DATABASE # \c test2 WARNING: database "test2" has a collation version mismatch DETAIL: The database was created using collation version meh, but the operating system provides version 2.34. HINT: Rebuild all objects affected by collation in this database and run ALTER DATABASE test2 REFRESH COLLATION VERSION,or build PostgreSQL with the right library version. You are now connected to database "test2" as user "rjuju". The rest of the patch looks good to me. There's notably pg_dump and pg_upgrade support so it indeed looks complete. > One bonus thing would be to add > a query to the ALTER DATABASE ref page similar to the one on the ALTER > COLLATION ref page that identifies the objects that are affected by outdated > collations. The database version of that might just show all > collation-using objects or something like that. Suggestions welcome. That would be nice, but that's something quite hard to do and the resulting query would be somewhat unreadable. Also, you need to look at custom data types, expression and quals at least, so I'm not even sure that you can actually do it in pure SQL without additional infrastructure. > Also, one curious behavior is that if you get to a situation where the > collation version is mismatched, every time an autovacuum worker launches > you get the collation version mismatch warning in the log. Maybe that's > actually correct, but maybe we want to dial it down a bit for > non-interactive sessions. I'm +0.5 to keep it that way. Index corruption is a real danger, so if you have enough autovacuum worker launched to have a real problem with that, you clearly should take care of the problem even faster.
Attachment
On 07.02.22 11:29, Julien Rouhaud wrote: > - there should be a mention to the need for a catversion bump in the message > comment done > - there is no test Suggestions where to put it? We don't really have tests for the collation-level versioning either, do we? > - it's missing some updates in create_database.sgml, and psql tab completion > for CREATE DATABASE with the new collation_version defelem. Added to create_database.sgml, but not to psql. We don't have completion for the collation option either, since it's only meant to be used by pg_upgrade, not interactively. > - that's not really something new with this patch, but should we output the > collation version info or mismatch info in \l / \dO? It's a possibility. Perhaps there is a question of performance if we show it in \l and people have tons of databases and they have to make a locale call for each one. As you say, it's more an independent feature, but it's worth looking into. > + if (!actual_versionstr) > + ereport(ERROR, > + (errmsg("database \"%s\" has no actual collation version, but a version was specified", > + name)));- > > this means you can't connect on such a database anymore. The level is probably > ok for collation version check, but for db isn't that too much? Right, changed to warning. > +Oid > +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) > +{ > + Relation rel; > + Oid dboid; > + HeapTuple tup; > + Datum datum; > + bool isnull; > + char *oldversion; > + char *newversion; > + > + rel = table_open(DatabaseRelationId, RowExclusiveLock); > + dboid = get_database_oid(stmt->dbname, false); > + > + if (!pg_database_ownercheck(dboid, GetUserId())) > + aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, > + stmt->dbname); > + > + tup = SearchSysCacheCopy1(DATABASEOID, ObjectIdGetDatum(dboid)); > + if (!HeapTupleIsValid(tup)) > + elog(ERROR, "cache lookup failed for database %u", dboid); > > Is that ok to not obtain a lock on the database when refreshing the collation? That code takes a RowExclusiveLock on pg_database. Did you have something else in mind? > + /* > + * Check collation version. See similar code in > + * pg_newlocale_from_collation(). > + */ > + datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datcollversion, > + &isnull); > + if (!isnull) > + { > > This (and pg_newlocale_from_collation()) reports a problem if a recorded > collation version is found but there's no reported collation version. > Shouldn't it also complain if it's the opposite? It's otherwise a backdoor to > make sure there won't be any check about the version anymore, and while it can > probably happen if you mess with the catalogs it still doesn't look great. get_collation_actual_version() always returns either null or not null for a given installation. So the situation that the stored version is null and the actual version is not null can only happen as part of a software upgrade. In that case, all uses of collations after an upgrade would immediately start complaining about missing versions, which seems like a bad experience. Users can explicitly opt in to version tracking by running REFRESH VERSION once. > + /* > + * template0 shouldn't have any collation-dependent objects, so unset > + * the collation version. This avoids warnings when making a new > + * database from it. > + */ > + "UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n\n", > > I'm not opposed, but shouldn't there indeed be a warning in case of discrepancy > in the source database (whether template or not)? > > # update pg_database set datcollversion = 'meh' where datname in ('postgres', 'template1'); > UPDATE 2 > > # create database test1 template postgres; > CREATE DATABASE > > # create database test2 template template1; > CREATE DATABASE > > # \c test2 > WARNING: database "test2" has a collation version mismatch I don't understand what the complaint is here. It seems to work ok?
Attachment
On Mon, Feb 07, 2022 at 04:44:24PM +0100, Peter Eisentraut wrote: > On 07.02.22 11:29, Julien Rouhaud wrote: > > > - there is no test > > Suggestions where to put it? We don't really have tests for the > collation-level versioning either, do we? There's so limited testing in collate.* regression tests, so I thought it would be ok to add it there. At least some ALTER DATABASE ... REFRESH VERSION would be good, similarly to collation-level versioning. > > - it's missing some updates in create_database.sgml, and psql tab completion > > for CREATE DATABASE with the new collation_version defelem. > > Added to create_database.sgml, but not to psql. We don't have completion > for the collation option either, since it's only meant to be used by > pg_upgrade, not interactively. Ok. > > > - that's not really something new with this patch, but should we output the > > collation version info or mismatch info in \l / \dO? > > It's a possibility. Perhaps there is a question of performance if we show > it in \l and people have tons of databases and they have to make a locale > call for each one. As you say, it's more an independent feature, but it's > worth looking into. Agreed. > > +Oid > > +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) > > +{ > > + Relation rel; > > + Oid dboid; > > + HeapTuple tup; > > + Datum datum; > > + bool isnull; > > + char *oldversion; > > + char *newversion; > > + > > + rel = table_open(DatabaseRelationId, RowExclusiveLock); > > + dboid = get_database_oid(stmt->dbname, false); > > + > > + if (!pg_database_ownercheck(dboid, GetUserId())) > > + aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, > > + stmt->dbname); > > + > > + tup = SearchSysCacheCopy1(DATABASEOID, ObjectIdGetDatum(dboid)); > > + if (!HeapTupleIsValid(tup)) > > + elog(ERROR, "cache lookup failed for database %u", dboid); > > > > Is that ok to not obtain a lock on the database when refreshing the collation? > > That code takes a RowExclusiveLock on pg_database. Did you have something > else in mind? But that lock won't prevent a concurrent DROP DATABASE, so it's totally expected to hit that cache lookup failed error. There should either be a shdepLockAndCheckObject(), or changing the error message to some errmsg("data xxx does not exist"). > > + /* > > + * Check collation version. See similar code in > > + * pg_newlocale_from_collation(). > > + */ > > + datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datcollversion, > > + &isnull); > > + if (!isnull) > > + { > > > > This (and pg_newlocale_from_collation()) reports a problem if a recorded > > collation version is found but there's no reported collation version. > > Shouldn't it also complain if it's the opposite? It's otherwise a backdoor to > > make sure there won't be any check about the version anymore, and while it can > > probably happen if you mess with the catalogs it still doesn't look great. > > get_collation_actual_version() always returns either null or not null for a > given installation. So the situation that the stored version is null and > the actual version is not null can only happen as part of a software > upgrade. In that case, all uses of collations after an upgrade would > immediately start complaining about missing versions, which seems like a bad > experience. Users can explicitly opt in to version tracking by running > REFRESH VERSION once. Ah right, I do remember that point which was also discussed in the collation version tracking. Sorry about the noise. > > + /* > > + * template0 shouldn't have any collation-dependent objects, so unset > > + * the collation version. This avoids warnings when making a new > > + * database from it. > > + */ > > + "UPDATE pg_database SET datcollversion = NULL WHERE datname = 'template0';\n\n", > > > > I'm not opposed, but shouldn't there indeed be a warning in case of discrepancy > > in the source database (whether template or not)? > > > > # update pg_database set datcollversion = 'meh' where datname in ('postgres', 'template1'); > > UPDATE 2 > > > > # create database test1 template postgres; > > CREATE DATABASE > > > > # create database test2 template template1; > > CREATE DATABASE > > > > # \c test2 > > WARNING: database "test2" has a collation version mismatch > > I don't understand what the complaint is here. It seems to work ok? The comment says that you explicitly set a NULL collation version to avoid warning when creating a database using template0 as a template, presumably after a collation lib upgrade. But as far as I can see the source database collation version is not checked when creating a new database, so it seems to me that either the comment is wrong, or we need another check for that. The latter seems preferable to me.
On 07.02.22 17:08, Julien Rouhaud wrote: > There's so limited testing in collate.* regression tests, so I thought it would > be ok to add it there. At least some ALTER DATABASE ... REFRESH VERSION would > be good, similarly to collation-level versioning. I don't think you can run ALTER DATABASE from the regression test scripts, since the database name is not fixed. You'd have to paste the command together using psql tricks or something. I guess it could be done, but maybe there is a better solution. We could put it into the createdb test suite, or write a new TAP test suite somewhere. There is no good precedent for this. >> That code takes a RowExclusiveLock on pg_database. Did you have something >> else in mind? > > But that lock won't prevent a concurrent DROP DATABASE, so it's totally > expected to hit that cache lookup failed error. There should either be a > shdepLockAndCheckObject(), or changing the error message to some errmsg("data > xxx does not exist"). I was not familiar with that function. AFAICT, it is only used for database and role settings (AlterDatabaseSet(), AlterRoleSet()), but not when just updating the role or database catalog (e.g., AlterRole(), RenameRole(), RenameDatabase()). So I don't think it is needed here. Maybe I'm missing something.
On Tue, Feb 08, 2022 at 12:14:02PM +0100, Peter Eisentraut wrote: > On 07.02.22 17:08, Julien Rouhaud wrote: > > There's so limited testing in collate.* regression tests, so I thought it would > > be ok to add it there. At least some ALTER DATABASE ... REFRESH VERSION would > > be good, similarly to collation-level versioning. > > I don't think you can run ALTER DATABASE from the regression test scripts, > since the database name is not fixed. You'd have to paste the command > together using psql tricks or something. I guess it could be done, but > maybe there is a better solution. We could put it into the createdb test > suite, or write a new TAP test suite somewhere. There is no good precedent > for this. I was thinking using a simple DO command, as there are already some other usage for that for non deterministic things (like TOAST tables). If it's too problematic I'm fine with not having tests for that for now. > > > That code takes a RowExclusiveLock on pg_database. Did you have something > > > else in mind? > > > > But that lock won't prevent a concurrent DROP DATABASE, so it's totally > > expected to hit that cache lookup failed error. There should either be a > > shdepLockAndCheckObject(), or changing the error message to some errmsg("data > > xxx does not exist"). > > I was not familiar with that function. AFAICT, it is only used for database > and role settings (AlterDatabaseSet(), AlterRoleSet()), but not when just > updating the role or database catalog (e.g., AlterRole(), RenameRole(), > RenameDatabase()). So I don't think it is needed here. Maybe I'm missing > something. I'm just saying that without such a lock you can easily trigger the "cache lookup" error, and that's something that's supposed to happen with normal usage I think. So it should be a better message saying that the database has been concurrently dropped, or actually simply does not exist like it's done in AlterDatabaseOwner() for the same pattern: [...] tuple = systable_getnext(scan); if (!HeapTupleIsValid(tuple)) ereport(ERROR, (errcode(ERRCODE_UNDEFINED_DATABASE), errmsg("database \"%s\" does not exist", dbname))); [...] Apart from that I still think that we should check the collation version of the source database when creating a new database. It won't cost much but will give the DBA a chance to recreate the indexes before risking invalid index usage.
On Mon, Feb 07, 2022 at 04:44:24PM +0100, Peter Eisentraut wrote: > On 07.02.22 11:29, Julien Rouhaud wrote: > > - that's not really something new with this patch, but should we output the > > collation version info or mismatch info in \l / \dO? > > It's a possibility. Perhaps there is a question of performance if we show > it in \l and people have tons of databases and they have to make a locale > call for each one. As you say, it's more an independent feature, but it's > worth looking into. Ok, but \l+ shows among others the database size, so I guess at that level also showing this should be fine? (or is that already the case?) Michael
On 08.02.22 13:55, Julien Rouhaud wrote: > I'm just saying that without such a lock you can easily trigger the "cache > lookup" error, and that's something that's supposed to happen with normal > usage I think. So it should be a better message saying that the database has > been concurrently dropped, or actually simply does not exist like it's done in > AlterDatabaseOwner() for the same pattern: > > [...] > tuple = systable_getnext(scan); > if (!HeapTupleIsValid(tuple)) > ereport(ERROR, > (errcode(ERRCODE_UNDEFINED_DATABASE), > errmsg("database \"%s\" does not exist", dbname))); > [...] In my code, the existence of the database is checked by dboid = get_database_oid(stmt->dbname, false); This also issues an appropriate user-facing error message if the database does not exist. The flow in AlterDatabaseOwner() is a bit different, it looks up the pg_database tuple directly from the name. I think both are correct. My code has been copied from the analogous code in AlterCollation().
On Wed, Feb 09, 2022 at 12:48:35PM +0100, Peter Eisentraut wrote: > On 08.02.22 13:55, Julien Rouhaud wrote: > > I'm just saying that without such a lock you can easily trigger the "cache > > lookup" error, and that's something that's supposed to happen with normal > > usage I think. So it should be a better message saying that the database has > > been concurrently dropped, or actually simply does not exist like it's done in > > AlterDatabaseOwner() for the same pattern: > > > > [...] > > tuple = systable_getnext(scan); > > if (!HeapTupleIsValid(tuple)) > > ereport(ERROR, > > (errcode(ERRCODE_UNDEFINED_DATABASE), > > errmsg("database \"%s\" does not exist", dbname))); > > [...] > > In my code, the existence of the database is checked by > > dboid = get_database_oid(stmt->dbname, false); > > This also issues an appropriate user-facing error message if the database > does not exist. Yes but if you run a DROP DATABASE concurrently you will either get a "database does not exist" or "cache lookup failed" depending on whether the DROP is processed before or after the get_database_oid(). I agree that it's not worth trying to make it concurrent-drop safe, but I also thought that throwing plain elog(ERROR) should be avoided when reasonably doable. And in that situation we know it can happen in normal situation, so having a real error message looks like a cost-free improvement. Now if it's better to have a cache lookup error even in that situation just for safety or something ok, it's not like trying to refresh a db collation and having someone else dropping it at the same time is going to be a common practice anway. > The flow in AlterDatabaseOwner() is a bit different, it looks up the > pg_database tuple directly from the name. I think both are correct. My > code has been copied from the analogous code in AlterCollation(). I also think it would be better to have a "collation does not exist" in the syscache failure message, but same here dropping collation is probably even less frequent than dropping database, let alone while refreshing the collation version.
On 08.02.22 13:55, Julien Rouhaud wrote: > Apart from that I still think that we should check the collation version of the > source database when creating a new database. It won't cost much but will give > the DBA a chance to recreate the indexes before risking invalid index usage. A question on this: In essence, this would be putting code into createdb() similar to the code in postinit.c:CheckMyDatabase(). But what should we make it do and say exactly? After thinking about this for a bit, I suggest: If the actual collation version of the newly created database does not match the recorded collation version of the template database, we should error out and make the user fix the template database (by reindexing there etc.). The alternative is to warn, as it does now in postinit.c. But then the phrasing of the message becomes complicated: Should we make the user fix the new database or the template database or both? And if they don't fix the template database, they will have the same problem again. So making it a hard failure seems better to me.
On 2022-Feb-08, Julien Rouhaud wrote: > On Tue, Feb 08, 2022 at 12:14:02PM +0100, Peter Eisentraut wrote: > > I don't think you can run ALTER DATABASE from the regression test scripts, > > since the database name is not fixed. You'd have to paste the command > > together using psql tricks or something. I guess it could be done, but > > maybe there is a better solution. We could put it into the createdb test > > suite, or write a new TAP test suite somewhere. There is no good precedent > > for this. > > I was thinking using a simple DO command, as there are already some other usage > for that for non deterministic things (like TOAST tables). If it's too > problematic I'm fine with not having tests for that for now. You can do this: select current_database() as datname \gset alter database :"datname" owner to foo; -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Wed, Feb 09, 2022 at 05:12:41PM +0100, Peter Eisentraut wrote: > On 08.02.22 13:55, Julien Rouhaud wrote: > > Apart from that I still think that we should check the collation version of the > > source database when creating a new database. It won't cost much but will give > > the DBA a chance to recreate the indexes before risking invalid index usage. > > A question on this: In essence, this would be putting code into createdb() > similar to the code in postinit.c:CheckMyDatabase(). But what should we > make it do and say exactly? > > After thinking about this for a bit, I suggest: If the actual collation > version of the newly created database does not match the recorded collation > version of the template database, we should error out and make the user fix > the template database (by reindexing there etc.). I'm not sure what you really mean by "actual collation version of the newly created database". Is it really a check after having done all the work or just checking a discrepancy when computing the to-be-created database version from the source database, ie. something like if (dbcollversion == NULL) + { dbcollversion = src_collversion; + if src_collversion != get_collation_actual_version(the source db) + // raise error or warning > The alternative is to warn, as it does now in postinit.c. But then the > phrasing of the message becomes complicated: Should we make the user fix the > new database or the template database or both? And if they don't fix the > template database, they will have the same problem again. So making it a > hard failure seems better to me. Agreed, I'm in favor of a hard error. Maybe a message like: errmsg(cannot create database %s) errdetail(the template database %s was created using collation version %s, but the operating system provides version %s) Also, that check shouldn't be done when using the COLLATION_VERSION option of create database, since it's there for pg_upgrade usage?
New patch that fixes all reported issues, I think: - Added test for ALTER DATABASE / REFRESH COLLATION VERSION - Rewrote AlterDatabaseRefreshCollVersion() with better locking - Added version checking in createdb()
Attachment
On Thu, Feb 10, 2022 at 09:57:59AM +0100, Peter Eisentraut wrote: > New patch that fixes all reported issues, I think: > > - Added test for ALTER DATABASE / REFRESH COLLATION VERSION > > - Rewrote AlterDatabaseRefreshCollVersion() with better locking > > - Added version checking in createdb() Thanks! All issues are indeed fixed. I just have a few additional comments: > From 290ebb9ca743a2272181f435d5ea76d8a7280a0a Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut <peter@eisentraut.org> > Date: Thu, 10 Feb 2022 09:44:20 +0100 > Subject: [PATCH v4] Database-level collation version tracking > + * collation version was specified explicitly as an statement option; that typo, should be "as a statement" > + actual_versionstr = get_collation_actual_version(COLLPROVIDER_LIBC, dbcollate); > + if (!actual_versionstr) > + ereport(ERROR, > + (errmsg("template database \"%s\" has a collation version, but no actual collation version could bedetermined", > + dbtemplate))); > + > + if (strcmp(actual_versionstr, src_collversion) != 0) > + ereport(ERROR, > + (errmsg("template database \"%s\" has a collation version mismatch", > + dbtemplate), > + errdetail("The template database was created using collation version %s, " > + "but the operating system provides version %s.", > + src_collversion, actual_versionstr), > + errhint("Rebuild all objects affected by collation in the template database and run " > + "ALTER DATABASE %s REFRESH COLLATION VERSION, " > + "or build PostgreSQL with the right library version.", > + quote_identifier(dbtemplate)))); After a second read I think the messages are slightly ambiguous. What do you think about specifying the problematic collation name and provider? For now we only support libc default collation so users will probably have to reindex almost everything on that database (not sure if the versioning is more fine grained on Windows), but we should probably still specify "affected by libc collation" in the errhint so they have a chance to avoid unnecessary reindex. And this will hopefully become more important to have those information, when ICU default collations will land :) > +/* > + * ALTER DATABASE name REFRESH COLLATION VERSION > + */ > +ObjectAddress > +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) I'm wondering why you changed this function to return an ObjectAddress rather than an Oid? There's no event trigger support for ALTER DATABASE, and the rest of similar utility commands also returns Oid. Other than that it all looks good to me!
On 10.02.22 12:08, Julien Rouhaud wrote: >> + errhint("Rebuild all objects affected by collation in the template database and run " >> + "ALTER DATABASE %s REFRESH COLLATION VERSION, " >> + "or build PostgreSQL with the right library version.", >> + quote_identifier(dbtemplate)))); > > After a second read I think the messages are slightly ambiguous. What do you > think about specifying the problematic collation name and provider? > > For now we only support libc default collation so users will probably have to > reindex almost everything on that database (not sure if the versioning is more > fine grained on Windows), but we should probably still specify "affected by > libc collation" in the errhint so they have a chance to avoid unnecessary > reindex. I think accurate would be something like "objects using the default collation", since objects using a specific collation are not meant, even if they use the same provider. >> +/* >> + * ALTER DATABASE name REFRESH COLLATION VERSION >> + */ >> +ObjectAddress >> +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) > > I'm wondering why you changed this function to return an ObjectAddress rather > than an Oid? There's no event trigger support for ALTER DATABASE, and the rest > of similar utility commands also returns Oid. Hmm, I was looking at RenameDatabase() and AlterDatabaseOwner(), which return ObjectAddress.
On Fri, Feb 11, 2022 at 12:07:02PM +0100, Peter Eisentraut wrote: > On 10.02.22 12:08, Julien Rouhaud wrote: > > > + errhint("Rebuild all objects affected by collation in the template database and run " > > > + "ALTER DATABASE %s REFRESH COLLATION VERSION, " > > > + "or build PostgreSQL with the right library version.", > > > + quote_identifier(dbtemplate)))); > > > > After a second read I think the messages are slightly ambiguous. What do you > > think about specifying the problematic collation name and provider? > > > > For now we only support libc default collation so users will probably have to > > reindex almost everything on that database (not sure if the versioning is more > > fine grained on Windows), but we should probably still specify "affected by > > libc collation" in the errhint so they have a chance to avoid unnecessary > > reindex. > > I think accurate would be something like "objects using the default > collation", since objects using a specific collation are not meant, even if > they use the same provider. Technically is the objects explicitly use the same collation as the default collation they should be impacted the same way, but agreed. > > > +/* > > > + * ALTER DATABASE name REFRESH COLLATION VERSION > > > + */ > > > +ObjectAddress > > > +AlterDatabaseRefreshColl(AlterDatabaseRefreshCollStmt *stmt) > > > > I'm wondering why you changed this function to return an ObjectAddress rather > > than an Oid? There's no event trigger support for ALTER DATABASE, and the rest > > of similar utility commands also returns Oid. > > Hmm, I was looking at RenameDatabase() and AlterDatabaseOwner(), which > return ObjectAddress. Apparently I managed to only check AlterDatabase and AlterDatabaseSet, which both return an Oid. Maybe we could also update those two to also return an ObjectAddress, for consistency?
On 11.02.22 13:51, Julien Rouhaud wrote: >>> I'm wondering why you changed this function to return an ObjectAddress rather >>> than an Oid? There's no event trigger support for ALTER DATABASE, and the rest >>> of similar utility commands also returns Oid. >> >> Hmm, I was looking at RenameDatabase() and AlterDatabaseOwner(), which >> return ObjectAddress. > > Apparently I managed to only check AlterDatabase and AlterDatabaseSet, which > both return an Oid. Maybe we could also update those two to also return an > ObjectAddress, for consistency? I have committed this patch. I didn't address the above issue. I looked at it a bit, but I also found other (non-database) object types that had a mix of different return types. It's not clear to me what this is all supposed to mean. If no one is checking the return, they should really all be turned into void, IMO. Maybe this should be a separate discussion.
Hi, On Mon, Feb 14, 2022 at 09:55:19AM +0100, Peter Eisentraut wrote: > I have committed this patch. Great! Do you plan to send a rebased version of the ICU default collation soon or should I start looking at the current v4? > I didn't address the above issue. I looked at it a bit, but I also found > other (non-database) object types that had a mix of different return types. > It's not clear to me what this is all supposed to mean. If no one is > checking the return, they should really all be turned into void, IMO. Maybe > this should be a separate discussion. Agreed.
On 14.02.22 10:14, Julien Rouhaud wrote: > Do you plan to send a rebased version of the ICU default collation > soon or should I start looking at the current v4? I will send an updated patch in the next few days.
On 2022-Feb-14, Peter Eisentraut wrote: > On 11.02.22 13:51, Julien Rouhaud wrote: > > > > I'm wondering why you changed this function to return an ObjectAddress rather > > > > than an Oid? There's no event trigger support for ALTER DATABASE, and the rest > > > > of similar utility commands also returns Oid. > > > > > > Hmm, I was looking at RenameDatabase() and AlterDatabaseOwner(), which > > > return ObjectAddress. > > > > Apparently I managed to only check AlterDatabase and AlterDatabaseSet, which > > both return an Oid. Maybe we could also update those two to also return an > > ObjectAddress, for consistency? > I didn't address the above issue. I looked at it a bit, but I also found > other (non-database) object types that had a mix of different return types. > It's not clear to me what this is all supposed to mean. If no one is > checking the return, they should really all be turned into void, IMO. Maybe > this should be a separate discussion. IIRC we changed the return types of all DDL back when we were doing the event triggers work (first to OIDs and then to ObjectAddress), but we didn't realize at the time that shared objects such as databases etc were not going to be supported by event triggers. So those particular changes were for naught, but we never reverted them. Maybe it's OK to have all the functions supporting databases and tablespaces return void. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ <inflex> really, I see PHP as like a strange amalgamation of C, Perl, Shell <crab> inflex: you know that "amalgam" means "mixture with mercury", more or less, right? <crab> i.e., "deadly poison"