Thread: 16: Collation versioning and dependency helpers
Motivation: We haven't fully solved the changing collation-provider problem. An upgrade of the OS may change the version of libc or icu, and that might affect the collation, which could leave you with various corrupt database objects including: * indexes * constraints * range types or multiranges (or other types dependent on collation for internal consistency) * materialized views * partitioned tables (range or hash) There's discussion about trying to reliably detect these changes and remedy them. But there are major challenges; for instance, glibc doesn't give a reliable signal that a collation may have changed, which would leave us with a lot of false positives and create a new set of problems (e.g. reindexing when it's unnecessary). And even with ICU, we don't have a way to support multiple versions of a provider or of a single collation, so trying to upgrade would still be a hassle. Proposal: Add in some tools to make it easier for administrators to find out if they are at risk and solve the problem for themselves in a systematic way. Patches: 0001: Treat "default" collation as unpinned, so that entries in pg_depend are created. The rationale is that, since the "default" collation can change, it's not really an immutable system object, and it's worth tracking which objects are affected by it. It seems to bloat pg_depend by about 5-10% though -- that doesn't seem great, but I'm not sure if it's a real problem or not. 0002: Enable pg_collation_actual_version() to work on the default collation (oid=100) so that it doesn't need to be treated as a special case. 0003: Fix ALTER COLLATION "default" REFRESH VERSION, which currently throws an unhelpful internal error. Instead, issue a more helpful error that suggests "ALTER DATABASE ... REFRESH COLLATION VERSION" instead. 0004: Add system views: pg_collation_versions: quickly see the current (from the catalog) and actual (from the provider) versions of each collation pg_collation_dependencies: map of objects to the collations they depend on Along with these patches, you can use some tricks to verify data, such as /contrib/amcheck; or fix the data with things like: * REINDEX * VACUUM FULL/TRUNCATE/CLUSTER * REFRESH MATERIALIZED VIEW And then refresh the collation version when you're confident that your data is valid. TODO: * The dependencies view is not rigorously complete, because the directed dependency graph doesn't quite establish an "affected by" relationship. One exception is that a composite type doesn't depend on its associated relation, so a composite type over a range type doesn't depend on the range type. * Consider adding in some verification helpers that can verify that a value is still valid (e.g. a range type that depends on a collation might have corrupt values). We could have a collation verifier for types that are collation-dependenent, or perhaps just go through the input and output functions and catch any errors. * Consider better tracking of which collation versions were active on a particular object since the last REINDEX (or REFRESH MATERIALIZED VIEW, TRUNCATE, or other command that would remove any trace of data affected by the previous collation version). Regards, Jeff Davis
Attachment
On Sun, Oct 30, 2022 at 5:41 PM Jeff Davis <pgsql@j-davis.com> wrote: > We haven't fully solved the changing collation-provider problem. An > upgrade of the OS may change the version of libc or icu, and that might > affect the collation, which could leave you with various corrupt > database objects including: > > * indexes > * constraints > * range types or multiranges (or other types dependent > on collation for internal consistency) > * materialized views > * partitioned tables (range or hash) Check. > There's discussion about trying to reliably detect these changes and > remedy them. But there are major challenges; for instance, glibc > doesn't give a reliable signal that a collation may have changed, which > would leave us with a lot of false positives and create a new set of > problems (e.g. reindexing when it's unnecessary). And even with ICU, we > don't have a way to support multiple versions of a provider or of a > single collation, so trying to upgrade would still be a hassle. FWIW some experimental code for multi-version ICU is proposed for discussion here: https://commitfest.postgresql.org/40/3956/ > Proposal: > > Add in some tools to make it easier for administrators to find out if > they are at risk and solve the problem for themselves in a systematic > way. Excellent goal. > Patches: > > 0001: Treat "default" collation as unpinned, so that entries in > pg_depend are created. The rationale is that, since the "default" > collation can change, it's not really an immutable system object, and > it's worth tracking which objects are affected by it. It seems to bloat > pg_depend by about 5-10% though -- that doesn't seem great, but I'm not > sure if it's a real problem or not. FWIW we did this (plus a lot more) in the per-index version tracking feature reverted from 14. > 0002: Enable pg_collation_actual_version() to work on the default > collation (oid=100) so that it doesn't need to be treated as a special > case. Makes sense. > 0003: Fix ALTER COLLATION "default" REFRESH VERSION, which currently > throws an unhelpful internal error. Instead, issue a more helpful error > that suggests "ALTER DATABASE ... REFRESH COLLATION VERSION" instead. Makes sense. > 0004: Add system views: > pg_collation_versions: quickly see the current (from the catalog) > and actual (from the provider) versions of each collation > pg_collation_dependencies: map of objects to the collations they > depend on > > Along with these patches, you can use some tricks to verify data, such > as /contrib/amcheck; or fix the data with things like: > > * REINDEX > * VACUUM FULL/TRUNCATE/CLUSTER > * REFRESH MATERIALIZED VIEW > > And then refresh the collation version when you're confident that your > data is valid. Here you run into an argument that we had many times in that cycle: what's the point of views that suffer both false positives and false negatives? > TODO: > * Consider better tracking of which collation versions were active on > a particular object since the last REINDEX (or REFRESH MATERIALIZED > VIEW, TRUNCATE, or other command that would remove any trace of data > affected by the previous collation version). Right, the per-object dependency tracking feature, reverted from 14, aimed to do exactly that. It fell down on (1) some specific bugs that were hard to fix, like dependencies inherited via composite types when you change the composite type, and (2) doubt expressed by Tom, and earlier Stephen, that pg_depend was a good place to store version information.
On Sun, 2022-10-30 at 19:10 +1300, Thomas Munro wrote: > FWIW we did this (plus a lot more) in the per-index version tracking > feature reverted from 14. Thank you. I will catch up on that patch/thread. > > 0002: Enable pg_collation_actual_version() to work on the default > > Makes sense. > > > 0003: Fix ALTER COLLATION "default" REFRESH VERSION, which > > Makes sense. Committed these two small changes. > > 0004: Add system views: > > pg_collation_versions: quickly see the current (from the > > catalog) > > and actual (from the provider) versions of each collation > > pg_collation_dependencies: map of objects to the collations > > they > > depend on > > > > Along with these patches, you can use some tricks to verify data, > > such > > as /contrib/amcheck; or fix the data with things like: > > > > * REINDEX > > * VACUUM FULL/TRUNCATE/CLUSTER > > * REFRESH MATERIALIZED VIEW > > > > And then refresh the collation version when you're confident that > > your > > data is valid. > > Here you run into an argument that we had many times in that cycle: > what's the point of views that suffer both false positives and false > negatives? The pg_collation_versions view is just a convenience, useful because the default collation isn't represented normally in pg_collation so it needs to be special-cased. I could see how it would be tricky to precisely track the dependencies through composite types (that is, create the proper pg_depend records), but to just provide a view of the affected-by relationship seems more doable. I'll review the previous discussion and see what I come up with. Of course, the view will just show an "affected by" relationship, it won't show which objects are actually in violation of the current collation version. But it at least gives the administrator a starting place. Regards, Jeff Davis
Hi, On 2022-10-31 16:36:54 -0700, Jeff Davis wrote: > Committed these two small changes. FWIW, as it stands cfbot can't apply the remaining changes: http://cfbot.cputube.org/patch_40_3977.log Perhaps worth posting a new version? Or are the remaining patches abandoned in favor of the other threads? Greetings, Andres Freund
On Tue, 2022-12-06 at 10:53 -0800, Andres Freund wrote: > Perhaps worth posting a new version? Or are the remaining patches > abandoned in > favor of the other threads? Marked what is there as committed, and the remainder is abandoned in favor of other threads. Thanks, Jeff Davis