Re: Collation versioning - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: Collation versioning |
Date | |
Msg-id | 20200402130045.GE64485@nol Whole thread Raw |
In response to | Re: Collation versioning (Julien Rouhaud <rjuju123@gmail.com>) |
Responses |
Re: Collation versioning
|
List | pgsql-hackers |
On Fri, Mar 20, 2020 at 02:52:33PM +0100, Julien Rouhaud wrote: > On Thu, Mar 19, 2020 at 08:12:47PM +0100, Julien Rouhaud wrote: > > On Thu, Mar 19, 2020 at 12:31:54PM +0900, Michael Paquier wrote: > > > On Wed, Mar 18, 2020 at 04:35:43PM +0100, Julien Rouhaud wrote: > > > > On Wed, Mar 18, 2020 at 09:56:40AM +0100, Julien Rouhaud wrote: > > > > AFAICT it was only missing a call to index_update_collation_versions() in > > > > ReindexRelationConcurrently. I added regression tests to make sure that > > > > REINDEX, REINDEX [INDEX|TABLE] CONCURRENTLY and VACUUM FULL are doing what's > > > > expected. > > > > > > If you add a call to index_update_collation_versions(), the old and > > > invalid index will use the same refobjversion as the new index, which > > > is the latest collation version of the system, no? If the operation > > > is interrupted before the invalid index is dropped, then we would keep > > > a confusing value for refobjversion, because the old invalid index > > > does not rely on the new collation version, but on the old one. > > > Hence, it seems to me that it would be correct to have the old invalid > > > index either use an empty version string to say "we don't know" > > > because the index is invalid anyway, or keep a reference to the old > > > collation version intact. I think that the latter is much more useful > > > for debugging issues when upgrading a subset of indexes if the > > > operation is interrupted for a reason or another. > > > > Indeed, I confused the _ccold and _ccnew indexes. So, the root cause is phase > > 4, more precisely the dependency swap in index_concurrently_swap. > > > > A possible fix would be to teach changeDependenciesOf() to preserve the > > dependency version. It'd be quite bit costly as this would mean an extra index > > search for each dependency row found. We could probably skip the lookup if the > > row have a NULL recorded version, as version should either be null or non null > > for both objects. > > > > I'm wondering if that's a good time to make changeDependenciesOf and > > changeDependenciesOn private, and instead expose a swapDependencies(classid, > > obj1, obj2) that would call both, as preserving the version doesn't really > > makes sense outside a switch. It's als oa good way to ensure that no CCI is > > performed in the middle. > > Hearing no complaints, I implemented that approach in attached v18. > > Here's the new behavior for interrupted REINDEX CONCURRENTLY: > > # drop table if exists t1;create table t1(id integer, val text); create index on t1(val collate "fr-x-icu"); > NOTICE: 00000: table "t1" does not exist, skipping > DROP TABLE > CREATE TABLE > CREATE INDEX > > # update pg_depend set refobjversion = 'meh' where refobjversion = '153.97'; > UPDATE 1 > > # select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where refobjversionis not null; > objid | indisvalid | refobjversion > ------------+------------+--------------- > t1_val_idx | t | meh > (1 row) > > (on another session: begin; select * from t1 for update;) > > # reindex table CONCURRENTLY t1; > ^CCancel request sent > ERROR: 57014: canceling statement due to user request > > # select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where refobjversionis not null; > objid | indisvalid | refobjversion > ------------------+------------+--------------- > t1_val_idx_ccold | f | meh > t1_val_idx | t | 153.97 > (2 rows) > > # reindex table CONCURRENTLY t1; > WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping > WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16385_index_ccold" concurrently, skipping > REINDEX > > # select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where refobjversionis not null; > objid | indisvalid | refobjversion > ------------------+------------+--------------- > t1_val_idx_ccold | f | meh > t1_val_idx | t | 153.97 > (2 rows) > > # reindex table t1; > WARNING: 0A000: cannot reindex invalid index "pg_toast.pg_toast_16385_index_ccold" on TOAST table, skipping > REINDEX > > # select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where refobjversionis not null; > objid | indisvalid | refobjversion > ------------------+------------+--------------- > t1_val_idx_ccold | t | 153.97 > t1_val_idx | t | 153.97 > (2 rows) > > > I also rebased the patchset against master (so removing the regcollation > patch), but no other changes otherwise, so there's still the direct updates on > the catalog in the regressoin tests. Conflict since 2f9eb3132 (pg_dump: Allow dumping data of specific foreign servers), v19 attached.
Attachment
pgsql-hackers by date: