Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id 20200320135233.GA52612@nol
Whole thread Raw
In response to Re: Collation versioning  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Collation versioning
List pgsql-hackers
On Thu, Mar 19, 2020 at 08:12:47PM +0100, Julien Rouhaud wrote:
> On Thu, Mar 19, 2020 at 12:31:54PM +0900, Michael Paquier wrote:
> > On Wed, Mar 18, 2020 at 04:35:43PM +0100, Julien Rouhaud wrote:
> > > On Wed, Mar 18, 2020 at 09:56:40AM +0100, Julien Rouhaud wrote:
> > > AFAICT it was only missing a call to index_update_collation_versions() in
> > > ReindexRelationConcurrently.  I added regression tests to make sure that
> > > REINDEX, REINDEX [INDEX|TABLE] CONCURRENTLY and VACUUM FULL are doing what's
> > > expected.
> > 
> > If you add a call to index_update_collation_versions(), the old and
> > invalid index will use the same refobjversion as the new index, which
> > is the latest collation version of the system, no?  If the operation
> > is interrupted before the invalid index is dropped, then we would keep
> > a confusing value for refobjversion, because the old invalid index
> > does not rely on the new collation version, but on the old one.
> > Hence, it seems to me that it would be correct to have the old invalid
> > index either use an empty version string to say "we don't know"
> > because the index is invalid anyway, or keep a reference to the old
> > collation version intact.  I think that the latter is much more useful
> > for debugging issues when upgrading a subset of indexes if the
> > operation is interrupted for a reason or another.
> 
> Indeed, I confused the _ccold and _ccnew indexes.  So, the root cause is phase
> 4, more precisely the dependency swap in index_concurrently_swap.
> 
> A possible fix would be to teach changeDependenciesOf() to preserve the
> dependency version.  It'd be quite bit costly as this would mean an extra index
> search for each dependency row found.  We could probably skip the lookup if the
> row have a NULL recorded version, as version should either be null or non null
> for both objects.
> 
> I'm wondering if that's a good time to make changeDependenciesOf and
> changeDependenciesOn private, and instead expose a swapDependencies(classid,
> obj1, obj2) that would call both, as preserving the version doesn't really
> makes sense outside a switch.  It's als oa good way to ensure that no CCI is
> performed in the middle.

Hearing no complaints, I implemented that approach in attached v18.

Here's the new behavior for interrupted REINDEX CONCURRENTLY:

# drop table if exists t1;create table t1(id integer, val text); create index on t1(val collate "fr-x-icu");
NOTICE:  00000: table "t1" does not exist, skipping
DROP TABLE
CREATE TABLE
CREATE INDEX

# update pg_depend set refobjversion = 'meh' where refobjversion = '153.97';
UPDATE 1

# select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where
refobjversionis not null;
 
   objid    | indisvalid | refobjversion
------------+------------+---------------
 t1_val_idx | t          | meh
(1 row)

(on another session: begin; select * from t1 for update;)

# reindex table CONCURRENTLY t1;
^CCancel request sent
ERROR:  57014: canceling statement due to user request

# select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where
refobjversionis not null;
 
      objid       | indisvalid | refobjversion
------------------+------------+---------------
 t1_val_idx_ccold | f          | meh
 t1_val_idx       | t          | 153.97
(2 rows)

# reindex table CONCURRENTLY t1;
WARNING:  0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping
WARNING:  XX002: cannot reindex invalid index "pg_toast.pg_toast_16385_index_ccold" concurrently, skipping
REINDEX

# select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where
refobjversionis not null;
 
      objid       | indisvalid | refobjversion
------------------+------------+---------------
 t1_val_idx_ccold | f          | meh
 t1_val_idx       | t          | 153.97
(2 rows)

# reindex table t1;
WARNING:  0A000: cannot reindex invalid index "pg_toast.pg_toast_16385_index_ccold" on TOAST table, skipping
REINDEX

# select objid::regclass, indisvalid, refobjversion from pg_depend d join pg_index i on i.indexrelid = d.objid where
refobjversionis not null;
 
      objid       | indisvalid | refobjversion
------------------+------------+---------------
 t1_val_idx_ccold | t          | 153.97
 t1_val_idx       | t          | 153.97
(2 rows)


I also rebased the patchset against master (so removing the regcollation
patch), but no other changes otherwise, so there's still the direct updates on
the catalog in the regressoin tests.

Attachment

pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Next
From: Sergei Kornilov
Date:
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)