Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) |
Date | |
Msg-id | 20190501190550.jigb3tem3xxzspad@alap3.anarazel.de Whole thread Raw |
In response to | Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) |
List | pgsql-hackers |
Hi, On 2019-05-01 14:44:12 -0400, Tom Lane wrote: > I'm busy looking into the REINDEX-on-pg_class mess, and one thing I found > while testing HEAD is that with CLOBBER_CACHE_ALWAYS on, once you've > gotten a failure, your session is hosed: > > regression=# select 1; > ?column? > ---------- > 1 > (1 row) > > regression=# reindex index pg_class_oid_index; > psql: ERROR: could not read block 0 in file "base/16384/35344": read only 0 of 8192 bytes > regression=# select 1; > psql: ERROR: could not open file "base/16384/35344": No such file or directory Yea, I noticed that too. Lead me astray for a while, because it triggered apparent REINDEX failures for pg_index too, even though not actually related to the reindex. > The problem is that the nailed relcache entry for pg_class_oid_index got > updated to point to the new relfilenode, and that change did not get > undone by transaction abort. There seem to be several bugs contributing > to that, but the one I'm asking about right now is that commit ae9aba69a > caused RelationCacheInvalidate to skip over relations that have > rd_newRelfilenodeSubid set, as indeed pg_class_oid_index will have at the > instant of the failure. That commit message is quite unhelpful about what exactly this was intended to fix. I assume it was intended to make COPY FREEZE usage predictable, but... > This seems quite wrong, because it prevents us from rebuilding the > entry with corrected values. In particular notice that the change > causes us to skip the RelationInitPhysicalAddr call that would > normally be applied to a nailed mapped index in that loop. That's > completely fatal in this case --- it keeps us from restoring the > correct relfilenode that the mapper would now tell us, if we only > bothered to ask. Indeed. I'm a bit surprised that doesn't lead to more problems. Not sure I understand where the RelationCacheInvalidate() call is coming from in this case though. Shouldn't the entry have been invalidated individually through ATEOXact_Inval(false)? > I think perhaps what we should do is revert ae9aba69a and instead do > > relcacheInvalsReceived++; > > - if (RelationHasReferenceCountZero(relation)) > + if (RelationHasReferenceCountZero(relation) && > + relation->rd_newRelfilenodeSubid == InvalidSubTransactionId) > { > /* Delete this entry immediately */ > Assert(!relation->rd_isnailed); > > so that entries with rd_newRelfilenodeSubid nonzero are preserved but are > rebuilt. Seems like a reasonable approach. > There might be an argument for treating rd_createSubid the same way, > not sure. That field wouldn't ever be set on a system catalog, so > it's less of a hazard, but there's something to be said for treating > the two fields alike. Seems sensible to treat it the same way. Not sure if there's an actual problem where the current treatment could cause a problem, but seems worth improving. Greetings, Andres Freund
pgsql-hackers by date: