Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Date
Msg-id 26665.1556218209@sss.pgh.pa.us
Whole thread Raw
In response to Re: REINDEX INDEX results in a crash for an index of pg_class since9.6  (Andres Freund <andres@anarazel.de>)
Responses Re: REINDEX INDEX results in a crash for an index of pg_class since9.6
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2019-04-25 12:29:16 -0400, Tom Lane wrote:
>> So looking at that, it seems like the table_relation_set_new_filenode
>> API is pretty darn ill-designed.  It assumes that it's passed an
>> already-entirely-valid relcache entry, but it also supposes that
>> it can pass back information that needs to go into the relation's
>> pg_class entry.  One or the other side of that has to give, unless
>> you want to doom everything to updating pg_class twice.

> I'm not super happy about it either - but I think that's somewhat of an
> outgrowth of how this worked before.

I'm not saying that the previous code was nice; I'm just saying that
what is there in HEAD needs to be factored differently so that we
can solve this problem in a reasonable way.

> Perhaps that's because I don't quite understand what you mean with "It
> assumes that it's passed an already-entirely-valid relcache entry". What
> do you mean by that / where does it assume that?

Well, I can see heapam_relation_set_new_filenode touching all of these
fields right now:

rel->rd_node
rel->rd_rel->relpersistence (and why is it looking at that rather than
the passed-in persistence???)
rel->rd_rel->relkind
whatever RelationOpenSmgr touches
rel->rd_smgr

As far as I can see, there is no API restriction on what parts of the
relcache entry it may presume are valid.  It *certainly* thinks that
rd_rel is valid, which is rather at odds with the fact that this has
to be called before the pg_class entry exists all (for the creation
case) or has been updated (for the set-new-relfilenode case).  Unless
you want to redefine things so that we create/update the pg_class
entry, put it into rel->rd_rel, call relation_set_new_filenode, and
then update the pg_class entry again with what that function gives back
for the xmin fields.

That's obviously stupid, of course.  But my point is that we need to
restrict what the function can touch or assume valid, if it's going
to be called before the pg_class update happens.  And I'd rather that
we did so by restricting its argument list so that it hasn't even got
access to stuff we don't want it assuming valid.

Also, in order to fix this problem, we cannot change the actual
relcache entry contents until after we've performed the tuple update.
So if we want the tableam code to be getting the relfilenode or
persistence info out of the relcache entry, rather than passing
those as standalone parameters, the call can't happen till after
the tuple update and CCI call.  That's why I was thinking about
splitting it into two functions.  Get the xid values, update the
pg_class tuple, CCI, then do relation_set_new_filenode with the
updated relcache entry would work.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: pg_waldump and PREPARE
Next
From: Alvaro Herrera
Date:
Subject: Re: Segfault when restoring -Fd dump on current HEAD