Re: Checking for missing heap/index files - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Checking for missing heap/index files
Date
Msg-id 2747057.1666129459@sss.pgh.pa.us
In response to Re: Checking for missing heap/index files  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Checking for missing heap/index files  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 18, 2022 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Isn't it already the case (or could be made so) that relation file
>> removal happens only in the checkpointer?

> I believe that individual backends directly remove all relation forks
> other than the main fork and all segments other than the first one.

Yeah, obviously some changes would need to be made to that, but ISTM
we could just treat all the forks as we now treat the first one.

> The discussion on various other threads has been in the direction of
> trying to standardize on moving that last case out of the checkpointer
> - i.e. getting rid of what Thomas dubbed "tombstone" files - which is
> pretty much the exact opposite of this proposal.

Yeah, we'd have to give up on that.  If that goes anywhere then
it kills this idea.

> But even apart from
> that, I don't think this would be that easy to implement. If you
> removed a large relation, you'd have to tell the checkpointer to
> remove many files instead of just 1.

The backends just implement this by deleting files until they don't
find the next one in sequence.  I fail to see how it'd be any
harder for the checkpointer to do that.
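
As a rough illustration of the delete-until-missing loop described above, a minimal C sketch might look like the following. It is not PostgreSQL's actual storage-manager code. The function name remove_segments and the fixed 1024-byte path buffer are assumptions made for this example. The segment naming (base, base.1, base.2, ...) follows PostgreSQL's on-disk convention for relation segments.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: unlink "base", then "base.1", "base.2", ...
 * and stop at the first segment that does not exist.
 *
 * Returns the number of files removed, or -1 on an unexpected error
 * (errno is left as set by unlink(), or ENAMETOOLONG for a long path).
 */
static int
remove_segments(const char *base)
{
    char path[1024];
    int  removed = 0;

    assert(base != NULL);

    for (int segno = 0;; segno++)
    {
        int len;

        /* Segment 0 has no suffix; later segments are "base.N". */
        if (segno == 0)
            len = snprintf(path, sizeof(path), "%s", base);
        else
            len = snprintf(path, sizeof(path), "%s.%d", base, segno);
        if (len < 0 || (size_t) len >= sizeof(path))
        {
            errno = ENAMETOOLONG;
            return -1;
        }

        if (unlink(path) != 0)
        {
            if (errno == ENOENT)
                break;          /* next segment is missing: we are done */
            return -1;          /* any other failure is reported */
        }
        removed++;
    }
    return removed;
}
```

Nothing in this loop needs to know the segment count in advance, so the caller only has to pass one base path regardless of how large the relation was.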

> And I don't think we really need to do any of that. We could invent a
> new kind of lock tag for <dboid/tsoid> combination. Take a share lock
> to create or remove files. Take an exclusive lock to scan the
> directory. I think that accomplishes the same thing as your proposal,
> but more directly, and with less overhead. It's still substantially
> more than NO overhead, though.
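
The locking pattern in the quoted proposal (many file creators/removers share the lock, a directory scanner takes it exclusively) can be sketched outside PostgreSQL as follows. This is only an analogy: it swaps in flock() on a per-directory lock file in place of a PostgreSQL heavyweight lock tag, and every function name here (dirlock_open, locked_create, locked_unlink, locked_scan) is hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical: open (creating if needed) the lock file for one directory. */
static int
dirlock_open(const char *lockpath)
{
    assert(lockpath != NULL);
    return open(lockpath, O_RDWR | O_CREAT, 0600);
}

/* Create an empty file while holding the lock in SHARED mode. */
static int
locked_create(int lockfd, const char *path)
{
    int fd, saved_errno;

    if (flock(lockfd, LOCK_SH) != 0)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    saved_errno = errno;
    if (fd >= 0)
        close(fd);
    flock(lockfd, LOCK_UN);
    errno = saved_errno;
    return (fd >= 0) ? 0 : -1;
}

/* Remove a file while holding the lock in SHARED mode. */
static int
locked_unlink(int lockfd, const char *path)
{
    int rc, saved_errno;

    if (flock(lockfd, LOCK_SH) != 0)
        return -1;
    rc = unlink(path);
    saved_errno = errno;
    flock(lockfd, LOCK_UN);
    errno = saved_errno;
    return rc;
}

/*
 * Count directory entries (excluding "." and "..") while holding the
 * lock in EXCLUSIVE mode, so no cooperating creator or remover can
 * change the directory during the scan.
 */
static int
locked_scan(int lockfd, const char *dirpath)
{
    DIR           *dir;
    struct dirent *de;
    int            n = 0;

    if (flock(lockfd, LOCK_EX) != 0)
        return -1;
    dir = opendir(dirpath);
    if (dir == NULL)
    {
        flock(lockfd, LOCK_UN);
        return -1;
    }
    while ((de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") != 0 && strcmp(de->d_name, "..") != 0)
            n++;
    }
    closedir(dir);
    flock(lockfd, LOCK_UN);
    return n;
}
```

Note that flock() is advisory: the scan is only protected against code that goes through wrappers like these, and a direct unlink() elsewhere is not blocked.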

My concern about that is that it implies touching a whole lot of
places, and if you miss even one then you've lost whatever guarantee
you thought you were getting.  More, there's no easy way to find
all the relevant places (some will be in extensions, no doubt).
So I have approximately zero faith that it could be made reliable.
Funneling things through the checkpointer would make that a lot
more centralized.  I concede that cowboy unlink() calls could still
be a problem ... but I doubt there's any solution that's totally
free of that hazard.

            regards, tom lane
