Re: Checking for missing heap/index files - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Checking for missing heap/index files
Date	October 19, 2022 00:44:19
Msg-id	2747057.1666129459@sss.pgh.pa.us
In response to	Re: Checking for missing heap/index files (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Checking for missing heap/index files (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 18, 2022 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Isn't it already the case (or could be made so) that relation file
>> removal happens only in the checkpointer?

> I believe that individual backends directly remove all relation forks
> other than the main fork and all segments other than the first one.

Yeah, obviously some changes would need to be made to that, but ISTM
we could just treat all the forks as we now treat the first one.

> The discussion on various other threads has been in the direction of
> trying to standardize on moving that last case out of the checkpointer
> - i.e. getting rid of what Thomas dubbed "tombstone" files - which is
> pretty much the exact opposite of this proposal.

Yeah, we'd have to give up on that.  If that goes anywhere then
it kills this idea.

> But even apart from
> that, I don't think this would be that easy to implement. If you
> removed a large relation, you'd have to tell the checkpointer to
> remove many files instead of just 1.

The backends just implement this by deleting files until they don't
find the next one in sequence.  I fail to see how it'd be any
harder for the checkpointer to do that.
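
For illustration only, here is a minimal sketch of that "delete until the next one is missing" loop. It is written in Python rather than PostgreSQL's C and is not the actual backend code. The function name unlink_segments is hypothetical, and the sketch assumes the usual naming where the first segment has no suffix and later ones are named .1, .2, and so on.

```python
import os


def unlink_segments(base_path: str) -> int:
    """Remove base_path, base_path.1, base_path.2, ... in order.

    Stops at the first segment that does not exist and returns the
    number of files removed. Hypothetical helper, for illustration.
    """
    removed = 0
    segno = 0
    while True:
        # Segment 0 has no suffix; later segments are "<base>.<segno>".
        path = base_path if segno == 0 else f"{base_path}.{segno}"
        try:
            os.unlink(path)
        except FileNotFoundError:
            # The next file in sequence is missing, so we are done.
            break
        removed += 1
        segno += 1
    return removed
```

Nothing in the loop depends on which process runs it, which is the point being made above. Note that a gap in the sequence ends the loop early, so any segment after the gap is left behind.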

> And I don't think we really need to do any of that. We could invent a
> new kind of lock tag for <dboid/tsoid> combination. Take a share lock
> to create or remove files. Take an exclusive lock to scan the
> directory. I think that accomplishes the same thing as your proposal,
> but more directly, and with less overhead. It's still substantially
> more than NO overhead, though.

My concern about that is that it implies touching a whole lot of
places, and if you miss even one then you've lost whatever guarantee
you thought you were getting.  Moreover, there's no easy way to find
all the relevant places (some will be in extensions, no doubt).
So I have approximately zero faith that it could be made reliable.
Funneling things through the checkpointer would make that a lot
more centralized.  I concede that cowboy unlink() calls could still
be a problem ... but I doubt there's any solution that's totally
free of that hazard.
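
As a rough sketch of what "funneling" could look like, assuming a single owner that performs both removals and directory scans: callers only enqueue requests, and the owner alone touches the filesystem. The class and method names here are hypothetical and do not correspond to any PostgreSQL API. The sketch is single-threaded and ignores how requests would actually reach the checkpointer.

```python
import os
from collections import deque


class RemovalQueue:
    """Hypothetical stand-in for one process that owns all file removal."""

    def __init__(self):
        self._pending = deque()

    def request_unlink(self, path):
        # Callers only enqueue a request; they never unlink directly.
        self._pending.append(path)

    def process_pending(self):
        # Run only by the owner: perform every queued removal in order
        # and return the paths that were actually removed.
        removed = []
        while self._pending:
            path = self._pending.popleft()
            try:
                os.unlink(path)
                removed.append(path)
            except FileNotFoundError:
                pass  # Already gone; nothing to do.
        return removed

    def scan_directory(self, dirpath):
        # Run only by the owner: since the owner also performs the
        # removals, none of them can happen in the middle of this listing.
        return sorted(os.listdir(dirpath))
```

A direct unlink() call that bypasses request_unlink() defeats this, which is the residual hazard conceded above.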

            regards, tom lane
