Re: reloption to prevent VACUUM from truncating empty pages at the end of relation - Mailing list pgsql-hackers

From Tom Lane
Subject Re: reloption to prevent VACUUM from truncating empty pages at the end of relation
Msg-id 1261.1551392263@sss.pgh.pa.us
In response to Re: reloption to prevent VACUUM from truncating empty pages at the end of relation  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Feb-28, Tom Lane wrote:
>> I wasn't really working on that for v12 --- I figured it was way
>> too late in the cycle to be starting on such a significant change.

> Oh, well, it certainly seems far too late *now*.  However, what about
> the idea in 
> https://postgr.es/m/1255.1544562482@sss.pgh.pa.us
> namely that we write out the buffers involved?  That sounds like it
> might be backpatchable, and thus it's not too late for it.

I think that what we had in mind at that point was that allowing forced
writes of empty-but-dirty pages would provide a back-patchable solution
to the problem of ftruncate() failure leaving corrupt state on-disk.
That would not, by itself, remove the need for AccessExclusiveLock, so it
doesn't seem like it would eliminate people's desire for the kind of knob
being discussed here.
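
As a rough illustration of the forced-write idea, here is a minimal Python sketch. It is not PostgreSQL code: the names truncate_with_forced_writes, failing_truncate, demo, BLOCK_SIZE and EMPTY_PAGE are hypothetical, an all-zero page stands in for an emptied page, a plain dict stands in for shared buffers, and the ftruncate() failure is simulated. It only shows why writing the doomed pages first would leave the on-disk state harmless if the truncation then fails.

```python
import os
import tempfile

BLOCK_SIZE = 8192                      # assumed page size for this sketch
EMPTY_PAGE = b"\x00" * BLOCK_SIZE      # stand-in for an emptied page


def truncate_with_forced_writes(fd, dirty_buffers, new_nblocks,
                                truncate=os.ftruncate):
    """Write out dirty pages past the new EOF, then try to truncate.

    dirty_buffers maps block number -> in-memory page newer than disk.
    Returns True if the truncation succeeded, False if it failed.
    """
    # Step 1: force the empty-but-dirty pages to disk, so that disk no
    # longer holds stale contents for the blocks we are about to drop.
    for blkno in sorted(dirty_buffers):
        if blkno >= new_nblocks:
            os.pwrite(fd, dirty_buffers[blkno], blkno * BLOCK_SIZE)
    os.fsync(fd)

    # Step 2: the in-memory copies now match disk and can be discarded.
    for blkno in [b for b in dirty_buffers if b >= new_nblocks]:
        del dirty_buffers[blkno]

    # Step 3: attempt the truncation; a failure leaves empty pages behind
    # rather than stale ones.
    try:
        truncate(fd, new_nblocks * BLOCK_SIZE)
    except OSError:
        return False
    return True


def failing_truncate(fd, length):
    """Simulated ftruncate() failure (hypothetical helper)."""
    raise OSError("simulated ftruncate() failure")


def demo():
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Three blocks on disk that still contain old (stale) contents.
        os.pwrite(fd, b"\x01" * (BLOCK_SIZE * 3), 0)
        # Block 2 was emptied in memory but never written out.
        dirty = {2: EMPTY_PAGE}
        ok = truncate_with_forced_writes(fd, dirty, 2,
                                         truncate=failing_truncate)
        tail = os.pread(fd, BLOCK_SIZE, 2 * BLOCK_SIZE)
        nblocks = os.fstat(fd).st_size // BLOCK_SIZE
        return ok, tail == EMPTY_PAGE, nblocks
```

In the demo the truncation fails and the file keeps all three blocks, but the last block on disk is the empty page rather than the stale one. Nothing in the sketch touches lock levels, which matches the point that this would not remove the need for AccessExclusiveLock.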

Thinking about it, the need for AEL is mostly independent of the data
corruption problem; rather, it's a hack to avoid needing to think about
concurrent-truncation scenarios in table readers.  We could fairly
easily reduce the lock level to something less than AEL if we just
taught seqscans, indexscans, etc that trying to read a page beyond
EOF is not an error.  (Reducing the lock level to the point where
we could allow concurrent *writers* is a much harder problem, I think.
But to ameliorate the issues for standbys, we just need to allow
concurrent readers.)  And we'd have to do something about readers
possibly loading doomed pages back into shmem before the truncation
happens; maybe that can be fixed just by truncating first and flushing
buffers second?
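
The reader-side half of this can be sketched at the file level. The following is a minimal Python illustration, not PostgreSQL code: read_block, seq_scan, demo and BLOCK_SIZE are hypothetical names, and the concurrent truncation is simulated by a callback inside a single process. It relies only on the POSIX behavior that pread() at or past EOF returns zero bytes instead of failing, and treats that result as "no more pages" rather than as an error.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed page size for this sketch


def read_block(fd, blkno):
    """Return the page at blkno, or None if it lies at or beyond EOF."""
    data = os.pread(fd, BLOCK_SIZE, blkno * BLOCK_SIZE)
    if len(data) < BLOCK_SIZE:
        # Empty or short read: the block was truncated away, so report
        # "nothing here" instead of raising an error.
        return None
    return data


def seq_scan(fd, nblocks_at_start, before_read=None):
    """Scan blocks 0..nblocks_at_start-1, stopping quietly at EOF.

    nblocks_at_start is the relation size the scan saw when it began;
    the file may shrink while the scan is running.
    """
    seen = []
    for blkno in range(nblocks_at_start):
        if before_read is not None:
            before_read(blkno)      # hook used to simulate a truncation
        if read_block(fd, blkno) is None:
            break                   # past the new EOF: end of scan
        seen.append(blkno)
    return seen


def demo():
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.pwrite(fd, b"\x01" * (BLOCK_SIZE * 4), 0)   # four blocks

        def truncate_midway(blkno):
            # Simulate a truncation to two blocks arriving while the
            # scan is positioned on block 1.
            if blkno == 1:
                os.ftruncate(fd, BLOCK_SIZE * 2)

        return seq_scan(fd, 4, before_read=truncate_midway)
```

Here the scan started out expecting four blocks, sees blocks 0 and 1, and ends cleanly when block 2 turns out to be gone. The sketch does not model shared buffers at all, so it says nothing about readers pulling doomed pages back into shmem or about the ordering of truncation and buffer flushing.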

I think the $64 question is whether we're giving up any meaningful degree
of error detection if we allow read-beyond-EOF to not be an error.  If we
conclude that we're not, maybe it wouldn't be a very big patch?

            regards, tom lane

