On Fri, Aug 7, 2020 at 9:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Sat, Aug 1, 2020 at 1:53 AM Andres Freund <andres@anarazel.de> wrote:
> >> We could also just use pg_class.relpages. It'll probably mostly be
> >> accurate enough?
>
> > Don't we need the accurate 'number of blocks' if we want to invalidate
> > all the buffers? Basically, I think we need to perform BufTableLookup
> > for all the blocks in the relation and then Invalidate all buffers.
>
> Yeah, there is no room for "good enough" here. If a dirty buffer remains
> in the system, the checkpointer will eventually try to flush it, and fail
> (because there's no file to write it to), and then checkpointing will be
> stuck. So we cannot afford to risk missing any buffers.
>
Right, this reminds me of the discussion we had last time on this
topic where we decided that we can't even rely on using smgrnblocks to
find the exact number of blocks because lseek might lie about the EOF
position [1]. So, we anyway need some mechanism to push the
information related to the "to be truncated or dropped relations" to
the background worker (checkpointer and or others) to avoid flush
issues. But, maybe it is better to push the responsibility of
invalidating the buffers for truncated/dropped relation to the
background process. However, I feel for some cases where relation size
is greater than the number of shared buffers there might not be much
benefit in pushing this operation to background unless there are
already a few other relation entries (for dropped relations) so that
cost of scanning the buffers can be amortized.
[1] - https://www.postgresql.org/message-id/16664.1435414204%40sss.pgh.pa.us
--
With Regards,
Amit Kapila.