Re: drop/truncate table sucks for large values of shared buffers - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: drop/truncate table sucks for large values of shared buffers
Msg-id CAA4eK1JyKYq2E8L3DeRE7LVUkEu5UTMFTz-ULMuv6NZyQkV0eg@mail.gmail.com
In response to Re: drop/truncate table sucks for large values of shared buffers  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Jun 28, 2015 at 9:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On 27 June 2015 at 15:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I don't like this too much because it will fail badly if the caller
> >> is wrong about the maximum possible page number for the table, which
> >> seems not exactly far-fetched.  (For instance, remember those kernel bugs
> >> we've seen that cause lseek to lie about the EOF position?)
>
> > If that is true, then our reliance on lseek elsewhere could also cause data
> > loss, for example by failing to scan data during a seq scan.
>
> The lseek point was a for-example, not the entire universe of possible
> problem sources for this patch.  (Also, underestimating the EOF point in
> a seqscan is normally not an issue since any rows in a just-added page
> are by definition not visible to the scan's snapshot.

How do we ensure whether a just-added page was added before or after the scan's snapshot was taken?  If it was added before, then Simon's point above is valid.  Does this mean that all other usages of smgrnblocks()/mdnblocks() are safe with respect to this issue, or just that the consequences would not be as bad as for this usage?

>  But I digress.)
>
> > The consequences of failure of lseek in this case are nowhere near as dire,
> > since by definition the data is being destroyed by the user.
>
> I'm not sure what you consider "dire", but missing a dirty buffer
> belonging to the to-be-destroyed table would result in the system being
> permanently unable to checkpoint, because attempts to write out the buffer
> to the no-longer-extant file would fail.

So another idea here could be that, instead of failing, we just ignore the
error when the object (to which that page belongs) no longer exists; that
way we could make Drop essentially free by not invalidating buffers from
shared_buffers on Drop/Truncate.  I think this might not be a sane idea,
though, as we would need a way to look up objects from the checkpointer and
would have to handle the case where the same Oid could be assigned to a new
object (after wraparound?).



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
