Re: optimizing vacuum truncation scans - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: optimizing vacuum truncation scans
Date
Msg-id CAMkU=1xjKWHMb2+OGf6eiDrkhOMZPVP8PWX=2k=L6=7qd1bVDA@mail.gmail.com
In response to Re: optimizing vacuum truncation scans  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: optimizing vacuum truncation scans  (Simon Riggs <simon@2ndQuadrant.com>)
Re: optimizing vacuum truncation scans  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Mon, Jul 27, 2015 at 1:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 22 July 2015 at 17:11, Jeff Janes <jeff.janes@gmail.com> wrote:
On Wed, Jul 22, 2015 at 6:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 29, 2015 at 1:54 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Attached is a patch that implements the vm scan for truncation.  It
> introduces a variable to hold the last blkno which was skipped during the
> forward portion.  Any blocks after both this blkno and after the last
> inspected nonempty page (which the code is already tracking) must have been
> observed to be empty by the current vacuum.  Any other process that renders the
> page nonempty is required to clear the vm bit, and no other process can set
> the bit again during the vacuum's lifetime.  So if the bit is still set, the
> page is still empty without needing to inspect it.
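
As a rough illustration of the idea described above (not the attached patch itself), the backward truncation scan could be modeled as below. This is a toy Python sketch: the function name, the callbacks `vm_all_visible` and `page_is_empty`, and the use of -1 to mean "no such block" are hypothetical and are not PostgreSQL APIs.

```python
def count_blocks_to_keep(nblocks, last_nonempty, last_skipped,
                         vm_all_visible, page_is_empty):
    """Toy model of a truncation scan that trusts the visibility map.

    nblocks        -- current relation length in blocks
    last_nonempty  -- last block the forward pass inspected and found
                      nonempty (-1 if none)
    last_skipped   -- last block the forward pass skipped (-1 if none)
    vm_all_visible -- callback: is the vm bit still set for this block?
    page_is_empty  -- callback: read the page and report whether it is empty
    Returns the number of leading blocks that must be kept.
    """
    # Blocks after both marks were observed to be empty by this vacuum.
    trusted_from = max(last_nonempty, last_skipped) + 1

    for blkno in range(nblocks - 1, -1, -1):
        if blkno >= trusted_from and vm_all_visible(blkno):
            # Bit still set: nothing has been added since we saw the page
            # empty, so there is no need to read it.
            continue
        # Either this vacuum never verified the page, or another process
        # cleared the bit, so the page has to be inspected.
        if not page_is_empty(blkno):
            return blkno + 1
    return 0
```

In this model a page is only read when it lies at or before one of the two marks, or when its vm bit has been cleared since the forward pass. That is where the saving in the common case would come from.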

Urgh.  So if we do this, that forever precludes having HOT pruning set
the all-visible bit. 

I wouldn't say forever, as it would be easy to revert the change if something more important came along that conflicted with it. 

I think what is being said here is that someone is already using this technique, or if not, then we actively want to encourage them to do so as an extension or as a submission to core.

In that case, I think the rely-on-VM technique sinks again, sorry Jim, Jeff. Probably needs code comments added.

Sure, that sounds like the consensus.  The VM method was very efficient, but I agree it is pretty fragile and restricting.
 

That does still leave the prefetch technique, so all is not lost.

Can we see a patch with just prefetch, probably with a simple choice of stride? Thanks.

I probably won't get back to it this commitfest, so it can be set to "Returned with Feedback".  But if anyone has good ideas for how to set the stride (or how to detect that the storage is SSD, in which case prefetching is pointless to try), I'd love to hear about them anytime.

Cheers,

Jeff
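
For readers following the thread, the prefetch-with-stride idea mentioned above could look roughly like the sketch below. It is a toy Python model under stated assumptions, not the patch under discussion: the function name and the `prefetch` and `page_is_empty` callbacks are hypothetical, and the stride is simply a caller-supplied constant. Within each window the prefetch requests are issued in ascending block order, on the assumption that the operating system handles a forward-looking request pattern better than a purely backward one.

```python
def backward_scan_with_prefetch(nblocks, stop_at, stride,
                                prefetch, page_is_empty):
    """Scan blocks from the end of the relation toward stop_at.

    Before reading into a new window of `stride` blocks, ask for the whole
    window to be prefetched.  Returns the number of leading blocks to keep.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")

    prefetched_until = nblocks  # lowest block number already requested

    for blkno in range(nblocks - 1, stop_at - 1, -1):
        if blkno < prefetched_until:
            # Entering an unrequested window: prefetch it in forward order.
            start = max(stop_at, blkno - stride + 1)
            for pblkno in range(start, blkno + 1):
                prefetch(pblkno)
            prefetched_until = start

        if not page_is_empty(blkno):
            return blkno + 1  # first nonempty page found; keep up to here
    return stop_at
```

A larger stride means fewer, bigger bursts of prefetch requests, at the cost of fetching more blocks that turn out to be unnecessary once a nonempty page ends the scan. How to pick that value is the open question raised in the message.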
