Re: Improving replay of XLOG_BTREE_VACUUM records - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Improving replay of XLOG_BTREE_VACUUM records
Date
Msg-id 20150902221658.GC8555@awork2.anarazel.de
Whole thread Raw
In response to Re: Improving replay of XLOG_BTREE_VACUUM records  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 2015-09-02 22:46:59 +0100, Greg Stark wrote:
> On Sun, Jul 26, 2015 at 1:46 PM, Andres Freund <andres@anarazel.de> wrote:
> > To me it sounds like this shouldn't go through the full ReadBuffer()
> > rigamarole. That code is already complex enough, and here it's really
> > not needed. I think it'll be much easier to review - and actually faster
> > in many cases to simply have something like
> >
> > bool
> > BufferInCache(Relation, ForkNumber, BlockNumber)
> > {
> >     /* XXX: setup tag, hash, partition */
> >
> >     LWLockAcquire(newPartitionLock, LW_SHARED);
> >     buf_id = BufTableLookup(&newTag, newHash);
> >     LWLockRelease(newPartitionLock);
> >     return buf_id != -1;
> > }
> >
> > and then fall back to the normal ReadBuffer() when it's in cache.
> 
> 
> I guess I'm missing something but how is this API useful? As soon as
> its returned it the result might be invalid since before you actually
> make use of the return value another process could come and read in
> and pin the page in question. Is this part of some interlock where you
> only have to know it was unpinned at some point in time since some
> other event?

Yes. We're talking about this block:    for (blkno = xlrec->lastBlockVacuumed + 1; blkno < thisblkno; blkno++)    {
  /*         * We use RBM_NORMAL_NO_LOG mode because it's not an error         * condition to see all-zero pages.  The
originalbtvacuumpage         * scan would have skipped over all-zero pages, noting them in FSM         * but not
botheringto initialize them just yet; so we mustn't         * throw an error here.  (We could skip acquiring the
cleanuplock         * if PageIsNew, but it's probably not worth the cycles to test.)         *         * XXX we don't
actuallyneed to read the block, we just need to         * confirm it is unpinned. If we had a special call into the
   * buffer manager we could optimise this so that if the block is         * not in shared_buffers we confirm it as
unpinned.        */        buffer = XLogReadBufferExtended(thisrnode, MAIN_FORKNUM, blkno,
         RBM_NORMAL_NO_LOG);        if (BufferIsValid(buffer))        {            LockBufferForCleanup(buffer);
   UnlockReleaseBuffer(buffer);        }    }}
 


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Next
From: Andres Freund
Date:
Subject: Re: Allow a per-tablespace effective_io_concurrency setting