Re: Page replacement algorithm in buffer cache - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Page replacement algorithm in buffer cache
Date
Msg-id CAHyXU0wKmB5WXS+hAt2QmGfpTNdsz4Dx1nR3GCKvsNBOzt2ZbQ@mail.gmail.com
In response to Re: Page replacement algorithm in buffer cache  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Page replacement algorithm in buffer cache  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On Fri, Mar 22, 2013 at 3:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> I think there is some very low hanging optimization fruit in the clock
>> sweep loop.   first and foremost, I see no good reason why when
>> scanning pages we have to spin and wait on a buffer in order to
>> pedantically adjust usage_count.  some simple refactoring there could
>> set it up so that a simple TAS (or even a TTAS with the first test in
>> front of the cache line lock as is done automatically in x86 IIRC)
>> could guard the buffer and, in the event of any lock detected, simply
>> move on to the next candidate without messing around with that buffer
>> at all.   This could be construed as a 'trylock' variant of a spinlock
>> and might help out with cases where an especially hot buffer is
>> locking up the sweep.  This is exploiting the fact that from
>> StrategyGetBuffer we don't need a *particular* buffer, just *a*
>> buffer.
>
> Hm.  You could argue in fact that if there's contention for the buffer
> header, that's proof that it's busy and shouldn't have its usage count
> decremented.  So this seems okay from a logical standpoint.
>
> However, I'm not real sure that it's possible to do a conditional
> spinlock acquire that doesn't create just as much hardware-level
> contention as a full acquire (ie, TAS is about as bad whether it
> gets the lock or not).  So the actual benefit is a bit less clear.

well, if you do a non-locking test first you could at least avoid the
TAS in some cases (and, if you get the answer wrong, so what?) by
jumping to the next buffer immediately.  if the non-locking test comes
up clean, only then do you do a hardware TAS.
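
to make that concrete, here is a rough sketch of a test-and-test-and-set
'trylock' inside a simplified clock sweep, using C11 atomics.  everything
here (DemoBuf, demo_try_lock, demo_unlock, demo_sweep) is a made-up
stand-in for illustration, not the actual bufmgr code, and it ignores
pin counts entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer header; not PostgreSQL's BufferDesc. */
typedef struct DemoBuf
{
    atomic_int  locked;       /* 0 = free, 1 = held */
    int         usage_count;  /* protected by 'locked' in this sketch */
} DemoBuf;

/* TTAS: plain read first, and only if that looks free do the real TAS. */
static bool
demo_try_lock(DemoBuf *b)
{
    if (atomic_load_explicit(&b->locked, memory_order_relaxed) != 0)
        return false;         /* looks busy: no exclusive cache line grab */
    return atomic_exchange_explicit(&b->locked, 1,
                                    memory_order_acquire) == 0;
}

static void
demo_unlock(DemoBuf *b)
{
    atomic_store_explicit(&b->locked, 0, memory_order_release);
}

/*
 * Simplified sweep: returns the index of a victim with its lock held,
 * or -1 if none was found within max_steps.  A buffer whose lock is
 * busy is skipped without touching its usage_count.
 */
static int
demo_sweep(DemoBuf *bufs, int n, int *hand, int max_steps)
{
    assert(n > 0);
    for (int step = 0; step < max_steps; step++)
    {
        int      i = *hand;
        DemoBuf *b = &bufs[i];

        *hand = (*hand + 1) % n;
        if (!demo_try_lock(b))
            continue;         /* contended, so presumably hot: move on */
        if (b->usage_count == 0)
            return i;         /* caller releases with demo_unlock() */
        b->usage_count--;
        demo_unlock(b);
    }
    return -1;
}
```

whether the relaxed load in front actually reduces hardware-level
contention in practice is exactly the open question above; the sketch
only shows the control flow.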

you could in fact go further and dispense with all locking in front of
usage_count, on the premise that it's only advisory and not a real
refcount.  so you only then lock if/when it's time to select a
candidate buffer, and only then when you did a non locking test first.
this would of course require some amusing adjustments to various
logical checks (usage_count <= 0, heh).
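
a rough sketch of that second idea, again with hypothetical names
(AdvBuf, adv_sweep) and C11 atomics.  relaxed loads and stores stand in
for plain unlocked accesses so the C stays well defined; decrements can
be lost between racing sweepers, which is tolerated on the premise that
the count is only advisory.  the checks use <= 0 defensively in case a
real implementation ever lets the count dip below zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical buffer header with an advisory, unlocked usage count. */
typedef struct AdvBuf
{
    atomic_int  locked;       /* 0 = free, 1 = held */
    atomic_int  usage_count;  /* advisory only; updated without the lock */
} AdvBuf;

/*
 * Returns the index of a victim with its lock held (the caller must
 * store 0 to 'locked' to release it), or -1 after max_steps.
 */
static int
adv_sweep(AdvBuf *bufs, int n, int *hand, int max_steps)
{
    assert(n > 0);
    for (int step = 0; step < max_steps; step++)
    {
        int     i = *hand;
        AdvBuf *b = &bufs[i];
        int     uc;

        *hand = (*hand + 1) % n;

        /* Unlocked, racy decrement: a lost update is acceptable here. */
        uc = atomic_load_explicit(&b->usage_count, memory_order_relaxed);
        if (uc > 0)
        {
            atomic_store_explicit(&b->usage_count, uc - 1,
                                  memory_order_relaxed);
            continue;
        }

        /* Candidate: non-locking test first, then the hardware TAS. */
        if (atomic_load_explicit(&b->locked, memory_order_relaxed) != 0)
            continue;
        if (atomic_exchange_explicit(&b->locked, 1,
                                     memory_order_acquire) != 0)
            continue;

        /* Recheck under the lock; the count may have been bumped. */
        if (atomic_load_explicit(&b->usage_count,
                                 memory_order_relaxed) <= 0)
            return i;

        atomic_store_explicit(&b->locked, 0, memory_order_release);
    }
    return -1;
}
```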

merlin


