Re: CLOG contention - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CLOG contention
Msg-id CA+TgmoZ48y4B0TcYrEGrSXuwCB3meJWmRQPQA5nekyhXO_+9mw@mail.gmail.com
In response to Re: CLOG contention  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Dec 21, 2011 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Agreed, the question is whether 32 is enough to fix the problem for
> anything except this one benchmark.

Right.  My thought on that topic is that it depends on what you mean
by "fix".  It's clearly NOT possible to keep enough CLOG buffers
around to cover the entire range of XID space that might get probed,
at least not without some massive rethinking of the infrastructure.
It seems that the amount of space that might need to be covered there
is at least on the order of vacuum_freeze_table_age, which is to say
150 million by default.  At 32K txns/page, that would require almost
5K pages, which is a lot more than 8.
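
As a rough check of that arithmetic, here is a minimal sketch.  It
assumes the default 8 kB block size and 2 status bits per transaction,
which is where the 32K txns/page figure comes from; the constant and
function names below are illustrative, not taken from the source tree.

```python
# Back-of-the-envelope check of the page count; all constants are
# assumptions that match the stock defaults discussed above.
BLCKSZ = 8192                    # bytes per CLOG page (default block size)
BITS_PER_XACT = 2                # commit-status bits stored per transaction
XACTS_PER_PAGE = BLCKSZ * 8 // BITS_PER_XACT   # 32768, i.e. "32K txns/page"

VACUUM_FREEZE_TABLE_AGE = 150_000_000          # default setting


def clog_pages_needed(xid_span, xacts_per_page=XACTS_PER_PAGE):
    """Pages needed to cover xid_span consecutive XIDs (ceiling division)."""
    return -(-xid_span // xacts_per_page)


pages = clog_pages_needed(VACUUM_FREEZE_TABLE_AGE)
print(XACTS_PER_PAGE, pages)     # 32768 4578
```

So "almost 5K" is roughly 4.6K pages, against the 8 buffers we have
today.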

On the other hand, if we just want to avoid having more requests
simultaneously in flight than we have buffers, so that backends don't
need to wait for an available buffer before beginning their I/O, then
something on the order of the number of CPUs in the machine is likely
sufficient.  I'll do a little more testing and see if I can figure out
where the tipping point is on this 32-core box.

>> One fairly low-impact option might be to make the cache
>> less than fully associative - e.g. given N buffers, a page with pageno
>> % 4 == X is only allowed to be in a slot numbered between (N/4)*X and
>> (N/4)*(X+1)-1.  That likely would be counterproductive at N = 8 but
>> might be OK at larger values.
>
> I'm inclined to think that that specific arrangement wouldn't be good.
> The normal access pattern for CLOG is, I believe, an exponentially
> decaying probability-of-access for each page as you go further back from
> current.  We have a hack to pin the current (latest) page into SLRU all
> the time, but you want the design to be such that the next-to-latest
> page is most likely to still be around, then the second-latest, etc.
>
> If I'm reading your equation correctly then the most recent pages would
> compete against each other, not against much older pages, which is
> exactly the wrong thing.  Perhaps what you actually meant to say was
> that all pages with the same number mod 4 are in one bucket, which would
> be better,

That's what I meant.  I think the formula works out to that, but in
any case it's what I meant.  :-)
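
To spell the mapping out, here is a small sketch of the slot-range
formula quoted above.  The function name and the requirement that N be
a multiple of the bucket count are my own assumptions for illustration;
nothing like this exists in the tree.

```python
def slot_range(pageno, nbuffers, nbuckets=4):
    """Return the inclusive (first, last) slot numbers a page may occupy.

    Implements: a page with pageno % nbuckets == X may only live in slots
    (N/nbuckets)*X through (N/nbuckets)*(X+1)-1.
    """
    if nbuffers % nbuckets != 0:
        raise ValueError("nbuffers must be a multiple of nbuckets")
    per_bucket = nbuffers // nbuckets
    x = pageno % nbuckets
    return per_bucket * x, per_bucket * (x + 1) - 1


# With N = 32, four consecutive pages land in four different buckets,
# so the most recent pages do not compete with each other for slots.
for pageno in range(100, 104):
    print(pageno, slot_range(pageno, 32))
# 100 (0, 7)
# 101 (8, 15)
# 102 (16, 23)
# 103 (24, 31)
```

Pages 100 and 104 share a bucket; pages 100 and 101 do not.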

>  but still not really ideal: for instance the next-to-latest
> page could end up getting removed while say the third-latest page is
> still there because it's in a different associative bucket that's under
> less pressure.

Well, sure.  But who is to say that's bad?  I think you can find a way
to throw stones at any given algorithm we might choose to implement.
For example, if you contrive things so that you repeatedly access the
same old CLOG pages cyclically: 1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8,...

...then our existing LRU algorithm will be anti-optimal, because we'll
keep the latest page plus the most recently accessed 7 old pages in
memory, and every lookup will fault out the page that the next lookup
is about to need.  If you're not that excited about that happening in
real life, neither am I.  But neither am I that excited about your
scenario: if the next-to-latest page gets kicked out, there are a whole
bunch of pages -- maybe 8, if you imagine 32 buffers split 4 ways --
that have been accessed more recently than that next-to-latest page.  So
it wouldn't be resident in an 8-buffer pool either.  Maybe that
page was mostly transactions updating some infrequently-accessed
table, and we don't really need that page right now.
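
For what it's worth, the cyclic worst case is easy to demonstrate with a
toy model.  This is only a sketch of a plain LRU cache, not the SLRU
code; it assumes 8 buffers with one pinned to the latest page, leaving 7
slots for the old pages, and the function name is made up.

```python
from collections import OrderedDict


def lru_misses(accesses, nslots):
    """Count misses for a plain LRU cache with nslots entries."""
    cache = OrderedDict()        # keys ordered from least to most recent
    misses = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)          # hit: mark as most recent
        else:
            misses += 1
            if len(cache) >= nslots:
                cache.popitem(last=False)    # evict least recently used
            cache[page] = True
    return misses


# Old pages 1..8 accessed cyclically, 100 times around (800 lookups).
cycle = list(range(1, 9)) * 100

print(lru_misses(cycle, 7))   # 800: every lookup misses
print(lru_misses(cycle, 8))   # 8: only the initial cold misses
```

One slot short of the working set and LRU evicts exactly the page the
next lookup wants, every time.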

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

