Re: CLOG contention - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: CLOG contention |
Date | |
Msg-id | 6637.1324486103@sss.pgh.pa.us |
In response to | Re: CLOG contention (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: CLOG contention |
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes:
> I think there probably are some scalability limits to the current
> implementation, but also I think we could probably increase the
> current value modestly with something less than a total rewrite.
> Linearly scanning the slot array won't scale indefinitely, but I think
> it will scale to more than 8 elements.  The performance results I
> posted previously make it clear that 8 -> 32 is a net win at least on
> that system.

Agreed, the question is whether 32 is enough to fix the problem for
anything except this one benchmark.

> One fairly low-impact option might be to make the cache less than
> fully associative - e.g. given N buffers, a page with pageno % 4 == X
> is only allowed to be in a slot numbered between (N/4)*X and
> (N/4)*(X+1)-1.  That likely would be counterproductive at N = 8 but
> might be OK at larger values.

I'm inclined to think that that specific arrangement wouldn't be good.
The normal access pattern for CLOG is, I believe, an exponentially
decaying probability-of-access for each page as you go further back
from current.  We have a hack to pin the current (latest) page into
SLRU all the time, but you want the design to be such that the
next-to-latest page is most likely to still be around, then the
second-latest, etc.

If I'm reading your equation correctly then the most recent pages
would compete against each other, not against much older pages, which
is exactly the wrong thing.  Perhaps what you actually meant to say
was that all pages with the same number mod 4 are in one bucket, which
would be better, but still not really ideal: for instance the
next-to-latest page could end up getting removed while say the
third-latest page is still there, because it's in a different
associative bucket that's under less pressure.

But possibly we could fix that with some other variant of the idea.
I certainly agree that strict LRU isn't an essential property here,
so long as we have a design that is matched to the expected access
pattern statistics.

> We could also switch to using a hash table but that seems awfully
> heavy-weight.

Yeah.  If we're not going to go to hundreds of CLOG buffers, which
I think probably wouldn't be useful, then hashing is unlikely to be
the best answer.

> The real question is how to decide how many buffers to create.  You
> suggested a formula based on shared_buffers, but what would that
> formula be?  I mean, a typical large system is going to have 1,048,576
> shared buffers, and it probably needs less than 0.1% of that amount of
> CLOG buffers.

Well, something like "0.1% with minimum of 8 and max of 32" might be
reasonable.  What I'm mainly fuzzy about is the upper limit.

			regards, tom lane
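To make the set-associative arrangement discussed above concrete, here is a
minimal C sketch of the proposed slot mapping. The helper name
eligible_slots is invented for illustration and does not appear in slru.c:

    /*
     * With N buffer slots, a page whose pageno % 4 == X may only occupy
     * slots (N/4)*X through (N/4)*(X+1)-1, per the formula quoted above.
     * E.g. eligible_slots(1004, 32, &first, &last) sets first = 0, last = 7.
     */
    static void
    eligible_slots(int pageno, int nslots, int *first, int *last)
    {
        int     bucket = pageno % 4;    /* X in the formula */
        int     width = nslots / 4;     /* slots per bucket, N/4 */

        *first = width * bucket;
        *last = width * (bucket + 1) - 1;
    }

Under this mapping consecutive pages land in different buckets, while pages
four apart (say pageno 1000 and 1004) compete for the same N/4 slots; the
eviction anomaly described above, where the next-to-latest page is removed
while an older page in a quieter bucket survives, follows from the buckets
seeing unequal pressure.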
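The sizing rule floated at the end, 0.1% of shared_buffers clamped to
[8, 32], could look like the sketch below. The function name
clog_buffer_count is hypothetical; Min()/Max() match the macros in
PostgreSQL's c.h, and NBuffers is the backend's shared_buffers size in
pages (stand-in definitions are included so the fragment is
self-contained):

    #define Min(x, y)   ((x) < (y) ? (x) : (y))
    #define Max(x, y)   ((x) > (y) ? (x) : (y))
    extern int NBuffers;    /* shared_buffers, in 8kB pages */

    static int
    clog_buffer_count(void)
    {
        /* 0.1% of shared_buffers, but never fewer than 8 or more than 32 */
        return Min(32, Max(8, NBuffers / 1000));
    }

For the "typical large system" mentioned above, NBuffers = 1,048,576 yields
1048 before clamping, so the cap of 32 governs; a small 2048-page (16MB)
shared_buffers setting lands on the floor of 8.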