Re: CLOG contention - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CLOG contention
Msg-id CA+Tgmob0LToG7FdopEwSGxrMLOCZCRs8Vq4zxczjccdG1uKX1g@mail.gmail.com
In response to Re: CLOG contention  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: CLOG contention
List pgsql-hackers
On Wed, Dec 21, 2011 at 2:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, Dec 21, 2011 at 3:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I think there probably are some scalability limits to the current
>> implementation, but also I think we could probably increase the
>> current value modestly with something less than a total rewrite.
>> Linearly scanning the slot array won't scale indefinitely, but I think
>> it will scale to more than 8 elements.  The performance results I
>> posted previously make it clear that 8 -> 32 is a net win at least on
>> that system.
>
> Agreed to that, but I don't think it's nearly enough.
>
>> One fairly low-impact option might be to make the cache
>> less than fully associative - e.g. given N buffers, a page with pageno
>> % 4 == X is only allowed to be in a slot numbered between (N/4)*X and
>> (N/4)*(X+1)-1.  That likely would be counterproductive at N = 8 but
>> might be OK at larger values.
>
> Which is pretty much the same as saying, yes, let's partition the clog
> as I suggested, but by a different route.
>
>> We could also switch to using a hash
>> table but that seems awfully heavy-weight.
>
> Which is a re-write of SLRU ground up and inappropriate for most SLRU
> usage. We'd get partitioning "for free" as long as we re-write.

I'm not sure what your point is here.  I feel like this is on the edge
of turning into an argument, and if we're going to have an argument
I'd like to know what we're arguing about.  I am not arguing that
under no circumstances should we partition anything related to CLOG,
nor am I trying to deny you credit for your ideas.  I'm merely saying
that the specific plan of having multiple SLRUs for CLOG doesn't
appeal to me -- mostly because I think it will make life difficult for
pg_upgrade without any compensating advantage.  If we're going to go
that route, I'd rather build something into the SLRU machinery
generally that allows for the cache to be less than fully-associative,
with all of the savings in terms of lock contention that this entails.
Such a system could be used by any SLRU, not just CLOG, if it proved
to be helpful; and it would avoid any on-disk changes, with, as far as
I can see, basically no downside.
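
To make the slot arithmetic concrete, here is a rough sketch in Python of
the "less than fully associative" lookup described above. It is only an
illustration of the (N/4)*X .. (N/4)*(X+1)-1 rule, not the actual SLRU
code; the names slot_range and find_slot, the nways parameter, and the
use of a plain list for the slot array are all made up for this example.

```python
def slot_range(pageno, nbuffers, nways=4):
    """Return the inclusive (first, last) slots a page is allowed to use.

    Hypothetical helper: a page with pageno % nways == X may only live in
    slots (N/nways)*X through (N/nways)*(X+1)-1, where N is nbuffers.
    """
    if nbuffers % nways != 0:
        raise ValueError("nbuffers must be a multiple of nways")
    per_group = nbuffers // nways
    group = pageno % nways
    return per_group * group, per_group * (group + 1) - 1


def find_slot(page_in_slot, pageno, nways=4):
    """Linearly scan only the page's own group of slots.

    page_in_slot is an assumed stand-in for the slot array: entry i holds
    the page number cached in slot i, or None if the slot is empty.
    Returns the slot number, or None if the page is not cached.
    """
    first, last = slot_range(pageno, len(page_in_slot), nways)
    for slot in range(first, last + 1):
        if page_in_slot[slot] == pageno:
            return slot
    return None
```

With 32 buffers and 4 groups, each lookup scans at most 8 slots, the same
scan length as today's 8-buffer setup. With only 8 buffers, each page is
confined to 2 slots, which is why this would likely be counterproductive
at N = 8.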

That having been said, Tom isn't convinced that any form of
partitioning is the right way to go, and since Tom often has good
ideas, I'd like to explore his notions of how we might fix this
problem other than via some form of partitioning before we focus in on
partitioning.  Partitioning may ultimately be the right way to go, but
let's keep an open mind: this thread is only 14 hours old.  The only
things I'm completely convinced of at this point are (1) we need more
CLOG buffers (but I don't know exactly how many) and (2) the current
code isn't designed to manage large numbers of buffers (but I don't
know exactly where it starts to fall over).

If I'm completely misunderstanding the point of your email, please set
me straight (gently).

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

