Re: RFC: bufmgr locking changes - Mailing list pgsql-hackers
From | Neil Conway |
---|---|
Subject | Re: RFC: bufmgr locking changes |
Date | |
Msg-id | 87n08fh5b4.fsf@mailbox.samurai.com Whole thread Raw |
In response to | Re: RFC: bufmgr locking changes (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: RFC: bufmgr locking changes
|
List | pgsql-hackers |
(Sorry Tom, I was meaning to reply to you once I'd had a chance to revise the bufmgr patch; since that seems a fair waysoff, I figured it would be better to respond now.) Tom Lane <tgl@sss.pgh.pa.us> writes: > Neil Conway <neilc@samurai.com> writes: >> we now hold the buffer's meta data lock while doing buffer I/O. > > The latter is a really bad idea IMHO. The io_in_progress lock can be > held for eons (in CPU terms) and should not be blocking people who > simply want to bump their refcount up and down. My reasoning was that the contention for the per-buffer meta data lock should be pretty low. The io_in_progress lock is held when we're either faulting a page in or flushing a page out. In the first case, we can't actually use the buffer no matter how we do the locking (its content is incomplete), so there's no effective loss in concurrency. In the second case, what kinds of concurrent activity can we allow on the buffer? (We can allow reads, of course, but I don't believe we can allow writes.) However, I'll think some more on this, you (and Jan, who raised this point a while ago via IRC) are probably correct. > It's possible that you could combine the io_in_progress lock with the > cntx_lock Yeah, that's a possibility. >> - Remove SetBufferCommitInfoNeedsSave(). AFAICS, this is now >> completely equivalent to WriteNoReleaseBuffer(), so I just removed >> the former and replaced all the calls to it with calls to the later. > > The reason I've kept the separation was as a form of documentation as to > the reason for each write. Although they currently do the same thing, > that might not always be true. I'd prefer not to eliminate the > distinction from the source code --- though I'd not object if you want > to make SetBufferCommitInfoNeedsSave a macro that invokes the other > routine. Ok, fair enough -- I've changed SetBufferCommitInfoNeedsSave() to be a macro for WriteNoReleaseBuffer(). >> - Make 'PrivateRefCount' an array of uint16s, rather than longs. This >> saves 2 bits * shared_buffers per backend on 32-bit machines and 6 >> bits * shared_buffers per backend on some 64-bit machines. It means >> a given backend can only pin a single buffer 65,636 times, but that >> should be more than enough. Similarly, made LocalRefCount an array >> of uint16s. > > I think both of these are ill-considered micro-optimization. How do you > know that the pin count can't exceed 64K? Consider recursive plpgsql > functions for a likely counterexample. Fair enough -- I couldn't conceive of an actual scenario in which a single backend would acquire > 64K pins on a single buffer, but I'll take your word that it's not so far fetched. However, there is still room for improvement, IMHO: on a machine with 64-bit longs, we're plainly allocating 4 bytes more than we need do. Any objection if I change these to arrays of int32? > Please put that back. It is there to avoid unnecessary acquisitions > of buffer locks during UnlockBuffers (which is executed during any > transaction abort). Without it, you will be needing to lock every > single buffer during an abort in order to check its flags. It seems bizarre that we need to iterate through a few thousand array elements just to do some lock cleanup. I'll take a closer look at this... -Neil
pgsql-hackers by date: