Re: Notes on lock table spilling - Mailing list pgsql-hackers
From | Simon Riggs
---|---
Subject | Re: Notes on lock table spilling
Msg-id | 1112654862.16721.861.camel@localhost.localdomain
In response to | Re: Notes on lock table spilling (Alvaro Herrera <alvherre@dcc.uchile.cl>)
List | pgsql-hackers
On Mon, 2005-04-04 at 16:27 -0400, Alvaro Herrera wrote:
> On Mon, Apr 04, 2005 at 07:08:11PM +0100, Simon Riggs wrote:
>
> > IIRC there is not another major system that spills locks to disk and
> > there's a big reason: performance is very poor. Other systems accept
> > some limitations in order to avoid that. Oracle takes locks and holds
> > them within the block itself, which then needs to have block cleanup
> > performed on it next time it's read. DB2 has a lock table which
> > doesn't grow in size or spill to disk, ever.
>
> We can have the in-memory hash table grow to be "enough for most
> applications"; say, have it store 10000 locks without resorting to
> spilling. As secondary storage there would be the in-memory buffers
> for the spill area; comfortably another 5000 locks. Because there is no
> need to have the lock table persist across system crashes (except in
> case we implement 2PC), there's no need to fsync() or XLog this
> information. So you would not touch disk until you have acquired 15000
> locks or so. This doesn't really address your concern, because for the
> spilling logic to work we may need to evict any transaction's lock, not
> just our own. Thus we can't guarantee that only the heavy locker's
> locks are spilled (spilt?).

Well, thanks, but I'm still not happy. I'd like to find a way to
penalise only the heavy locker (and anybody unlucky enough to request
rows that have been locked by them).

One idea: if we had an in-memory hash, then an on-disk b-tree, we'd be
able to limit the number of locks any backend could hold in memory but
still allow its locks to spill to disk.

> So, we now have four ideas for solving the shared-row locking
> problem:

Five.

> A. Using the lock manager to lock tuples.
> This requires some sort of spill-to-disk logic.

I'm worried about extra LWLock contention, which we really don't need.

> B. Not using the lock manager to lock tuples.
> This can be done by using Bruce's Phantom Xids idea.
>
> C. Using a hybrid approach.
> This can be done using the unused space in heap pages, as proposed by
> Paul Tillotson, and falling back to the lock manager. Thus we still
> need some spilling logic. I'm still not clear how this would work
> given the current page layout and XLog requirements.

Well, that might work. Each lock is associated with a transaction, so
we can test lock applicability the same way we test row visibility.
Setting a lock would not alter the dirty flag on the block. If a block
does get written, we can "vacuum" the old locks next time the lock
table is used. If the on-block lock table is full, we know we need to
check the on-disk lock structure.

> D. Using a lock manager process. With a separate process, we don't need
> any of this, because it can use the amount of memory it sees fit. We
> may have some communication overhead which may make the idea
> unfeasible. But of course, we would retain a shared memory area for
> "caching" (so we wouldn't need to touch the process until we are
> holding lots of locks.) Does anyone think this is an idea worth
> pursuing? Or rather, does anyone see huge showstoppers here?

(Another process: hmmm, getting crowded, isn't it?)

E. Implement a lock mode that doesn't lock all the rows that have been
touched, just the current rolling window: fewer locks, less need to
spill to disk, and so less need to find a really good solution.

Perhaps it's time to revisit what the benefits of doing *any* of the
above are. Maybe there's a different way to do just 80% of those
requirements with much less pain all round.

Best Regards, Simon Riggs