Re: Notes on lock table spilling - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Notes on lock table spilling
Date
Msg-id 1112654862.16721.861.camel@localhost.localdomain
In response to Re: Notes on lock table spilling  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
List pgsql-hackers
On Mon, 2005-04-04 at 16:27 -0400, Alvaro Herrera wrote:
> On Mon, Apr 04, 2005 at 07:08:11PM +0100, Simon Riggs wrote:
> 
> > IIRC no other major system spills locks to disk, and there's a big
> > reason: performance is very poor. Other systems accept some
> > limitations in order to avoid it. Oracle takes locks and holds them
> > within the block itself; those blocks then need block cleanup
> > performed on them the next time they're read. DB2 has a lock table
> > which never grows in size or spills to disk.
> 
> We can have the in-memory hash table grow to be "enough for most
> applications"; say, have it store 10000 locks without resorting to
> spilling.  As a secondary storage there would be the in-memory buffers
> for the spill area; comfortably another 5000 locks.  Because there is no
> need to have the lock table persist across system crashes (except in
> case we implement 2PC), there's no need to fsync() or XLog this
> information.  So you would not touch disk until you have acquired 15000
> locks or so.  This doesn't really address your concern, because for
> spilling logic to work we may need to evict any transaction's lock, not
> just our own.  Thus we can't guarantee that only the heavy locker's
> locks are spilled (spilt?).

Well, thanks, but I'm still not happy. I'd like to find a way to
penalise only the heavy locker (and anybody unlucky enough to request
rows that have been locked by them). One idea: with an in-memory hash
backed by an on-disk b-tree, we could limit the number of locks any one
backend may hold in memory, yet still let that backend's locks spill to
disk.
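
Something like this throwaway sketch, say (standalone C, nothing like
the real data structures; the tiny MAX_MEM_LOCKS limit and the flat
file standing in for the b-tree are made up for illustration):

    /*
     * Hypothetical sketch: each backend may hold at most MAX_MEM_LOCKS
     * row locks in memory; beyond that, its own locks -- and only its
     * own -- go to an on-disk structure (a b-tree in the proposal, a
     * flat file here).
     */
    #include <stdio.h>

    #define MAX_MEM_LOCKS 4   /* tiny limit, just to show the spilling */

    typedef struct RowLock
    {
        unsigned int rel;     /* relation OID */
        unsigned int tid;     /* tuple id, simplified */
    } RowLock;

    typedef struct BackendLocks
    {
        RowLock  mem[MAX_MEM_LOCKS];
        int      nmem;
        FILE    *spill;       /* stand-in for the per-backend b-tree */
    } BackendLocks;

    static void
    acquire_row_lock(BackendLocks *be, unsigned int rel, unsigned int tid)
    {
        if (be->nmem < MAX_MEM_LOCKS)
        {
            /* fast path: lock stays in memory */
            be->mem[be->nmem].rel = rel;
            be->mem[be->nmem].tid = tid;
            be->nmem++;
        }
        else
        {
            /* this backend is the heavy locker: only it pays the disk cost */
            if (be->spill == NULL)
                be->spill = tmpfile();
            fprintf(be->spill, "%u %u\n", rel, tid);
        }
    }

    int
    main(void)
    {
        BackendLocks be = {{{0}}, 0, NULL};

        for (unsigned int tid = 1; tid <= 10; tid++)
            acquire_row_lock(&be, 16384, tid);

        printf("in memory: %d, spilled: %d\n", be.nmem, 10 - be.nmem);
        return 0;
    }

The point is that the eviction boundary is per-backend, so a session
taking a handful of locks never touches disk at all.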

> So, we now have four ideas for solving the shared-row locking
> problem:

Five

> 
> A. Using the lock manager to lock tuples.
>    This requires some sort of spill-to-disk logic.

I'm worried about extra LWLock contention, which we really don't need.

> B. Not using the lock manager to lock tuples.
>    This can be done by using Bruce's Phantom Xids idea.
> 
> C. Using a hybrid approach.
>    This can be done using the unused space in heap pages, as proposed by
>    Paul Tillotson, and falling back to the lock manager.  Thus we still
>    need some spilling logic.  I'm still not clear how this would work
>    given the current page layout and XLog requirements.

Well, that might work. Each lock is associated with a transaction, so we
can test a lock's applicability the same way we test row visibility.
Setting a lock would not set the dirty flag on the block. If a block
does get written out, we can "vacuum" the old locks the next time the
lock table is used. If the on-block lock table is full, we know we need
to check the on-disk lock structure.
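
A standalone sketch of that shape, with a crude is-this-xid-still-
running test standing in for the real visibility machinery (every name
and number here is hypothetical):

    /*
     * Hypothetical on-block lock table: each slot records the locking
     * xid; a lock still applies only while that xid is in progress,
     * tested the same way row visibility would be.  Slots of ended
     * transactions are reclaimed ("vacuumed") lazily.
     */
    #include <stdbool.h>
    #include <stdio.h>

    #define ONBLOCK_SLOTS 4

    typedef unsigned int Xid;

    typedef struct OnBlockLock
    {
        Xid  xid;     /* 0 = free slot */
        int  lineno;  /* locked tuple's line pointer number */
    } OnBlockLock;

    /* stand-in for a real "is this transaction still running?" check */
    static bool
    xid_in_progress(Xid xid)
    {
        return xid >= 100;   /* pretend xids below 100 have ended */
    }

    /* lazy vacuum: free any slot whose locker committed or aborted */
    static void
    vacuum_block_locks(OnBlockLock *slots)
    {
        for (int i = 0; i < ONBLOCK_SLOTS; i++)
            if (slots[i].xid != 0 && !xid_in_progress(slots[i].xid))
                slots[i].xid = 0;
    }

    /* true = lock fit on the block; false = check the on-disk structure */
    static bool
    set_block_lock(OnBlockLock *slots, Xid xid, int lineno)
    {
        vacuum_block_locks(slots);
        for (int i = 0; i < ONBLOCK_SLOTS; i++)
        {
            if (slots[i].xid == 0)
            {
                slots[i].xid = xid;
                slots[i].lineno = lineno;
                return true;
            }
        }
        return false;       /* block table full: fall back to lock manager */
    }

    int
    main(void)
    {
        OnBlockLock slots[ONBLOCK_SLOTS] = {{42, 1}, {101, 2}, {0, 0}, {0, 0}};

        /* xid 42 has ended, so its slot gets vacuumed and reused */
        printf("fit on block: %d\n", set_block_lock(slots, 200, 3));
        return 0;
    }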

> D. Using a lock manager process.  With a separate process, we don't need
>    any of this, because it can use the amount of memory it sees fit.  We
>    may have some communication overhead which may make the idea
>    unfeasible.  But of course, we would retain a shared memory area for
>    "caching" (so we wouldn't need to touch the process until we are holding
>    lots of locks.) Does anyone think this is an idea worth pursuing?  Or
>    rather, does anyone see huge showstoppers here?

(Another process: hmmm, getting crowded, isn't it?)

E. Implement a lock mode that doesn't lock all the rows that have been
touched, just a current rolling window: fewer locks, less need to spill
to disk, and so less need to find a really good solution.
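
Roughly this, as a toy (a made-up ring buffer, not a worked-out design):
taking lock N+1 implicitly releases the oldest one, so the lock count --
and hence the spill pressure -- stays bounded.

    #include <stdio.h>

    #define WINDOW 3   /* arbitrary window size for the sketch */

    typedef struct RollingLocks
    {
        unsigned int tids[WINDOW];  /* ring buffer of locked tuple ids */
        int          count;
        int          next;          /* slot the next lock will occupy */
    } RollingLocks;

    static void
    lock_row_rolling(RollingLocks *rl, unsigned int tid)
    {
        if (rl->count == WINDOW)
            printf("releasing lock on tid %u\n", rl->tids[rl->next]);
        else
            rl->count++;

        rl->tids[rl->next] = tid;        /* overwrite the oldest slot */
        rl->next = (rl->next + 1) % WINDOW;
        printf("locked tid %u (holding %d locks)\n", tid, rl->count);
    }

    int
    main(void)
    {
        RollingLocks rl = {{0}, 0, 0};

        for (unsigned int tid = 1; tid <= 5; tid++)
            lock_row_rolling(&rl, tid);
        return 0;
    }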

Perhaps it's time to revisit what the benefits of doing *any* of the
above actually are. Maybe there's a different way to meet just 80% of
those requirements with much less pain all round.

Best Regards, Simon Riggs


