Thread: this is in plain text (row level locks)

this is in plain text (row level locks)

From

"Jenny -"

Date:

23 July 2003, 16:48:32

Iam trying to acquire rowlevel locks in postgresql. I try doing this:
'select * from students where name='Larry' for update;
But by looking at the holding array of proclock , I've noticed that by doing 
this only
AccessShareLock gets acquired which is a table level lock.
How do I acquire rowlevelock and what fields of Lock or Proclock 
datastructures indicate it.
Thanks
Jenny

_________________________________________________________________
Protect your PC - get McAfee.com VirusScan Online  
http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

23 July 2003, 21:38:15

"Jenny -" <nat_lazy@hotmail.com> writes:
> Iam trying to acquire rowlevel locks in postgresql. I try doing this:
> 'select * from students where name='Larry' for update;
> But by looking at the holding array of proclock , I've noticed that by doing 
> this only
> AccessShareLock gets acquired which is a table level lock.

Row-level locks are not recorded in proclock --- they are implemented by
marking the individual tuple on-disk.  If we tried to record them in
shared memory, it'd be very easy to run out of shared memory, because
you could be holding row locks on a large number of tuples.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

23 July 2003, 22:14:35

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> "Jenny -" <nat_lazy@hotmail.com> writes:   >> Iam trying to acquire rowlevel locks in postgresql. I try doing
>>this: 'select * from students where name='Larry' for update;   >> But by looking at the holding array of proclock ,
I'venoticed   >> that by doing this only AccessShareLock gets acquired which is   >> a table level lock.

   Tom> Row-level locks are not recorded in proclock --- they are   Tom> implemented by marking the individual tuple
on-disk. If we   Tom> tried to record them in shared memory, it'd be very easy to   Tom> run out of shared memory,
becauseyou could be holding row   Tom> locks on a large number of tuples.

Of course, other database systems do this without too much hassle
.. including relying on lock escalation (move up to page/table level
locks) when the number of locks grow too large. 

Does pgsql only record X locks on the individual tuples on-disk or
does it do so for S locks as well ? 

Not that I dislike the idea - Toby Lehman suggested this in his
Ph.D. thesis in the mid-eighties for main-memory databases (where you
don't take the write penalty).

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

23 July 2003, 23:02:03

Sailesh Krishnamurthy wrote:
> >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
>     Tom> "Jenny -" <nat_lazy@hotmail.com> writes:
>     >> Iam trying to acquire rowlevel locks in postgresql. I try doing
>     >> this: 'select * from students where name='Larry' for update;
>     >> But by looking at the holding array of proclock , I've noticed
>     >> that by doing this only AccessShareLock gets acquired which is
>     >> a table level lock.
> 
>     Tom> Row-level locks are not recorded in proclock --- they are
>     Tom> implemented by marking the individual tuple on-disk.  If we
>     Tom> tried to record them in shared memory, it'd be very easy to
>     Tom> run out of shared memory, because you could be holding row
>     Tom> locks on a large number of tuples.
> 
> Of course, other database systems do this without too much hassle
> .. including relying on lock escalation (move up to page/table level
> locks) when the number of locks grow too large. 

I wouldn't say they do it with little hassle --- it is quite a pain, in
fact, at least for users.

> Does pgsql only record X locks on the individual tuples on-disk or
> does it do so for S locks as well ? 

We don't need to shared lock individual rows because of MVCC --- well,
we sort of do by recording our xid in our proc structure, so folks don't
change things underneath us.  We prevent expired rows from disappearing
from the disk by others looking at our proc start xid.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

23 July 2003, 23:04:14

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Sailesh Krishnamurthy wrote:
>> Does pgsql only record X locks on the individual tuples on-disk or
>> does it do so for S locks as well ? 

> We don't need to shared lock individual rows because of MVCC --- well,
> we sort of do by recording our xid in our proc structure, so folks don't
> change things underneath us.  We prevent expired rows from disappearing
> from the disk by others looking at our proc start xid.

This is actually an issue though.  Row-level shared locks would be
really nice to have for foreign-key handling.  Right now we have to
use X locks for those, and that leads to deadlocking problems for
applications.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Mike Mascari

Date:

24 July 2003, 00:40:08

Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
>>Sailesh Krishnamurthy wrote:
>>
>>>Does pgsql only record X locks on the individual tuples on-disk or
>>>does it do so for S locks as well ? 
> 
> 
>>We don't need to shared lock individual rows because of MVCC --- well,
>>we sort of do by recording our xid in our proc structure, so folks don't
>>change things underneath us.  We prevent expired rows from disappearing
>>from the disk by others looking at our proc start xid.
> 
> 
> This is actually an issue though.  Row-level shared locks would be
> really nice to have for foreign-key handling.  Right now we have to
> use X locks for those, and that leads to deadlocking problems for
> applications.

Yes! Yes! It's the last big hurdle for an otherwise excellent RI
implementation...

Just wanted "Joe-user's" enthusiasm for row-level S locks registered
somewhere... :-)

Mike Mascari
mascarm@mascari.com

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

24 July 2003, 01:12:35

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Sailesh Krishnamurthy wrote:
> >> Does pgsql only record X locks on the individual tuples on-disk or
> >> does it do so for S locks as well ? 
> 
> > We don't need to shared lock individual rows because of MVCC --- well,
> > we sort of do by recording our xid in our proc structure, so folks don't
> > change things underneath us.  We prevent expired rows from disappearing
> > from the disk by others looking at our proc start xid.
> 
> This is actually an issue though.  Row-level shared locks would be
> really nice to have for foreign-key handling.  Right now we have to
> use X locks for those, and that leads to deadlocking problems for
> applications.

Is the plan to allow one backend to shared lock the row while others can
read it but not modify it, or is the idea to actually allow multiple
backends to record their shared status on the row?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

24 July 2003, 01:30:25

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> This is actually an issue though.  Row-level shared locks would be
>> really nice to have for foreign-key handling.  Right now we have to
>> use X locks for those, and that leads to deadlocking problems for
>> applications.

> Is the plan to allow one backend to shared lock the row while others can
> read it but not modify it, or is the idea to actually allow multiple
> backends to record their shared status on the row?

Plan?  We have no plan to fix this :-(.  But clearly there has to be
some way to tell which backends hold read locks on a shared-locked row,
else you can't tell if they've all dropped the lock or not.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

24 July 2003, 02:22:10

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> This is actually an issue though.  Row-level shared locks would be
> >> really nice to have for foreign-key handling.  Right now we have to
> >> use X locks for those, and that leads to deadlocking problems for
> >> applications.
> 
> > Is the plan to allow one backend to shared lock the row while others can
> > read it but not modify it, or is the idea to actually allow multiple
> > backends to record their shared status on the row?
> 
> Plan?  We have no plan to fix this :-(.  But clearly there has to be
> some way to tell which backends hold read locks on a shared-locked row,
> else you can't tell if they've all dropped the lock or not.

I suppose we could allow one backend to mark the page with a shared lock
for primary key purposes while others read it.  Does that buy us
anything?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

24 July 2003, 02:28:19

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I suppose we could allow one backend to mark the page with a shared lock
> for primary key purposes while others read it.  Does that buy us
> anything?

That doesn't work, unless you insist that the first backend can't exit
its transaction until all the other ones are done.  Which introduces its
own possibilities for deadlock --- but even worse, how does the first
backend *know* that the other ones are done?  You're right back where
you started: it has to be possible to tell which backends have
share-locked a particular row.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

24 July 2003, 03:45:50

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> That doesn't work, unless you insist that the first backend Tom> can't exit its transaction until all the
otherones are done. Tom> Which introduces its own possibilities for deadlock --- but Tom> even worse, how does the
firstbackend *know* that the other Tom> ones are done? You're right back where you started: it has Tom> to be
possibleto tell which backends have share-locked a Tom> particular row.

Is a count a solution ?

The first backend gets the S lock on the row - I'm assuming you plan
to do it by recording it on the tuple and not in a shared memory lock
table, which means that you might have to unnecessarily write an
unmodified page if its buffer pool frame is stolen.

The problem is that on commit time, you must carefully decrement the
count value of shared locks on any tuple that you own. This can be
accomplished by having each backend keep track of the list of files
and TIDs for any rows for which it acquired S locks. Is this the same
way that pgsql releases the X locks ?

Bruce, I don't disagree that MVCC has the very nice property that
writers don't block readers. However, I don't buy that 2-phase
locking, with lock escalation is either unworkable because of too many
locks, or causes any extra pain for the user application (apart from
the fact that writers not blocking readers gives you more concurrency
at some very minor overhead of not being strictly serializable).

--
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Rod Taylor

Date:

24 July 2003, 07:13:46

On Thu, 2003-07-24 at 02:45, Sailesh Krishnamurthy wrote:
> >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>     Tom> That doesn't work, unless you insist that the first backend
>     Tom> can't exit its transaction until all the other ones are done.
>     Tom> Which introduces its own possibilities for deadlock --- but
>     Tom> even worse, how does the first backend *know* that the other
>     Tom> ones are done?  You're right back where you started: it has
>     Tom> to be possible to tell which backends have share-locked a
>     Tom> particular row.
>
> Is a count a solution ?

Almost.  Problem with a count is the difficulty decrementing the count
after a crash.  You would also need to store a log of some kind so you
know what to reset while starting up in error recovery mode which would
add a large amount of overhead.

It may be best to have a locking manager run as a separate process.
That way it could store locks in ram or spill over to disk. Now the
problem is communication overhead. It probably isn't any worse than
writing them to disk for long queries (actual disk activity resulting),
but short queries are probably going to notice.

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

24 July 2003, 10:39:35

Rod Taylor <rbt@rbt.ca> writes:
> It may be best to have a locking manager run as a separate process.
> That way it could store locks in ram or spill over to disk.

Hmm, that might be workable.  We could imagine that in place of the 
HEAP_MARKED_FOR_UPDATE status bit, we have a "this row is possibly
locked" hint bit.  Only if you see the bit set do you need to query
the lock manager.  If the answer comes back that no lock is held,
you can clear the bit --- so no need for any painful "undo" stuff
after a crash, and no communication overhead in the normal case.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

24 July 2003, 19:01:48

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Rod Taylor <rbt@rbt.ca> writes:   >> It may be best to have a locking manager run as a separate   >> process.
Thatway it could store locks in ram or spill over to   >> disk.
 
   Tom> Hmm, that might be workable.  We could imagine that in place   Tom> of the HEAP_MARKED_FOR_UPDATE status bit,
wehave a "this row   Tom> is possibly locked" hint bit.  Only if you see the bit set do   Tom> you need to query the
lockmanager.  If the answer comes back
 

Why do you want to query the lock manager as a separate process ? 

Why not have the traditional approach of a lock table in shared
memory, growing and shrinking as appropriate, and have each individual
process update it (need to protect it with a latch of course).

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

24 July 2003, 19:10:08

Sailesh Krishnamurthy <sailesh@cs.berkeley.edu> writes:
> Why not have the traditional approach of a lock table in shared
> memory, growing and shrinking as appropriate,

Because we can't grow shared memory.  Whatever size we get at startup is
what we're stuck with.  (I suppose we could try asking the kernel for
additional segments, but there's every likelihood that that will fail.)
        regards, tom lane

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

24 July 2003, 19:11:51

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Sailesh Krishnamurthy <sailesh@cs.berkeley.edu> writes:   >> Why not have the traditional approach of a lock
tablein shared   >> memory, growing and shrinking as appropriate,
 
   Tom> Because we can't grow shared memory.  Whatever size we get at   Tom> startup is what we're stuck with.  (I
supposewe could try   Tom> asking the kernel for additional segments, but there's every   Tom> likelihood that that
willfail.)
 

We implemented a Shared Memory MemoryContext using OSSP libmm (used in
Apache) for TelegraphCQ. 

If you think it's useful I can submit it as a patch.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

24 July 2003, 19:15:19

Sailesh Krishnamurthy wrote:
> >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
>     Tom> Sailesh Krishnamurthy <sailesh@cs.berkeley.edu> writes:
>     >> Why not have the traditional approach of a lock table in shared
>     >> memory, growing and shrinking as appropriate,
> 
>     Tom> Because we can't grow shared memory.  Whatever size we get at
>     Tom> startup is what we're stuck with.  (I suppose we could try
>     Tom> asking the kernel for additional segments, but there's every
>     Tom> likelihood that that will fail.)
> 
> We implemented a Shared Memory MemoryContext using OSSP libmm (used in
> Apache) for TelegraphCQ. 
> 
> If you think it's useful I can submit it as a patch.
> 
> -- 
> Pip-pip
> Sailesh
> http://www.cs.berkeley.edu/~sailesh
            ^^^^^^^^^^^^^^^
Watch out, that code from Berkeley usually is a mess.  :-)

Seriously, though, it would be good to see what you guys are up to.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

24 July 2003, 19:15:38

Sailesh Krishnamurthy <sailesh@cs.berkeley.edu> writes:
> We implemented a Shared Memory MemoryContext using OSSP libmm (used in
> Apache) for TelegraphCQ. 

How portable is that?  What kind of real-world performance do you get?
        regards, tom lane

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

24 July 2003, 19:54:21

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Sailesh Krishnamurthy <sailesh@cs.berkeley.edu> writes:   >> We implemented a Shared Memory MemoryContext using
OSSPlibmm   >> (used in Apache) for TelegraphCQ.
 
   Tom> How portable is that?  What kind of real-world performance do   Tom> you get?

As portable as libmm I guess. We've only tried it on Linux and MacOS
X. I understand that the Apache guys use it - so it should be fairly
portable. 

http://www.ossp.org/pkg/lib/mm/

Actually I've also tried it successfully on SPARC Solaris. 

I believe that a user of our first TelegraphCQ release (that we did
not announce) tried it on FreeBSD. Sean Chittenden has tried it on
FreeBSD but I'm not sure if he built it and got it to work properly.

As for our SHMemoryContext code itself, it's pretty much a straight
copy of the code in aset.c and mctx.c, with appropriate changes where
we call the mm_alloc functions and also use mm_lock and mm_unlock for
protection. 

As for performance, I'm in the midst of a performance study, but it's
not a micro-benchmark of the shmem allocator. After I get some numbers
I'll start profiling our code.

One problem is that on some platforms the locking implementation uses
the file-system. This is not in our control and we just trust mm to do
the best possible. 

Given that we rely on shared query execution (we run multiple queries
in a single adaptive query plan) we don't have much choice - we must
rely on shared memory.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

24 July 2003, 20:00:13

>>>>> "Bruce" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
   >> -- Pip-pip Sailesh http://www.cs.berkeley.edu/~sailesh
   Bruce>              ^^^^^^^^^^^^^^^ Watch out, that code from   Bruce> Berkeley usually is a mess.  :-)

LOL ! 

That's why we release the code - in the hope of getting real hackers
playing with it :-)
   Bruce> Seriously, though, it would be good to see what you guys   Bruce> are up to.

Our code is based on 7.3.2 pgsql and available at:         http://telegraph.cs.berkeley.edu/telegraphcq 

If the specific shm stuff is interesting, I can easily take it out and
submit it. If you prefer to just look at the code in the tarball
that's fine too. 

Or you can browse our cvs repository at:
  http://triplerock.cs.berkeley.edu:8080/viewcvs/viewcvs.cgi/

Look for shmctx.c and shmset.c in src/backend/utils/mmgr 

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

31 July 2003, 01:17:32

I was thinking of adding to TODO:* Allow shared row locks for referential integrity

but how is that different from:
* Implement dirty reads and use them in RI triggers


---------------------------------------------------------------------------

Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > It may be best to have a locking manager run as a separate process.
> > That way it could store locks in ram or spill over to disk.
> 
> Hmm, that might be workable.  We could imagine that in place of the 
> HEAP_MARKED_FOR_UPDATE status bit, we have a "this row is possibly
> locked" hint bit.  Only if you see the bit set do you need to query
> the lock manager.  If the answer comes back that no lock is held,
> you can clear the bit --- so no need for any painful "undo" stuff
> after a crash, and no communication overhead in the normal case.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

31 July 2003, 01:24:03

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I was thinking of adding to TODO:
>     * Allow shared row locks for referential integrity
> but how is that different from:
>     * Implement dirty reads and use them in RI triggers

It'd be a completely different approach to solving the FK locking
problem.  I wouldn't think we'd do both.

Personally I'd feel more comfortable with a shared-lock approach, if we
could work out the scalability issues.  Dirty reads seem ... well ...
dirty.
        regards, tom lane

Re: this is in plain text (row level locks)

From

Bruce Momjian

Date:

31 July 2003, 01:50:09

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I was thinking of adding to TODO:
> >     * Allow shared row locks for referential integrity
> > but how is that different from:
> >     * Implement dirty reads and use them in RI triggers
> 
> It'd be a completely different approach to solving the FK locking
> problem.  I wouldn't think we'd do both.
> 
> Personally I'd feel more comfortable with a shared-lock approach, if we
> could work out the scalability issues.  Dirty reads seem ... well ...
> dirty.

TODO updated:
* Implement dirty reads or shared locks and use them in RI  triggers


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: this is in plain text (row level locks)

From

Sailesh Krishnamurthy

Date:

31 July 2003, 02:57:18

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Personally I'd feel more comfortable with a shared-lock   Tom> approach, if we could work out the scalability
issues. Dirty   Tom> reads seem ... well ...  dirty.
 

Tom

I was going to do some experiments to measure the costs of our shared
memory MemoryContext.

Is there something specific you'd be interested in ? My first goal is
mainly to measure the amount of time it takes to allocate and
deallocate shared memory in the contested and uncontested cases.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

Re: this is in plain text (row level locks)

From

"Jenny -"

Date:

05 August 2003, 14:57:35



>From: Tom Lane <tgl@sss.pgh.pa.us>
>To: "Jenny -" <nat_lazy@hotmail.com>
>Subject: Re: [HACKERS] this is in plain text (row level locks) Date: Sat, 
>02 Aug 2003 23:28:30 -0400
>
> > if row-level locks are not recorded in proclock or any other shared 
>memory
> > datastructuers, then why does lockmode (array or ints) of proclock 
>indicate
> > that an AccessShareLock is acquired when a row is locked by 
>application.?
>
>That's a table lock --- it's independent of row locks.  It's there
>mostly to ensure someone doesn't delete the whole table out from under
>you.
>
>            regards, tom lane

so even though the application locks a row in a table, table-level locks are 
automatically taken by postgesql ? why is that?
thanks

_________________________________________________________________
The new MSN 8: advanced junk mail protection and 2 months FREE*  
http://join.msn.com/?page=features/junkmail

Re: this is in plain text (row level locks)

From

Tom Lane

Date:

05 August 2003, 16:02:23

"Jenny -" <nat_lazy@hotmail.com> writes:
> so even though the application locks a row in a table, table-level locks are 
> automatically taken by postgesql ? why is that?

So that the table doesn't disappear while you're trying to scan it.  (Or
afterwards --- a row-level lock wouldn't be noticed by DROP TABLE.)

Note that AccessShareLock is a pretty weak kind of lock, and holding it
does not prevent most other operations.
        regards, tom lane