Thread: Debugging deadlocks

Debugging deadlocks

From
"Guy Rouillier"
Date:
I'm getting the following in the server log:

2005-03-27 06:04:21 GMT estat DETAIL:  Process 20928 waits for ShareLock
on transaction 7751823; blocked by process 20929.
    Process 20929 waits for ShareLock on transaction 7768115;
blocked by process 20928.
2005-03-27 06:04:21 GMT estat CONTEXT:  SQL statement "SELECT 1 FROM
ONLY "rumba"."service_plane" x WHERE "service_plane_id" = $1 FOR UPDATE
OF x"
    SQL statement " INSERT INTO FIVE_MIN_STATS_200503 (traffic_id,
service_plane_id, datestamp, sample_bucket_no, service_id,
data_collector_device_id, bit_delta, packet_delta, bit_rate,
packet_rate,   bit_drop_delta, packet_drop_delta, bit_drop_rate,
packet_drop_rate, updated)  VALUES (
'1','4','2005-03-21','1','MV008816','3',                0, 0, 0,
0,0,0,0,0,'N' )"
    PL/pgSQL function "initialize_five_minute_samples" line 34 at
execute statement
    SQL statement "SELECT  INITIALIZE_FIVE_MINUTE_SAMPLES( $1 ,  $2
,  $3 ,  $4 ,  $5 , 1, 288)"
    PL/pgSQL function "ins_updt_five_min_sample" line 28 at perform

FIVE_MIN_STATS_200503 has a foreign key into "rumba"."service_plane".
The service_plane table is a reference table, i.e., a fixed set of
values used only to validate foreign keys.  So the code doesn't have any
update statements on that table.  I'm assuming PostgreSQL is generating
that SQL to validate the foreign key.  But why is it selecting for
update?

--
Guy Rouillier


Re: Debugging deadlocks

From
Michael Fuhr
Date:
On Sun, Mar 27, 2005 at 12:54:28AM -0600, Guy Rouillier wrote:
> I'm getting the following in the server log:
>
> 2005-03-27 06:04:21 GMT estat DETAIL:  Process 20928 waits for ShareLock
> on transaction 7751823; blocked by process 20929.
>     Process 20929 waits for ShareLock on transaction 7768115;
> blocked by process 20928.
> 2005-03-27 06:04:21 GMT estat CONTEXT:  SQL statement "SELECT 1 FROM
> ONLY "rumba"."service_plane" x WHERE "service_plane_id" = $1 FOR UPDATE
> OF x"
...
> The service_plane table is a reference table, i.e., a fixed set of
> values used only to validate foreign keys.  So the code doesn't have any
> update statements on that table.  I'm assuming PostgreSQL is generating
> that SQL to validate the foreign key.  But why is it selecting for
> update?

To make sure the referenced key can't change until the transaction
completes and the referencing row becomes visible to other transactions
(or is rolled back) -- otherwise other transactions could change
or delete the referenced key and not know they'd be breaking your
referential integrity.  The current implementation supports only
exclusive row-level locks (SELECT FOR UPDATE), but I think Alvaro
might be working on shared row-level locks for a future release.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Debugging deadlocks

From
"Qingqing Zhou"
Date:
"Michael Fuhr" <mike@fuhr.org> writes
> To make sure the referenced key can't change until the transaction
> completes and the referencing row becomes visible to other transactions
> (or is rolled back) -- otherwise other transactions could change
> or delete the referenced key and not know they'd be breaking your
> referential integrity.  The current implementation supports only
> exclusive row-level locks (SELECT FOR UPDATE), but I think Alvaro
> might be working on shared row-level locks for a future release.
>

The code to process RI could be optimized a little bit. See the following
case:

user -1-
test=# begin;
BEGIN
test=# delete from a where i = 2;
DELETE 1

                user -2-
                test=# update a set i = 3 where i = 1;
                ERROR:  update or delete on "a" violates foreign key
constraint "b_i_fkey" on "b"
                DETAIL:  Key (i)=(1) is still referenced from table "b".
                test=# update a set i = 2 where i = 1;
                /* User 2 will be blocked here */

Blocking user 2 seems strange and unnecessary. Shouldn't the server first
check the where-clause (as it does in an ordinary UPDATE)? If the check
fails, it could just report an error as with the former statement.

Regards,
Qingqing




Re: Debugging deadlocks

From
Stephan Szabo
Date:
On Sun, 27 Mar 2005, Qingqing Zhou wrote:

>
> "Michael Fuhr" <mike@fuhr.org> writes
> > To make sure the referenced key can't change until the transaction
> > completes and the referencing row becomes visible to other transactions
> > (or is rolled back) -- otherwise other transactions could change
> > or delete the referenced key and not know they'd be breaking your
> > referential integrity.  The current implementation supports only
> > exclusive row-level locks (SELECT FOR UPDATE), but I think Alvaro
> > might be working on shared row-level locks for a future release.
> >
>
> The code to process RI could be optimized a little bit. See the following
> case:
>
> user -1-
> test=# begin;
> BEGIN
> test=# delete from a where i = 2;
> DELETE 1
>
>                 user -2-
>                 test=# update a set i = 3 where i = 1;
>                 ERROR:  update or delete on "a" violates foreign key
> constraint "b_i_fkey" on "b"
>                 DETAIL:  Key (i)=(1) is still referenced from table "b".
>                 test=# update a set i = 2 where i = 1;
>                 /* User 2 will be blocked here */
>
> Blocking user 2 seems strange and unnecessary. Shouldn't the server first
> check the where-clause (as it does in an ordinary UPDATE)? If the check
> fails, it could just report an error as with the former statement.

Well, that's not the foreign key necessarily. I don't have a machine to
test on at the moment (machine currently dead), but I think the same
happens without a foreign key constraint due to the unique/primary key
constraint on a.i.
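
For concreteness, here's a minimal sketch of that scenario with no
foreign key at all (made-up session, same table shape as Qingqing's
example; I believe this is the behavior, but I can't test right now):

user -1-
test=# create table a (i int primary key);
CREATE TABLE
test=# insert into a values (1);
INSERT 0 1
test=# insert into a values (2);
INSERT 0 1
test=# begin;
BEGIN
test=# delete from a where i = 2;
DELETE 1

                user -2-
                test=# update a set i = 2 where i = 1;
                /* User 2 blocks here even without a foreign key:
                   whether the new key 2 collides with the deleted
                   row's key depends on whether user 1 commits (no
                   collision) or rolls back (duplicate key error),
                   so the uniqueness check has to wait. */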

Re: Debugging deadlocks

From
"Guy Rouillier"
Date:
Michael Fuhr wrote:
> On Sun, Mar 27, 2005 at 12:54:28AM -0600, Guy Rouillier wrote:
>> I'm getting the following in the server log:
>>
>> 2005-03-27 06:04:21 GMT estat DETAIL:  Process 20928 waits for
>> ShareLock on transaction 7751823; blocked by process 20929.
>>     Process 20929 waits for ShareLock on transaction 7768115;
>> blocked by process 20928.
>> 2005-03-27 06:04:21 GMT estat CONTEXT:  SQL statement
>> "SELECT 1 FROM ONLY "rumba"."service_plane" x WHERE
>> "service_plane_id" = $1 FOR UPDATE OF x"
> ...
>> The service_plane table is a reference table, i.e., a fixed set of
>> values used only to validate foreign keys.  So the code doesn't have
>> any update statements on that table.  I'm assuming PostgreSQL is
>> generating that SQL to validate the foreign key.  But why is it
>> selecting for update?
>
> To make sure the referenced key can't change until the transaction
> completes and the referencing row becomes visible to other
> transactions (or is rolled back) -- otherwise other transactions
> could change or delete the referenced key and not know they'd be
> breaking your referential integrity.  The current implementation
> supports only exclusive row-level locks (SELECT FOR UPDATE), but I
> think Alvaro might be working on shared row-level locks for a future
> release.

Michael, thanks for the reply.  The current approach seems pretty...
fragile.  I encountered this deadlock because I have two tables (call
them T1 and T2), each with a foreign key to the same reference table.
I'm inserting about 2 million rows a day into each of these two tables.
Rows arrive in logical batches, so to help out performance, I only
commit after inserting every N rows. The deadlock arises because thread1
inserts some set of rows into T1 while thread2 inserts a set of rows
into T2.  The cardinality of the reference table is very small (only
about 10 rows) so the inserts into T1 and T2 are virtually guaranteed to
both reference the same values.
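
To make the deadlock concrete, here's a minimal sketch of what my two
threads end up doing (table names are simplified stand-ins for my real
schema):

CREATE TABLE ref (id integer PRIMARY KEY);
CREATE TABLE t1 (ref_id integer REFERENCES ref);
CREATE TABLE t2 (ref_id integer REFERENCES ref);
INSERT INTO ref VALUES (1);
INSERT INTO ref VALUES (2);

thread1: BEGIN;
thread1: INSERT INTO t1 (ref_id) VALUES (1);  -- FK check locks ref row 1
thread2: BEGIN;
thread2: INSERT INTO t2 (ref_id) VALUES (2);  -- FK check locks ref row 2
thread1: INSERT INTO t1 (ref_id) VALUES (2);  -- waits for thread2
thread2: INSERT INTO t2 (ref_id) VALUES (1);  -- waits for thread1: deadlock

Each uncommitted batch holds FOR UPDATE locks on every reference row it
has used so far, so two batches touching the same handful of rows in
different orders will sooner or later deadlock.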

I understand this is easy for me to say since I'm not writing the PG
code, but multiple clients should be able to read reference tables
simultaneously.  Seems like a "prevent write" lock should be placed on
the rows in the reference table rather than an "exclusive" lock.  That
way as many clients as necessary can read the values, but if someone
comes along and tries to change a row, just that one client gets an
error.

With the current implementation, it appears I need to either (1) always
commit after every inserted row, or (2) single thread my entire insert
logic.  Neither of these two alternatives is very desirable.  And it is
only a partial fix (which will work in my case since I'm the only one
updating this database.)  In the general case though, where other
programmers may be writing code that I may not even know about to update
parts of this database, avoiding this type of deadlock becomes very
difficult.  It pretty much requires that everyone know what everyone
else is doing.

--
Guy Rouillier


Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Sun, Mar 27, 2005 at 06:02:25PM -0600, Guy Rouillier wrote:

> With the current implementation, it appears I need to either (1) always
> commit after every inserted row, or (2) single thread my entire insert
> logic.  Neither of these two alternatives is very desirable.

I think a usual workaround is to declare the constraints INITIALLY
DEFERRED.  This will delay the check until commit time, so the time
window to deadlock is smaller.  There still is a possibility though, so
you need to take it into account.  It occurs to me that if you control
all insertion threads, you could try to serialize access to COMMIT in
order to make the chance of deadlock even smaller.
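
For reference, declaring the constraint that way looks something like
this (a sketch only; the constraint name below is made up, and you have
to drop and re-add the constraint, since there's no ALTER to change
deferrability):

ALTER TABLE five_min_stats_200503
    DROP CONSTRAINT five_min_stats_200503_service_plane_id_fkey;
ALTER TABLE five_min_stats_200503
    ADD FOREIGN KEY (service_plane_id)
        REFERENCES rumba.service_plane (service_plane_id)
        DEFERRABLE INITIALLY DEFERRED;

The RI check (and its SELECT ... FOR UPDATE) then runs at COMMIT instead
of at each INSERT, so the row locks are held only briefly.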

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees."                                      (E. Dijkstra)

Re: Debugging deadlocks

From
"Qingqing Zhou"
Date:
"Stephan Szabo" <sszabo@megazone.bigpanda.com> writes
>
> Well, that's not the foreign key necessarily. I don't have a machine to
> test on at the moment (machine currently dead), but I think the same
> happens without a foreign key constraint due to the unique/primary key
> constraint on a.i.

I see. That's more reasonable - the executor will first wait to see
whether its own constraint is satisfied, then do the RI check.

Thanks,
Qingqing



Re: Debugging deadlocks

From
Martijn van Oosterhout
Date:
On Sun, Mar 27, 2005 at 06:02:25PM -0600, Guy Rouillier wrote:
> With the current implementation, it appears I need to either (1) always
> commit after every inserted row, or (2) single thread my entire insert
> logic.  Neither of these two alternatives is very desirable.  And it is
> only a partial fix (which will work in my case since I'm the only one
> updating this database.)  In the general case though, where other
> programmers may be writing code that I may not even know about to update
> parts of this database, avoiding this type of deadlock becomes very
> difficult.  It pretty much requires that everyone know what everyone
> else is doing.

The other possibility is to sort the rows on the foreign key as they
are inserted. Then the locks will always be taken in the same order and
hence can never deadlock.
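
For example, if each batch passes through a staging table, the insert
itself can impose the ordering (hypothetical names, just to sketch the
idea):

INSERT INTO five_min_stats_200503 (traffic_id, service_plane_id, datestamp)
SELECT traffic_id, service_plane_id, datestamp
FROM staging_batch
ORDER BY service_plane_id;

The FK locks on service_plane are then taken in service_plane_id order,
so concurrent batches may block one another but can't form a cycle.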

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: Debugging deadlocks

From
"Guy Rouillier"
Date:
Alvaro Herrera wrote:
> On Sun, Mar 27, 2005 at 06:02:25PM -0600, Guy Rouillier wrote:
>
>> With the current implementation, it appears I need to either (1)
>> always commit after every inserted row, or (2) single thread my
>> entire insert logic.  Neither of these two alternatives is very
>> desirable.
>
> I think a usual workaround is to declare the constraints INITIALLY
> DEFERRED.  This will delay the check until commit time, so the time
> window to deadlock is smaller.  There still is a possibility though,
> so you need to take it into account.  It occurs to me that if you
> control all insertion threads, you could try to serialize access to
> COMMIT in order to make the chance of deadlock even smaller.

I changed the definition of the two tables with the common foreign key
to use DEFERRABLE INITIALLY DEFERRED.  I'm no longer getting a deadlock,
but I'm getting another error that may just be masking the same problem.
I'd like to understand your caveat about deadlocks still being possible.
To help me understand I'll draw a simplistic timeline.  A and B are
transactions on the two different tables:

A  !--------------------!
B           !------------------!

At the time that A commits, what is the effect on B?  B has not yet
attempted to lock any rows in the ref table (the one with the primary
key referenced by the foreign keys.)  So as I understand it, B should be
unaffected by A's commit.  Correct?  Does the deadlock possibility arise
if A and B should try to commit at the same instant?
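
To spell out my understanding of the deferred timing (simplified
statements, hypothetical tables):

A: BEGIN;
A: INSERT INTO t1 (ref_id) VALUES (1);   -- deferred: no RI check yet
B: BEGIN;
B: INSERT INTO t2 (ref_id) VALUES (1);   -- deferred: no RI check yet
A: COMMIT;  -- RI check runs now: SELECT ... FOR UPDATE on the ref row
B: COMMIT;  -- same check here; it can only collide with A's during
            -- the brief moment both are committing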

--
Guy Rouillier


Re: Debugging deadlocks

From
frank@joerdens.de
Date:
On Sun, Mar 27, 2005 at 01:37:44AM -0700, Michael Fuhr wrote:
[]
> The current implementation supports only
> exclusive row-level locks (SELECT FOR UPDATE), but I think Alvaro
> might be working on shared row-level locks for a future release.

Hmm ... are you saying that SELECT FOR UPDATE acquires an exclusive lock
on the row in question in the sense that it conflicts with other
*readers* trying to access that row? The documentation would appear to
say otherwise:

-- snip --
Row-level locks do not affect data querying; they block writers to the
same row only. To acquire a row-level lock on a row without actually
modifying the row, select the row with SELECT FOR UPDATE.
-- snap --

(from: http://www.vitavoom.com/postgresql-docs/explicit-locking.html)

and

-- snip --
A row-level lock on a specific row is automatically acquired when the
row is updated (or deleted or marked for update). The lock is held until
the transaction commits or rolls back. Row-level locks do not affect
data querying; they block writers to the same row only.
-- snap --

(from: http://www.postgresql.org/docs/7.4/static/explicit-locking.html)

but then I don't understand the original poster's problem at all, since
his queries are only *reading* the referenced tables??

Regards,

Frank

Re: Debugging deadlocks

From
Michael Fuhr
Date:
On Wed, Mar 30, 2005 at 10:59:52PM +0200, frank@joerdens.de wrote:
> On Sun, Mar 27, 2005 at 01:37:44AM -0700, Michael Fuhr wrote:
> > The current implementation supports only
> > exclusive row-level locks (SELECT FOR UPDATE), but I think Alvaro
> > might be working on shared row-level locks for a future release.
>
> Hmm ... are you saying that SELECT FOR UPDATE acquires an exclusive lock
> on the row in question in the sense that it conflicts with other
> *readers* trying to access that row? The documentation would appear to
> say otherwise:

I'm saying that foreign key checks use SELECT FOR UPDATE to ensure
that the referenced key doesn't change while the transaction is
pending, and that SELECT FOR UPDATE conflicts with other SELECT FOR
UPDATE queries.  Therefore, if concurrent transactions insert records
into a table that has a non-deferred foreign key constraint, and
if the foreign key values are the same, then one of the transactions
will block.  Example:

CREATE TABLE foo (
    fooid  integer PRIMARY KEY
);

CREATE TABLE bar (
    barid  serial PRIMARY KEY,
    fooid  integer NOT NULL REFERENCES foo
);

INSERT INTO foo (fooid) VALUES (1);

If we now have two transactions that both insert records into bar
with the same value for fooid, then one of the transactions will
block:

T1: BEGIN;
T2: BEGIN;
T1: INSERT INTO bar (fooid) VALUES (1);
T2: INSERT INTO bar (fooid) VALUES (1);  -- blocks

Transaction T2 blocks because both transactions have done something
like "SELECT 1 FROM foo WHERE fooid = 1 FOR UPDATE" to ensure that
the referenced key can't change until the transaction completes.
So even though both transactions are only reading the referenced
table (foo), one has blocked the other.

Note that SELECT queries without FOR UPDATE won't block -- it's
other FOR UPDATE queries that block, even though they're only
reading.
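
For instance, while T1 and T2 above are still open, a plain read does
not block:

T3: BEGIN;
T3: SELECT * FROM foo WHERE fooid = 1;  -- returns immediately
T3: COMMIT;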

I think Alvaro is working on a new locking mechanism that will allow
transactions to prevent a record from being modified without blocking
other transactions doing the same.

Alvaro (or somebody else), please correct me if I'm mistaken, but
that's what I've understood from discussions elsewhere.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Wed, Mar 30, 2005 at 02:30:58PM -0700, Michael Fuhr wrote:

> I think Alvaro is working on a new locking mechanism that will allow
> transactions to prevent a record from being modified without blocking
> other transactions doing the same.

Yeah, and it does work.  (I posted the patch two days ago.)

alvherre=# create table a (a serial primary key);
NOTICE:  CREATE TABLE will create implicit sequence "a_a_seq" for serial column "a.a"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "a_pkey" for table "a"
CREATE TABLE
alvherre=# create table b (a int references a);
CREATE TABLE
alvherre=# insert into a default values;
INSERT 0 1
alvherre=# insert into a default values;
INSERT 0 1
alvherre=# begin;
BEGIN
alvherre=# insert into b values (1);
INSERT 0 1

Session 2:
alvherre=# begin;
BEGIN
alvherre=# insert into b values (2);
INSERT 0 1
alvherre=# insert into b values (1);
INSERT 0 1


Session 1:
alvherre=# insert into b values (2);
INSERT 0 1
alvherre=# commit;
COMMIT

Session 2:
alvherre=# commit;
COMMIT


You'll notice if you do that on any released version it will deadlock ...


Now this can't be applied right away because it's easy to run "out of
memory" (shared memory for the lock table).  Say, a delete or update
that touches 10000 tuples does not work.  I'm currently working on a
proposal to allow the lock table to spill to disk ...

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas    / desprovistas, por cierto
 de blandos atenuantes"                          (Patricio Vogel)

Re: Debugging deadlocks

From
Greg Stark
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

> Now this can't be applied right away because it's easy to run "out of
> memory" (shared memory for the lock table).  Say, a delete or update
> that touches 10000 tuples does not work.  I'm currently working on a
> proposal to allow the lock table to spill to disk ...

Is that true even if I'm updating/deleting 1,000 tuples that all reference the
same foreign key? It seems like that should only need a single lock per
(sub)transaction_id per referenced foreign key.

How is this handled currently? Is your patch any worse than the current
behaviour?

--
greg

Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Wed, Mar 30, 2005 at 05:41:04PM -0500, Greg Stark wrote:
>
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>
> > Now this can't be applied right away because it's easy to run "out of
> > memory" (shared memory for the lock table).  Say, a delete or update
> > that touches 10000 tuples does not work.  I'm currently working on a
> > proposal to allow the lock table to spill to disk ...
>
> Is that true even if I'm updating/deleting 1,000 tuples that all reference the
> same foreign key? It seems like that should only need a single lock per
> (sub)transaction_id per referenced foreign key.

Well, in that case you need 1000 PROCLOCK objects, all pointing to the
same LOCK object.  But it still uses shared memory.


> How is this handled currently? Is your patch any worse than the current
> behaviour?

With my patch it's useless without a provision to spill the lock table.
The current situation is that we don't use the lock table to lock
tuples; instead we mark them on disk, in the tuple itself.  So we can't
really mark a tuple more than once (because we have only one bit to
mark); that's why we limit tuple locking to exclusive locking (there's
no way to mark a tuple with more than one shared lock).

With my patch we need a lot of memory for each tuple locked.  This needs
to be shared memory.  Since shared memory is limited, we can't grab an
arbitrary number of locks simultaneously.  Thus, deleting a whole table
can fail.  You haven't ever seen Postgres failing in a DELETE FROM
table, have you?

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Java is clearly an example of a money oriented programming"  (A. Stepanov)

Re: Debugging deadlocks

From
Vivek Khera
Date:
On Mar 30, 2005, at 4:47 PM, Alvaro Herrera wrote:

> Now this can't be applied right away because it's easy to run "out of
> memory" (shared memory for the lock table).  Say, a delete or update
> that touches 10000 tuples does not work.  I'm currently working on a
> proposal to allow the lock table to spill to disk ...
>

That would be most excellent, as I have deadlocks from transactions
that do upwards of 100k inserts selecting data from other tables.


Re: Debugging deadlocks

From
Greg Stark
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

> On Wed, Mar 30, 2005 at 05:41:04PM -0500, Greg Stark wrote:
> >
> > Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> >
> > Is that true even if I'm updating/deleting 1,000 tuples that all reference the
> > same foreign key? It seems like that should only need a single lock per
> > (sub)transaction_id per referenced foreign key.
>
> Well, in that case you need 1000 PROCLOCK objects, all pointing to the
> same LOCK object.  But it still uses shared memory.

Why? What do you need these PROCLOCK objects for? You're never going to do
anything to these locks but release them all together.

> > How is this handled currently? Is your patch any worse than the current
> > behaviour?
>
> With my patch it's useless without a provision to spill the lock table.
> The current situation is that we don't use the lock table to lock
> tuples; instead we mark them on disk, in the tuple itself.  So we can't
> really mark a tuple more than once (because we have only one bit to
> mark); that's why we limit tuple locking to exclusive locking (there's
> no way to mark a tuple with more than one shared lock).

For reference, the way Oracle does it (as I understand it) is to lock them
on disk by reserving a fixed amount of space in each block to record locks.
If that space is exhausted then it has some sort of failover mechanism.

I think in practice you rarely need more than 1 or 2 lockers. So this works
quite well most of the time. But then you're paying the cost in lower i/o
throughput all the time.

> With my patch we need a lot of memory for each tuple locked.  This needs
> to be shared memory.  Since shared memory is limited, we can't grab an
> arbitrary number of locks simultaneously.  Thus, deleting a whole table
> can fail.  You haven't ever seen Postgres failing in a DELETE FROM
> table, have you?

--
greg

Re: Debugging deadlocks

From
"Guy Rouillier"
Date:
Alvaro Herrera wrote:
>
> Now this can't be applied right away because it's easy to run "out of
> memory" (shared memory for the lock table).  Say, a delete or update
> that touches 10000 tuples does not work.  I'm currently working on a
> proposal to allow the lock table to spill to disk ...

While not always true, in many cases the cardinality of the referenced
(parent) table is small compared to that of the referencing (child)
table.  Does locking require a separate lock record for each tuple in
the child table, or just one for each tuple in the parent table with a
reference count?  For example, the scenario I started this thread with
had two child tables referencing rows in a common parent table.  For a
given parent tuple, a single "prevent write" lock with a reference count
would seem sufficient.

--
Guy Rouillier


Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Thu, Mar 31, 2005 at 06:54:31PM -0600, Guy Rouillier wrote:
> Alvaro Herrera wrote:
> >
> > Now this can't be applied right away because it's easy to run "out of
> > memory" (shared memory for the lock table).  Say, a delete or update
> > that touches 10000 tuples does not work.  I'm currently working on a
> > proposal to allow the lock table to spill to disk ...
>
> While not always true, in many cases the cardinality of the referenced
> (parent) table is small compared to that of the referencing (child)
> table.  Does locking require a separate lock record for each tuple in
> the child table, or just one for each tuple in the parent table with a
> reference count?

Just one.  (LOCALLOCK, which is private to each backend, stores how many
times we hold a lock.)

I just realized we not only need to be able to spill the LOCK structs to
disk, but also the PROCLOCKs ... am I right?

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
The web brings people together because no matter what kind of sexual mutant
you are, you have millions of potential partners. Search for "people who have
sex with deer on fire", and the computer will say "specify the type of deer"
(Jason Alexander)

Re: Debugging deadlocks

From
frank@joerdens.de
Date:
On Sun, Mar 27, 2005 at 12:54:28AM -0600, Guy Rouillier wrote:
[]
> The service_plane table is a reference table, i.e., a fixed set of
> values used only to validate foreign keys.  So the code doesn't have any
> update statements on that table.

An idea that just came up around here that sounds like a pretty neat
workaround, which we're gonna try, is to drop the foreign key
constraints and just use a check constraint for the allowed values.  If
the cardinality of the reference table is small, this is much faster
than using foreign keys and solves your problem.  The drawback is that
if you update the reference table (if you keep it), you mustn't forget
to also update the check constraints, which now live in more than one
place.
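
In DDL terms the workaround would look something like this (names and
values are made up for the sketch):

ALTER TABLE five_min_stats_200503
    DROP CONSTRAINT five_min_stats_200503_service_plane_id_fkey;
ALTER TABLE five_min_stats_200503
    ADD CONSTRAINT service_plane_id_chk
        CHECK (service_plane_id IN (1, 2, 3, 4));

A check constraint is evaluated against the inserted row alone -- no
lookup into another table, hence no row locks and no deadlock.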

Rgds, Frank

Re: Debugging deadlocks

From
Bruno Wolff III
Date:
On Fri, Apr 01, 2005 at 10:37:11 +0200,
  frank@joerdens.de wrote:
>
> An idea that just came up around here that sounds like a pretty neat
> workaround, which we're gonna try, is to drop the foreign key
> constraints and just use a check constraint for the allowed values.  If
> the cardinality of the reference table is small, this is much faster
> than using foreign keys and solves your problem.  The drawback is that
> if you update the reference table (if you keep it), you mustn't forget
> to also update the check constraints, which now live in more than one place.

Using domains is a good way to keep column constraints in just one place.
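
A sketch of what that looks like (made-up values):

CREATE DOMAIN service_plane_ref AS integer
    CHECK (VALUE IN (1, 2, 3, 4));

CREATE TABLE t1 (service_plane_id service_plane_ref);
CREATE TABLE t2 (service_plane_id service_plane_ref);

Both tables share the one check; when the set of allowed values changes,
only the domain's constraint needs to be updated.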

Re: Debugging deadlocks

From
Paul Tillotson
Date:
Alvaro,

I suppose there must be reasons not to do this, but have you considered
using the "slack" space (empty space) in an ordinary table "heap" page
to store share-locks on the tuples in that page?  (If not enough space
is available then you would still need to use the spilled-to-disk btree
structure.)

Maybe there could also be something that you put in the disk page that
means "transaction X has all the tuples in this page share-locked,"
since I imagine it will usually be the case that if very many of them
are locked, then they are all locked.

With this method, checking for a lock would be slightly more complicated
since you would need to check the disk page in which the tuple resides
first, and the spill-to-disk structure afterwards.

It seems that if this worked, you would be able to share-lock every
tuple in a table (or a large fraction of them) without a lot of extra IO,
provided that the pages were not 100% full.  (Which is almost never the
case in postgresql anyway.)

Paul Tillotson

Alvaro Herrera wrote:

>On Thu, Mar 31, 2005 at 06:54:31PM -0600, Guy Rouillier wrote:
>
>
>>Alvaro Herrera wrote:
>>
>>
>>>Now this can't be applied right away because it's easy to run "out of
>>>memory" (shared memory for the lock table).  Say, a delete or update
>>>that touches 10000 tuples does not work.  I'm currently working on a
>>>proposal to allow the lock table to spill to disk ...
>>>
>>>
>>While not always true, in many cases the cardinality of the referenced
>>(parent) table is small compared to that of the referencing (child)
>>table.  Does locking require a separate lock record for each tuple in
>>the child table, or just one for each tuple in the parent table with a
>>reference count?
>>
>>
>
>Just one.  (LOCALLOCK, which is private to each backend, stores how many
>times we hold a lock.)
>
>I just realized we not only need to be able to spill LOCK struct to
>disk, but also PROCLOCK ... am I right?
>
>
>


Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Fri, Apr 01, 2005 at 07:41:54PM -0500, Paul Tillotson wrote:

Hi,

> I suppose there must be reasons not to do this, but have you considered
> using the "slack" space (empty space) in an ordinary table "heap" page
> to store share-locks on the tuples in that page?  (If not enough space
> is available then you would still need to use the spilled-to-disk btree
> structure.)

No, I hadn't considered it.  However, this seems hard to do, because the
"slack space" does not have a start point; on each page, elements
(tuples) grow from the back to the front, while element pointers grow
in the reverse direction.  So I don't see how this would be done.

I've been discussing this with Bruce who has an idea completely
different from mine (basically to extend the current tuple locking
system instead of extending the lock manager).  I'm still unsure about
the whole thing.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Ni aun el genio muy grande llegaría muy lejos
si tuviera que sacarlo todo de su propio interior" (Goethe)

Re: Debugging deadlocks

From
Paul Tillotson
Date:
>>I suppose there must be reasons not to do this, but have you considered
>>using the "slack" space (empty space) in an ordinary table "heap" page
>>to store share-locks on the tuples in that page?  (If not enough space
>>is available then you would still need to use the spilled-to-disk btree
>>structure.)
>>
>>
>
>No, I hadn't considered it.  However this seems hard to do, because the
>"slack space" does not have a start point; on each page, elements
>(tuples) grow from the back to the front, while element pointers grow
>in the reverse direction.  So I don't see how would this be done.
>
>
Would it work for an updater, who finds that the locks list (currently
located in the middle of the empty space) is "in the way" of a new tuple
that he wants to insert, to take some kind of lock, move the whole list
up or down (spilling some of these locks to the disk if no more space is
available), and release it again?

Could the lock(s) on a particular tuple be tacked onto the end of that
tuple?  (I think tuples are stored variable-width anyway, aren't they?)
I suppose this is conceptually equivalent to adding a variable width
system column which gets TOASTED when it gets too wide.  I suppose this
would make taking a lock as expensive as doing an update, though, right?

Paul Tillotson


Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Fri, Apr 01, 2005 at 10:14:07PM -0500, Paul Tillotson wrote:

> Would it work for an updater, who finds that the locks list (currently
> located in the middle of the empty space) is "in the way" of a new tuple
> that he wants to insert, to take some kind of lock, move the whole list
> up or down (spilling some of these locks to the disk if no more space is
> available), and release it again?

Well, at that point you need to take a lock in order to be able to
manage locks.  Managing not to step on your own feet in that scenario
is complex, to say the least, if not downright impossible.

Another problem with this approach is that it would be practically
impossible for a process to release all its locks when it finishes.

No, we have to build this infrastructure purely using LWLocks, which
means, no normal relations involved.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)

Re: Debugging deadlocks

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Fri, Apr 01, 2005 at 10:14:07PM -0500, Paul Tillotson wrote:
>> ...

> Well, at that point you need to take a lock in order to be able to
> manage locks.  Managing not to step on your own feet in that scenario
> is complex, to say the least, if not downright impossible.

I looked at Paul's first message and thought "nah, that won't work
because ... because ... hmm ... hmmm ..."

We currently store tuple locks on the same page as the tuples (ie, in
the tuple headers) and need no extra locks to do so.  Certainly it
still has to have a spill mechanism, but the thought that is attractive
to me is that until you are forced to spill, you do not have to take any
system-wide lock, only a page-level lock.  So it could have very good
average performance.

> Another problem with this approach is that it would be practically
> impossible for a process to release all its locks when it finishes.

There is no explicit release of tuple-level locks in the current setup.
Need it be different in Paul's idea?

            regards, tom lane

Re: Debugging deadlocks

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > On Fri, Apr 01, 2005 at 10:14:07PM -0500, Paul Tillotson wrote:
> >> ...
>
> > Well, at that point you need to take a lock in order to be able to
> > manage locks.  Managing not to step on your own feet in that scenario
> > is complex, to say the least, if not downright impossible.
>
> I looked at Paul's first message and thought "nah, that won't work
> because ... because ... hmm ... hmmm ..."

For what it's worth, this would be very similar to how Oracle handles such
locks. In that case there's a fixed amount of space per page reserved in
advance for storing locks.

Actually I think it's a bit more complicated. But that's the idea.


--
greg

Re: Debugging deadlocks

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I looked at Paul's first message and thought "nah, that won't work
>> because ... because ... hmm ... hmmm ..."

> For what it's worth, this would be very similar to how Oracle handles such
> locks.

[ slightly alarmed ]  Do they have a patent on the way they do it?

> In that case there's a fixed amount of space per page reserved in
> advance for storing locks.

Paul's idea involved using a variable amount of space (ie, whatever's
free) ... which might be sufficient to escape whatever hypothetical
patent Oracle or someone else might hypothetically have on this idea.
Or not.  Pardon me for feeling a bit skittish at the moment.

            regards, tom lane

Re: Debugging deadlocks

From
Paul Tillotson
Date:
Greg Stark wrote:

>Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>
>>Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>>
>>
>>>On Fri, Apr 01, 2005 at 10:14:07PM -0500, Paul Tillotson wrote:
>>>
>>>
>>>>...
>>>>
>>>>
>>>Well, at that point you need to take a lock in order to be able to
>>>manage locks.  Managing not to step on your own feet in that scenario
>>>is complex, to say the least, if not downright impossible.
>>>
>>>
>>I looked at Paul's first message and thought "nah, that won't work
>>because ... because ... hmm ... hmmm ..."
>>
>>
>
>For what it's worth, this would be very similar to how Oracle handles such
>locks. In that case there's a fixed amount of space per page reserved in
>advance for storing locks.
>
>Actually I think it's a bit more complicated. But that's the idea.
>
>
While we're at it, does anyone know how DB2 and MSSQL server handle
row-level locks?

Paul

Re: Debugging deadlocks

From
Alvaro Herrera
Date:
On Fri, Apr 01, 2005 at 11:02:36PM -0500, Tom Lane wrote:

[Cc: to -hackers]

> We currently store tuple locks on the same page as the tuples (ie, in
> the tuple headers) and need no extra locks to do so.  Certainly it
> still has to have a spill mechanism, but the thought that is attractive
> to me is that until you are forced to spill, you do not have to take any
> system-wide lock, only a page-level lock.  So it could have very good
> average performance.

If we go this way maybe we should abandon the idea of using the standard
lock manager to lock tuples, which is what can be found in the patch I
posted.  Or maybe not, and just have the lock manager store the locks on
the page itself -- but then it will have to know about the buffer, so it
will in some sense break the opacity of the API between the two.

One possible way to do it would be having a OffsetNumber stored in the
page header, and if it's not InvalidOffsetNumber then it points to a
"tuple" that holds

struct
{
    OffsetNumber nextLock;  /* next lock on this page, or InvalidOffsetNumber */
    LOCK         lock;      /* the lock data itself */
};

So a locker would check the chain of locks and stop when it sees
InvalidOffsetNumber.

If there is no free space on the page, what should we do?  Store the
lock into the main hash table?

Another problem would be the XLog.  On heap operations, do we register
exactly where (position in the page) a tuple was stored, or just the
fact that it was stored?  If the latter, then there's no problem.  If
the former, then on the next REDO the records wouldn't match (==> PANIC)
-- unless we logged the lockings as well.

Reading the code I see we do log the offset numbers, so that's a problem
:-( ... maybe we could work around that by moving the pd_lower without
using line pointers; not sure if that's doable.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Porque Kim no hacia nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)

Re: Debugging deadlocks

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> I looked at Paul's first message and thought "nah, that won't work
> >> because ... because ... hmm ... hmmm ..."
>
> > For what it's worth, this would be very similar to how Oracle handles such
> > locks.
>
> [ slightly alarmed ]  Do they have a patent on the way they do it?

Do we really want to know?...

A few minutes searching on a patent search site doesn't turn up anything
relevant though. It might have been part of Oracle's design from such an early
stage that they never thought it was patentable. It's not clear to me that it
is for that matter. The general idea of storing locks with the objects being
locked isn't really anything novel.

--
greg

Re: Debugging deadlocks

From
Edmund Bacon
Date:
bruno@wolff.to (Bruno Wolff III) writes:

> Using domains is a good way to keep column constraints in just one place.
>

Speaking of domains, how do you find out what the range of a domain
is?

eg:

test=# create domain fruit as text
check( value in ('apple', 'orange', 'banana', 'pear'));
CREATE DOMAIN
test=# \dD fruit
         List of domains
 Schema | Name  | Type | Modifier
--------+-------+------+----------
 public | fruit | text |
(1 row)

test=# \dD+ fruit
         List of domains
 Schema | Name  | Type | Modifier
--------+-------+------+----------
 public | fruit | text |
(1 row)

A quick look through pg_catalog doesn't suggest anything that would
return the check conditions -- is there any way to do this?

--
Remove -42 for email

Re: Debugging deadlocks

From
Michael Fuhr
Date:
On Fri, Apr 01, 2005 at 09:27:40AM -0700, Edmund Bacon wrote:
>
> Speaking of domains, how do you find out what the range of a domain
> is?

I'm not sure if there's a better way, but this appears to work:

SELECT pg_get_constraintdef(oid, TRUE)
FROM pg_constraint
WHERE contypid = 'fruit'::regtype;
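
For the fruit domain above this returns the check expression, something
like the following (exact formatting varies by version; newer releases
print it as CHECK (VALUE = ANY (ARRAY[...]))):

 CHECK ((VALUE = 'apple'::text) OR (VALUE = 'orange'::text) OR (VALUE = 'banana'::text) OR (VALUE = 'pear'::text))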

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Debugging deadlocks

From
"Guy Rouillier"
Date:
frank@joerdens.de wrote:
> On Sun, Mar 27, 2005 at 12:54:28AM -0600, Guy Rouillier wrote: []
>> The service_plane table is a reference table, i.e., a fixed set of
>> values used only to validate foreign keys.  So the code doesn't have
>> any update statements on that table.
>
> An idea that just came up around here that sounds like a pretty neat
> workaround, which we're gonna try, is to drop the foreign key
> constraints and just use a check constraint for the allowed values.
> If the cardinality of the reference table is small, this is much
> faster than using foreign keys and solves your problem.  The drawback
> is that if you update the reference table (if you keep it), you
> mustn't forget to also update the check constraints, which now live
> in more than one place.

Frank and Bruno, thanks for the idea.  I'm going to give this a try.  I
tried the INITIALLY DEFERRED and I'm getting some strange duplicate key
errors.  This sounds like a better idea in this scenario, as our
reference table only has a dozen unique values.

--
Guy Rouillier