Thread: Deadlock. Referential Integrity checks select for update?

Deadlock. Referential Integrity checks select for update?

From
Grant McLean
Date:
Hi

First a simple question ...

I have a table "access_log" that has foreign keys "app_id" and
"app_user_id" that reference the "application_type" and "app_user"
tables.  When I insert into "access_log", the referential integrity
triggers generate these queries:

  SELECT 1 FROM ONLY "public"."application_type" x
  WHERE "app_id" = $1 FOR UPDATE OF x

  SELECT 1 FROM ONLY "public"."app_user" x
  WHERE "app_user_id" = $1 FOR UPDATE OF x

I don't understand why these two queries include "FOR UPDATE".  Can
anyone explain that?


That question may in fact be a red herring as far as my real problem
goes, so I'll describe that too ...

I have a web application in which transactions die with database
deadlocks under load.  I have isolated the problem to when two
different types of transaction occur in parallel:

  Transaction A:
    create a new record in "session" and get its ID then
    create a new record in "access_log" and get its ID

  Transaction B:
    retrieve an existing record from "session" using its ID then
    create a new record in "access_log" and get its ID

In practice, the initial select from session in transaction B doesn't
affect the deadlock, so I've omitted it here.

The theoretical parallel sequence of queries is:

  A1 insert_session         B1 insert_access_log
  A2 get_session_id         B2 get_access_log_id
  A3 insert_access_log      B3 commit
  A4 get_access_log_id
  A5 commit

The postgresql log show this is the sequence that causes deadlock:

  A1  insert_session
  riA  select from app_user for update
                            B1  insert_access_log
                            riB  select from app_type for update
                            riB  select from app_user for update
                            << Process B blocked >>
  A2  get_session_id
  A3  insert_access_log
  riA  select from app_type for update
                       << DEADLOCK >>
  << Process A killed >>

"riA" and "riB" refer to system-generated statements which check
referential integrity.

Neither transaction A nor transaction B ever updates either the
application_type or app_user tables and the deadlock only seems
to happen as a result of the referential integrity checks locking
the parent tables for update.  So it seems that even if I'm careful
to always do my updates in a consistent order to avoid deadlock,
the referential integrity constraints can cause deadlock themselves.


Here's the snippet from the postgresql log:

15:37:15 [20480] LOG:  statement: select getdatabaseencoding()
15:37:15 [20480] LOG:  statement: begin
15:37:15 [20481] LOG:  statement: select getdatabaseencoding()
15:37:15 [20481] LOG:  statement: begin
15:37:40 [20480] LOG:  statement: INSERT INTO session (app_user_id,
session_data) VALUES ('1', '')
15:37:40 [20480] LOG:  statement: SELECT 1 FROM ONLY "public"."app_user"
x WHERE "app_user_id" = $1 FOR UPDATE OF x

15:37:51 [20481] LOG:  statement:
  INSERT INTO access_log (app_id, app_user_id, log_data)
  VALUES ('1', '1', 'Log data here')

15:37:51 [20481] LOG:  statement: SELECT 1 FROM ONLY
"public"."application_type" x WHERE "app_id" = $1 FOR UPDATE OF x
15:37:51 [20481] LOG:  statement: SELECT 1 FROM ONLY "public"."app_user"
x WHERE "app_user_id" = $1 FOR UPDATE OF x

15:38:04 [20480] LOG:  statement: SELECT currval('seq_session_id')

15:38:17 [20480] LOG:  statement:
  INSERT INTO access_log (app_id, app_user_id, log_data)
  VALUES ('1', '1', 'Log data here')

15:38:17 [20480] LOG:  statement: SELECT 1 FROM ONLY
"public"."application_type" x WHERE "app_id" = $1 FOR UPDATE OF x

15:38:20 [20480] ERROR:  deadlock detected
DETAIL:  Process 20480 waits for ShareLock on transaction 74278; blocked
by process 20481.
    Process 20481 waits for ShareLock on transaction 74275; blocked by
process 20480.
15:38:20 [20480] LOG:  statement: rollback




And here's a short script to create the database:

#!/bin/sh

DBNAME=deadpool

if psql -l | grep -q $DBNAME
then
    dropdb $DBNAME || exit
fi

createdb $DBNAME

psql -d $DBNAME <<EOF

  create sequence seq_access_log_id;

  create sequence seq_session_id;

  create table app_user (
    app_user_id          INT4   not null,
    login_name           TEXT   not null,
    constraint PK_APP_USER primary key (app_user_id)
  );

  create table application_type (
    app_id               INT4   not null,
    app_desc             TEXT   not null,
    constraint PK_APPLICATION_TYPE primary key (app_id)
  );

  create table session (
    session_id           INT8   not null default
nextval('seq_session_id'),
    app_user_id          INT4   not null,
    session_data         TEXT   not null,
    constraint PK_SESSION primary key (session_id),
    constraint FK_SESS_REFERENCE_APP_ foreign key (app_user_id)
      references app_user (app_user_id)
      deferrable
  );

  create table access_log (
    access_log_id        INT8   not null default
nextval('seq_access_log_id'),
    app_id               INT4   not null,
    app_user_id          INT4   not null,
    log_data             TEXT   null,
    constraint PK_ACCESS_LOG primary key (access_log_id),
    constraint FK_ACCE_REFERENCE_APPL foreign key (app_id)
      references application_type (app_id)
      deferrable,
    constraint FK_ACCE_REFERENCE_APP_ foreign key (app_user_id)
      references app_user (app_user_id)
      deferrable
  );

  create index access_log_app_type_fk on access_log (
    app_id
  );

  create index access_log_app_user_id on access_log (
    app_user_id
  );

  insert into application_type values (1, 'Public Web App');

  insert into app_user values (1, 'anonymous');

EOF


I also have a Perl script that steps through the two transactions in the
appropriate order and causes the deadlock every time.


Any assistance anyone can offer would be appreciated.

Regards
Grant



Re: Deadlock. Referential Integrity checks select for update?

From
Tom Lane
Date:
Grant McLean <grant@catalyst.net.nz> writes:
> I don't understand why these two queries include "FOR UPDATE".

The point is to lock the referenced rows so they cannot be deleted
before the referencing transaction commits.  FOR UPDATE is an overly
strong lock, but we do not presently have any weaker lock that will
serve.  You can find lots more about this issue in the archives...

            regards, tom lane