Thread: pg_advisory_locks in a multithreaded application context

pg_advisory_locks in a multithreaded application context

From

Vincent Ficet

Date:

29 June 2011, 12:12:38

Hello,

I'm having some trouble trying to use postgresql locks. After trying
several options and reading the postgresql online documentation, I still
can't figure out what I'm doing wrong. Here's the use case:

A multithreaded application collecting adapter firmwares on a network
loads data into the following table:

CREATE TABLE firmware (
    id SERIAL NOT NULL,
    type CHARACTER VARYING(32),
    version CHARACTER VARYING(30),
    build_id INTEGER,
    date CHARACTER VARYING(25),
    ps_id CHARACTER VARYING(25)
);

Typically, there are a few hundred adapters, but only 5 firmwares (many
adapters should have the same firmware if the sysadmins did their jobs
properly ;-) ).

Only a single entry is required per firmware (many separate adapters can
share the same firmware by pointing to the appropriate firmware id field).
To make sure that only one entry is created per firmware, I use the
following trigger:

CREATE TRIGGER firmware_pre_insert_trigger
    BEFORE INSERT ON firmware
    FOR EACH ROW
    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();

CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
    AS $_$
DECLARE
    fw_id INT;
BEGIN

SELECT fw.id FROM firmware fw INTO fw_id
WHERE (fw.type = new.type AND
       fw.version = new.version AND
       fw.build_id = new.build_id AND
       fw.date = new.date AND
       fw.ps_id = new.ps_id);

IF fw_id IS NULL THEN
   -- create the non-existing firmware
   RETURN new;
ELSE
   -- skip firmware which already exists
   RETURN NULL;
END IF;

END;
$_$
    LANGUAGE PLPGSQL;

When a thread wishes to add a firmware after discovering one adapter, it
executes the following code:

PERFORM pg_advisory_lock(1);

INSERT INTO firmware (type, version, build_id, date, ps_id)
VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);

PERFORM pg_advisory_unlock(1);

Unfortunately, I still get duplicated entries using advisory locks, and
they don't seem to lock anything at all...

On the other hand, If I use builtin locks as follows without the
trigger, I get deadlocks in the server logs:

LOCK TABLE firmware IN SHARE MODE;

SELECT f.id INTO firmware_id FROM firmware f
WHERE (f.type = chip_type AND
       f.version = firm_version AND
       f.build_id = firm_build_id AND
       f.date = firm_date AND
       f.ps_id = firm_ps_id)
LIMIT 1;

IF firmware_id IS NULL THEN
   INSERT INTO firmware (type, version, build_id, date, ps_id)
   VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
END IF;

Here's the log:

DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060 of
database 20535; blocked by process 7593.
        Process 7593 waits for RowExclusiveLock on relation 21060 of
database 20535; blocked by process 7643.
        Process 7643: SELECT
add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
        Process 7593: SELECT
add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
HINT:  See server log for query details.
QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
           VALUES (chip_type, firm_version, firm_build_id, firm_date,
firm_ps_id)

Looking at #13.3.3. Deadlocks on
http://www.postgresql.org/docs/9.0/interactive/explicit-locking.html, I
guess I'm probably in the same situation, but I can't figure out how
that applies to threads and more importantly, how to deal with it.
Note that each of the 16 threads uses its own connection to the database
backend (postgresql 9.0.4)

Serializing calls to add_firmware() on the client side by putting them
in the same thread does not help much either, as there is not guarantee
that the requests will be executed in a serialized manner on the server
side. In this case, I also get deadlocks, although less frequently...

Any idea is more than welcome ;-)

Thanks

Vincent

Re: pg_advisory_locks in a multithreaded application context

From

Merlin Moncure

Date:

29 June 2011, 12:48:36

On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
<jean-vincent.ficet@bull.net> wrote:
> Hello,
>
> I'm having some trouble trying to use postgresql locks. After trying
> several options and reading the postgresql online documentation, I still
> can't figure out what I'm doing wrong. Here's the use case:
>
> A multithreaded application collecting adapter firmwares on a network
> loads data into the following table:
>
> CREATE TABLE firmware (
>    id SERIAL NOT NULL,
>    type CHARACTER VARYING(32),
>    version CHARACTER VARYING(30),
>    build_id INTEGER,
>    date CHARACTER VARYING(25),
>    ps_id CHARACTER VARYING(25)
> );
>
> Typically, there are a few hundred adapters, but only 5 firmwares (many
> adapters should have the same firmware if the sysadmins did their jobs
> properly ;-) ).
>
> Only a single entry is required per firmware (many separate adapters can
> share the same firmware by pointing to the appropriate firmware id field).
> To make sure that only one entry is created per firmware, I use the
> following trigger:
>
> CREATE TRIGGER firmware_pre_insert_trigger
>    BEFORE INSERT ON firmware
>    FOR EACH ROW
>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>
> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>    AS $_$
> DECLARE
>    fw_id INT;
> BEGIN
>
> SELECT fw.id FROM firmware fw INTO fw_id
> WHERE (fw.type = new.type AND
>       fw.version = new.version AND
>       fw.build_id = new.build_id AND
>       fw.date = new.date AND
>       fw.ps_id = new.ps_id);
>
> IF fw_id IS NULL THEN
>   -- create the non-existing firmware
>   RETURN new;
> ELSE
>   -- skip firmware which already exists
>   RETURN NULL;
> END IF;
>
> END;
> $_$
>    LANGUAGE PLPGSQL;
>
> When a thread wishes to add a firmware after discovering one adapter, it
> executes the following code:
>
> PERFORM pg_advisory_lock(1);
>
> INSERT INTO firmware (type, version, build_id, date, ps_id)
> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>
> PERFORM pg_advisory_unlock(1);

Advisory lock is not going to work here.  You are releasing the lock
before the transaction resolves and that leaves a window for  second
transaction to do the 'select' and not see the data because it hasn't
committed yet.

> Unfortunately, I still get duplicated entries using advisory locks, and
> they don't seem to lock anything at all...
>
> On the other hand, If I use builtin locks as follows without the
> trigger, I get deadlocks in the server logs:
>
> LOCK TABLE firmware IN SHARE MODE;

well a sharelock certainly isn't going to work.  share blocks row
exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
so two transactions can simultaneously get a share lock and wait for
each other to to resolve to get the exclusive lock on a row.
'EXCLUSIVE' would be better (although that would effectively serialize
the transactions).

merlin

Re: pg_advisory_locks in a multithreaded application context

From

Vincent Ficet

Date:

30 June 2011, 05:31:30

Merlin Moncure wrote:
> On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
> <jean-vincent.ficet@bull.net> wrote:
>
>> Hello,
>>
>> I'm having some trouble trying to use postgresql locks. After trying
>> several options and reading the postgresql online documentation, I still
>> can't figure out what I'm doing wrong. Here's the use case:
>>
>> A multithreaded application collecting adapter firmwares on a network
>> loads data into the following table:
>>
>> CREATE TABLE firmware (
>>    id SERIAL NOT NULL,
>>    type CHARACTER VARYING(32),
>>    version CHARACTER VARYING(30),
>>    build_id INTEGER,
>>    date CHARACTER VARYING(25),
>>    ps_id CHARACTER VARYING(25)
>> );
>>
>> Typically, there are a few hundred adapters, but only 5 firmwares (many
>> adapters should have the same firmware if the sysadmins did their jobs
>> properly ;-) ).
>>
>> Only a single entry is required per firmware (many separate adapters can
>> share the same firmware by pointing to the appropriate firmware id field).
>> To make sure that only one entry is created per firmware, I use the
>> following trigger:
>>
>> CREATE TRIGGER firmware_pre_insert_trigger
>>    BEFORE INSERT ON firmware
>>    FOR EACH ROW
>>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>>
>> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>>    AS $_$
>> DECLARE
>>    fw_id INT;
>> BEGIN
>>
>> SELECT fw.id FROM firmware fw INTO fw_id
>> WHERE (fw.type = new.type AND
>>       fw.version = new.version AND
>>       fw.build_id = new.build_id AND
>>       fw.date = new.date AND
>>       fw.ps_id = new.ps_id);
>>
>> IF fw_id IS NULL THEN
>>   -- create the non-existing firmware
>>   RETURN new;
>> ELSE
>>   -- skip firmware which already exists
>>   RETURN NULL;
>> END IF;
>>
>> END;
>> $_$
>>    LANGUAGE PLPGSQL;
>>
>> When a thread wishes to add a firmware after discovering one adapter, it
>> executes the following code:
>>
>> PERFORM pg_advisory_lock(1);
>>
>> INSERT INTO firmware (type, version, build_id, date, ps_id)
>> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>>
>> PERFORM pg_advisory_unlock(1);
>>
>
> Advisory lock is not going to work here.  You are releasing the lock
> before the transaction resolves and that leaves a window for  second
> transaction to do the 'select' and not see the data because it hasn't
> committed yet.
>
>
>> Unfortunately, I still get duplicated entries using advisory locks, and
>> they don't seem to lock anything at all...
>>
>> On the other hand, If I use builtin locks as follows without the
>> trigger, I get deadlocks in the server logs:
>>
>> LOCK TABLE firmware IN SHARE MODE;
>>
>
> well a sharelock certainly isn't going to work.  share blocks row
> exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
> so two transactions can simultaneously get a share lock and wait for
> each other to to resolve to get the exclusive lock on a row.
> 'EXCLUSIVE' would be better (although that would effectively serialize
> the transactions).
>
>
Thanks for the tip.

It now works fine using a SHARE UPDATE EXCLUSIVE lock in the PRE INSERT
trigger. This does not conflict with the  ROW EXCLUSIVE lock which is
implicitely taken by the INSERT statement in add_firmware() function.

BTW I think there might be a thread safety issue in postgres, as I often
get a segfault when deadlocks occur (prior to applying the fix I just
described). For example, the following deadlock situation:

    DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060
of database 20535; blocked by process 7593.
            Process 7593 waits for RowExclusiveLock on relation 21060 of
database 20535; blocked by process 7643.
            Process 7643: SELECT
add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
            Process 7593: SELECT
add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
    HINT:  See server log for query details.
    QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
               VALUES (chip_type, firm_version, firm_build_id,
firm_date, firm_ps_id)
    CONTEXT:  PL/pgSQL function "add_firmware" line 31 at SQL statement

Triggers:

#0  0x0000003c46725742 in __strncpy_ssse3 () from /lib64/libc.so.6
#1  0x00007ffff5f1617f in pqParseInput3 (conn=0x7ffff0000da0) at
fe-protocol3.c:209
#2  0x00007ffff5f0cae7 in parseInput (conn=0x7ffff0000da0) at fe-exec.c:1493
#3  0x00007ffff5f0cc01 in PQgetResult (conn=0x7ffff0000da0) at
fe-exec.c:1568
#4  0x00007ffff5f0d26b in PQexecFinish (conn=0x7ffff0000da0) at
fe-exec.c:1807
#5  0x00007ffff5f0cee4 in PQexec (conn=0x7ffff0000da0,
query=0x7ffff612d340 "SELECT 1") at fe-exec.c:1648
#6  0x00007ffff612c344 in dbd_ping () from
/home/vficet/X86_64/usr/lib/dbd/libdbdpgsql.so

Stack traces often occur in different flavours, but can always be
correlated with messages such as:

DBI error -9: unexpected field count in "D" message
DBI error -9: message contents do not agree with length in message type
"T" server sent data ("D" message) without prior row description ("T"
message)

I'll try to set up a minimalist reproducer and see what can be done for
that.
Note that this happens with both postgresql 8.4.7 and 9.0.4.

Cheers,

Vincent


> merlin
>
>

Re: pg_advisory_locks in a multithreaded application context

From

Merlin Moncure

Date:

30 June 2011, 10:33:51

On Thu, Jun 30, 2011 at 3:30 AM, Vincent Ficet
<jean-vincent.ficet@bull.net> wrote:
> Merlin Moncure wrote:
>> On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
>> <jean-vincent.ficet@bull.net> wrote:
>>
>>> Hello,
>>>
>>> I'm having some trouble trying to use postgresql locks. After trying
>>> several options and reading the postgresql online documentation, I still
>>> can't figure out what I'm doing wrong. Here's the use case:
>>>
>>> A multithreaded application collecting adapter firmwares on a network
>>> loads data into the following table:
>>>
>>> CREATE TABLE firmware (
>>>    id SERIAL NOT NULL,
>>>    type CHARACTER VARYING(32),
>>>    version CHARACTER VARYING(30),
>>>    build_id INTEGER,
>>>    date CHARACTER VARYING(25),
>>>    ps_id CHARACTER VARYING(25)
>>> );
>>>
>>> Typically, there are a few hundred adapters, but only 5 firmwares (many
>>> adapters should have the same firmware if the sysadmins did their jobs
>>> properly ;-) ).
>>>
>>> Only a single entry is required per firmware (many separate adapters can
>>> share the same firmware by pointing to the appropriate firmware id field).
>>> To make sure that only one entry is created per firmware, I use the
>>> following trigger:
>>>
>>> CREATE TRIGGER firmware_pre_insert_trigger
>>>    BEFORE INSERT ON firmware
>>>    FOR EACH ROW
>>>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>>>
>>> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>>>    AS $_$
>>> DECLARE
>>>    fw_id INT;
>>> BEGIN
>>>
>>> SELECT fw.id FROM firmware fw INTO fw_id
>>> WHERE (fw.type = new.type AND
>>>       fw.version = new.version AND
>>>       fw.build_id = new.build_id AND
>>>       fw.date = new.date AND
>>>       fw.ps_id = new.ps_id);
>>>
>>> IF fw_id IS NULL THEN
>>>   -- create the non-existing firmware
>>>   RETURN new;
>>> ELSE
>>>   -- skip firmware which already exists
>>>   RETURN NULL;
>>> END IF;
>>>
>>> END;
>>> $_$
>>>    LANGUAGE PLPGSQL;
>>>
>>> When a thread wishes to add a firmware after discovering one adapter, it
>>> executes the following code:
>>>
>>> PERFORM pg_advisory_lock(1);
>>>
>>> INSERT INTO firmware (type, version, build_id, date, ps_id)
>>> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>>>
>>> PERFORM pg_advisory_unlock(1);
>>>
>>
>> Advisory lock is not going to work here.  You are releasing the lock
>> before the transaction resolves and that leaves a window for  second
>> transaction to do the 'select' and not see the data because it hasn't
>> committed yet.
>>
>>
>>> Unfortunately, I still get duplicated entries using advisory locks, and
>>> they don't seem to lock anything at all...
>>>
>>> On the other hand, If I use builtin locks as follows without the
>>> trigger, I get deadlocks in the server logs:
>>>
>>> LOCK TABLE firmware IN SHARE MODE;
>>>
>>
>> well a sharelock certainly isn't going to work.  share blocks row
>> exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
>> so two transactions can simultaneously get a share lock and wait for
>> each other to to resolve to get the exclusive lock on a row.
>> 'EXCLUSIVE' would be better (although that would effectively serialize
>> the transactions).
>>
>>
> Thanks for the tip.
>
> It now works fine using a SHARE UPDATE EXCLUSIVE lock in the PRE INSERT
> trigger. This does not conflict with the  ROW EXCLUSIVE lock which is
> implicitely taken by the INSERT statement in add_firmware() function.
>
> BTW I think there might be a thread safety issue in postgres, as I often
> get a segfault when deadlocks occur (prior to applying the fix I just
> described). For example, the following deadlock situation:
>
>    DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060
> of database 20535; blocked by process 7593.
>            Process 7593 waits for RowExclusiveLock on relation 21060 of
> database 20535; blocked by process 7643.
>            Process 7643: SELECT
> add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>            Process 7593: SELECT
> add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>    HINT:  See server log for query details.
>    QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
>               VALUES (chip_type, firm_version, firm_build_id,
> firm_date, firm_ps_id)
>    CONTEXT:  PL/pgSQL function "add_firmware" line 31 at SQL statement
>
> Triggers:
>
> #0  0x0000003c46725742 in __strncpy_ssse3 () from /lib64/libc.so.6
> #1  0x00007ffff5f1617f in pqParseInput3 (conn=0x7ffff0000da0) at
> fe-protocol3.c:209
> #2  0x00007ffff5f0cae7 in parseInput (conn=0x7ffff0000da0) at fe-exec.c:1493
> #3  0x00007ffff5f0cc01 in PQgetResult (conn=0x7ffff0000da0) at
> fe-exec.c:1568
> #4  0x00007ffff5f0d26b in PQexecFinish (conn=0x7ffff0000da0) at
> fe-exec.c:1807
> #5  0x00007ffff5f0cee4 in PQexec (conn=0x7ffff0000da0,
> query=0x7ffff612d340 "SELECT 1") at fe-exec.c:1648
> #6  0x00007ffff612c344 in dbd_ping () from
> /home/vficet/X86_64/usr/lib/dbd/libdbdpgsql.so
>
> Stack traces often occur in different flavours, but can always be
> correlated with messages such as:
>
> DBI error -9: unexpected field count in "D" message
> DBI error -9: message contents do not agree with length in message type
> "T" server sent data ("D" message) without prior row description ("T"
> message)
>
> I'll try to set up a minimalist reproducer and see what can be done for
> that.
> Note that this happens with both postgresql 8.4.7 and 9.0.4.

is the segfault happening in the client? it looks like you have a
thread safety issue either in dbd or in your code...

merlin

Re: pg_advisory_locks in a multithreaded application context

From

Vincent Ficet

Date:

30 June 2011, 10:50:39

Merlin Moncure wrote:
> On Thu, Jun 30, 2011 at 3:30 AM, Vincent Ficet
> <jean-vincent.ficet@bull.net> wrote:
>
>> Merlin Moncure wrote:
>>
>>> On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
>>> <jean-vincent.ficet@bull.net> wrote:
>>>
>>>
>>>> Hello,
>>>>
>>>> I'm having some trouble trying to use postgresql locks. After trying
>>>> several options and reading the postgresql online documentation, I still
>>>> can't figure out what I'm doing wrong. Here's the use case:
>>>>
>>>> A multithreaded application collecting adapter firmwares on a network
>>>> loads data into the following table:
>>>>
>>>> CREATE TABLE firmware (
>>>>    id SERIAL NOT NULL,
>>>>    type CHARACTER VARYING(32),
>>>>    version CHARACTER VARYING(30),
>>>>    build_id INTEGER,
>>>>    date CHARACTER VARYING(25),
>>>>    ps_id CHARACTER VARYING(25)
>>>> );
>>>>
>>>> Typically, there are a few hundred adapters, but only 5 firmwares (many
>>>> adapters should have the same firmware if the sysadmins did their jobs
>>>> properly ;-) ).
>>>>
>>>> Only a single entry is required per firmware (many separate adapters can
>>>> share the same firmware by pointing to the appropriate firmware id field).
>>>> To make sure that only one entry is created per firmware, I use the
>>>> following trigger:
>>>>
>>>> CREATE TRIGGER firmware_pre_insert_trigger
>>>>    BEFORE INSERT ON firmware
>>>>    FOR EACH ROW
>>>>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>>>>
>>>> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>>>>    AS $_$
>>>> DECLARE
>>>>    fw_id INT;
>>>> BEGIN
>>>>
>>>> SELECT fw.id FROM firmware fw INTO fw_id
>>>> WHERE (fw.type = new.type AND
>>>>       fw.version = new.version AND
>>>>       fw.build_id = new.build_id AND
>>>>       fw.date = new.date AND
>>>>       fw.ps_id = new.ps_id);
>>>>
>>>> IF fw_id IS NULL THEN
>>>>   -- create the non-existing firmware
>>>>   RETURN new;
>>>> ELSE
>>>>   -- skip firmware which already exists
>>>>   RETURN NULL;
>>>> END IF;
>>>>
>>>> END;
>>>> $_$
>>>>    LANGUAGE PLPGSQL;
>>>>
>>>> When a thread wishes to add a firmware after discovering one adapter, it
>>>> executes the following code:
>>>>
>>>> PERFORM pg_advisory_lock(1);
>>>>
>>>> INSERT INTO firmware (type, version, build_id, date, ps_id)
>>>> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>>>>
>>>> PERFORM pg_advisory_unlock(1);
>>>>
>>>>
>>> Advisory lock is not going to work here.  You are releasing the lock
>>> before the transaction resolves and that leaves a window for  second
>>> transaction to do the 'select' and not see the data because it hasn't
>>> committed yet.
>>>
>>>
>>>
>>>> Unfortunately, I still get duplicated entries using advisory locks, and
>>>> they don't seem to lock anything at all...
>>>>
>>>> On the other hand, If I use builtin locks as follows without the
>>>> trigger, I get deadlocks in the server logs:
>>>>
>>>> LOCK TABLE firmware IN SHARE MODE;
>>>>
>>>>
>>> well a sharelock certainly isn't going to work.  share blocks row
>>> exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
>>> so two transactions can simultaneously get a share lock and wait for
>>> each other to to resolve to get the exclusive lock on a row.
>>> 'EXCLUSIVE' would be better (although that would effectively serialize
>>> the transactions).
>>>
>>>
>>>
>> Thanks for the tip.
>>
>> It now works fine using a SHARE UPDATE EXCLUSIVE lock in the PRE INSERT
>> trigger. This does not conflict with the  ROW EXCLUSIVE lock which is
>> implicitely taken by the INSERT statement in add_firmware() function.
>>
>> BTW I think there might be a thread safety issue in postgres, as I often
>> get a segfault when deadlocks occur (prior to applying the fix I just
>> described). For example, the following deadlock situation:
>>
>>    DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060
>> of database 20535; blocked by process 7593.
>>            Process 7593 waits for RowExclusiveLock on relation 21060 of
>> database 20535; blocked by process 7643.
>>            Process 7643: SELECT
>> add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>            Process 7593: SELECT
>> add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>    HINT:  See server log for query details.
>>    QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
>>               VALUES (chip_type, firm_version, firm_build_id,
>> firm_date, firm_ps_id)
>>    CONTEXT:  PL/pgSQL function "add_firmware" line 31 at SQL statement
>>
>> Triggers:
>>
>> #0  0x0000003c46725742 in __strncpy_ssse3 () from /lib64/libc.so.6
>> #1  0x00007ffff5f1617f in pqParseInput3 (conn=0x7ffff0000da0) at
>> fe-protocol3.c:209
>> #2  0x00007ffff5f0cae7 in parseInput (conn=0x7ffff0000da0) at fe-exec.c:1493
>> #3  0x00007ffff5f0cc01 in PQgetResult (conn=0x7ffff0000da0) at
>> fe-exec.c:1568
>> #4  0x00007ffff5f0d26b in PQexecFinish (conn=0x7ffff0000da0) at
>> fe-exec.c:1807
>> #5  0x00007ffff5f0cee4 in PQexec (conn=0x7ffff0000da0,
>> query=0x7ffff612d340 "SELECT 1") at fe-exec.c:1648
>> #6  0x00007ffff612c344 in dbd_ping () from
>> /home/vficet/X86_64/usr/lib/dbd/libdbdpgsql.so
>>
>> Stack traces often occur in different flavours, but can always be
>> correlated with messages such as:
>>
>> DBI error -9: unexpected field count in "D" message
>> DBI error -9: message contents do not agree with length in message type
>> "T" server sent data ("D" message) without prior row description ("T"
>> message)
>>
>> I'll try to set up a minimalist reproducer and see what can be done for
>> that.
>> Note that this happens with both postgresql 8.4.7 and 9.0.4.
>>
>
> is the segfault happening in the client?
Yes, it's on the client side.
> it looks like you have a
> thread safety issue either in dbd or in your code...
>
OK, I'll try to set up a minimalist reproducer. Is there a postgresql
non regression test to check for thread safety on the client side ? What
do the postgresql developpers use for that ?

Cheers.

Vincent
> merlin
>
>

Re: pg_advisory_locks in a multithreaded application context

From

Merlin Moncure

Date:

30 June 2011, 18:06:59

On Thu, Jun 30, 2011 at 8:49 AM, Vincent Ficet
<jean-vincent.ficet@bull.net> wrote:
> Merlin Moncure wrote:
>> On Thu, Jun 30, 2011 at 3:30 AM, Vincent Ficet
>> <jean-vincent.ficet@bull.net> wrote:
>>
>>> Merlin Moncure wrote:
>>>
>>>> On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
>>>> <jean-vincent.ficet@bull.net> wrote:
>>>>
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm having some trouble trying to use postgresql locks. After trying
>>>>> several options and reading the postgresql online documentation, I still
>>>>> can't figure out what I'm doing wrong. Here's the use case:
>>>>>
>>>>> A multithreaded application collecting adapter firmwares on a network
>>>>> loads data into the following table:
>>>>>
>>>>> CREATE TABLE firmware (
>>>>>    id SERIAL NOT NULL,
>>>>>    type CHARACTER VARYING(32),
>>>>>    version CHARACTER VARYING(30),
>>>>>    build_id INTEGER,
>>>>>    date CHARACTER VARYING(25),
>>>>>    ps_id CHARACTER VARYING(25)
>>>>> );
>>>>>
>>>>> Typically, there are a few hundred adapters, but only 5 firmwares (many
>>>>> adapters should have the same firmware if the sysadmins did their jobs
>>>>> properly ;-) ).
>>>>>
>>>>> Only a single entry is required per firmware (many separate adapters can
>>>>> share the same firmware by pointing to the appropriate firmware id field).
>>>>> To make sure that only one entry is created per firmware, I use the
>>>>> following trigger:
>>>>>
>>>>> CREATE TRIGGER firmware_pre_insert_trigger
>>>>>    BEFORE INSERT ON firmware
>>>>>    FOR EACH ROW
>>>>>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>>>>>
>>>>> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>>>>>    AS $_$
>>>>> DECLARE
>>>>>    fw_id INT;
>>>>> BEGIN
>>>>>
>>>>> SELECT fw.id FROM firmware fw INTO fw_id
>>>>> WHERE (fw.type = new.type AND
>>>>>       fw.version = new.version AND
>>>>>       fw.build_id = new.build_id AND
>>>>>       fw.date = new.date AND
>>>>>       fw.ps_id = new.ps_id);
>>>>>
>>>>> IF fw_id IS NULL THEN
>>>>>   -- create the non-existing firmware
>>>>>   RETURN new;
>>>>> ELSE
>>>>>   -- skip firmware which already exists
>>>>>   RETURN NULL;
>>>>> END IF;
>>>>>
>>>>> END;
>>>>> $_$
>>>>>    LANGUAGE PLPGSQL;
>>>>>
>>>>> When a thread wishes to add a firmware after discovering one adapter, it
>>>>> executes the following code:
>>>>>
>>>>> PERFORM pg_advisory_lock(1);
>>>>>
>>>>> INSERT INTO firmware (type, version, build_id, date, ps_id)
>>>>> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>>>>>
>>>>> PERFORM pg_advisory_unlock(1);
>>>>>
>>>>>
>>>> Advisory lock is not going to work here.  You are releasing the lock
>>>> before the transaction resolves and that leaves a window for  second
>>>> transaction to do the 'select' and not see the data because it hasn't
>>>> committed yet.
>>>>
>>>>
>>>>
>>>>> Unfortunately, I still get duplicated entries using advisory locks, and
>>>>> they don't seem to lock anything at all...
>>>>>
>>>>> On the other hand, If I use builtin locks as follows without the
>>>>> trigger, I get deadlocks in the server logs:
>>>>>
>>>>> LOCK TABLE firmware IN SHARE MODE;
>>>>>
>>>>>
>>>> well a sharelock certainly isn't going to work.  share blocks row
>>>> exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
>>>> so two transactions can simultaneously get a share lock and wait for
>>>> each other to to resolve to get the exclusive lock on a row.
>>>> 'EXCLUSIVE' would be better (although that would effectively serialize
>>>> the transactions).
>>>>
>>>>
>>>>
>>> Thanks for the tip.
>>>
>>> It now works fine using a SHARE UPDATE EXCLUSIVE lock in the PRE INSERT
>>> trigger. This does not conflict with the  ROW EXCLUSIVE lock which is
>>> implicitely taken by the INSERT statement in add_firmware() function.
>>>
>>> BTW I think there might be a thread safety issue in postgres, as I often
>>> get a segfault when deadlocks occur (prior to applying the fix I just
>>> described). For example, the following deadlock situation:
>>>
>>>    DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060
>>> of database 20535; blocked by process 7593.
>>>            Process 7593 waits for RowExclusiveLock on relation 21060 of
>>> database 20535; blocked by process 7643.
>>>            Process 7643: SELECT
>>> add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>>            Process 7593: SELECT
>>> add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>>    HINT:  See server log for query details.
>>>    QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
>>>               VALUES (chip_type, firm_version, firm_build_id,
>>> firm_date, firm_ps_id)
>>>    CONTEXT:  PL/pgSQL function "add_firmware" line 31 at SQL statement
>>>
>>> Triggers:
>>>
>>> #0  0x0000003c46725742 in __strncpy_ssse3 () from /lib64/libc.so.6
>>> #1  0x00007ffff5f1617f in pqParseInput3 (conn=0x7ffff0000da0) at
>>> fe-protocol3.c:209
>>> #2  0x00007ffff5f0cae7 in parseInput (conn=0x7ffff0000da0) at fe-exec.c:1493
>>> #3  0x00007ffff5f0cc01 in PQgetResult (conn=0x7ffff0000da0) at
>>> fe-exec.c:1568
>>> #4  0x00007ffff5f0d26b in PQexecFinish (conn=0x7ffff0000da0) at
>>> fe-exec.c:1807
>>> #5  0x00007ffff5f0cee4 in PQexec (conn=0x7ffff0000da0,
>>> query=0x7ffff612d340 "SELECT 1") at fe-exec.c:1648
>>> #6  0x00007ffff612c344 in dbd_ping () from
>>> /home/vficet/X86_64/usr/lib/dbd/libdbdpgsql.so
>>>
>>> Stack traces often occur in different flavours, but can always be
>>> correlated with messages such as:
>>>
>>> DBI error -9: unexpected field count in "D" message
>>> DBI error -9: message contents do not agree with length in message type
>>> "T" server sent data ("D" message) without prior row description ("T"
>>> message)
>>>
>>> I'll try to set up a minimalist reproducer and see what can be done for
>>> that.
>>> Note that this happens with both postgresql 8.4.7 and 9.0.4.
>>>
>>
>> is the segfault happening in the client?
> Yes, it's on the client side.
>> it looks like you have a
>> thread safety issue either in dbd or in your code...
>>
> OK, I'll try to set up a minimalist reproducer. Is there a postgresql
> non regression test to check for thread safety on the client side ? What
> do the postgresql developpers use for that ?


well, you mainly need to make sure libpq is built for thread safety
(ldd libpq and check for libpthread) and that the code using libpq is
not accessing the same connection from > 1 thread at one time -- it's
up to you to synchronize access to a connection among threads.

merlin

Re: pg_advisory_locks in a multithreaded application context

From

Vincent Ficet

Date:

05 July 2011, 04:14:06

Hello Merlin,

Merlin Moncure wrote:
> On Thu, Jun 30, 2011 at 8:49 AM, Vincent Ficet
> <jean-vincent.ficet@bull.net> wrote:
>
>> Merlin Moncure wrote:
>>
>>> On Thu, Jun 30, 2011 at 3:30 AM, Vincent Ficet
>>> <jean-vincent.ficet@bull.net> wrote:
>>>
>>>
>>>> Merlin Moncure wrote:
>>>>
>>>>
>>>>> On Wed, Jun 29, 2011 at 10:11 AM, Vincent Ficet
>>>>> <jean-vincent.ficet@bull.net> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm having some trouble trying to use postgresql locks. After trying
>>>>>> several options and reading the postgresql online documentation, I still
>>>>>> can't figure out what I'm doing wrong. Here's the use case:
>>>>>>
>>>>>> A multithreaded application collecting adapter firmwares on a network
>>>>>> loads data into the following table:
>>>>>>
>>>>>> CREATE TABLE firmware (
>>>>>>    id SERIAL NOT NULL,
>>>>>>    type CHARACTER VARYING(32),
>>>>>>    version CHARACTER VARYING(30),
>>>>>>    build_id INTEGER,
>>>>>>    date CHARACTER VARYING(25),
>>>>>>    ps_id CHARACTER VARYING(25)
>>>>>> );
>>>>>>
>>>>>> Typically, there are a few hundred adapters, but only 5 firmwares (many
>>>>>> adapters should have the same firmware if the sysadmins did their jobs
>>>>>> properly ;-) ).
>>>>>>
>>>>>> Only a single entry is required per firmware (many separate adapters can
>>>>>> share the same firmware by pointing to the appropriate firmware id field).
>>>>>> To make sure that only one entry is created per firmware, I use the
>>>>>> following trigger:
>>>>>>
>>>>>> CREATE TRIGGER firmware_pre_insert_trigger
>>>>>>    BEFORE INSERT ON firmware
>>>>>>    FOR EACH ROW
>>>>>>    EXECUTE PROCEDURE firmware_pre_insert_trigger_cb();
>>>>>>
>>>>>> CREATE FUNCTION firmware_pre_insert_trigger_cb() RETURNS TRIGGER
>>>>>>    AS $_$
>>>>>> DECLARE
>>>>>>    fw_id INT;
>>>>>> BEGIN
>>>>>>
>>>>>> SELECT fw.id FROM firmware fw INTO fw_id
>>>>>> WHERE (fw.type = new.type AND
>>>>>>       fw.version = new.version AND
>>>>>>       fw.build_id = new.build_id AND
>>>>>>       fw.date = new.date AND
>>>>>>       fw.ps_id = new.ps_id);
>>>>>>
>>>>>> IF fw_id IS NULL THEN
>>>>>>   -- create the non-existing firmware
>>>>>>   RETURN new;
>>>>>> ELSE
>>>>>>   -- skip firmware which already exists
>>>>>>   RETURN NULL;
>>>>>> END IF;
>>>>>>
>>>>>> END;
>>>>>> $_$
>>>>>>    LANGUAGE PLPGSQL;
>>>>>>
>>>>>> When a thread wishes to add a firmware after discovering one adapter, it
>>>>>> executes the following code:
>>>>>>
>>>>>> PERFORM pg_advisory_lock(1);
>>>>>>
>>>>>> INSERT INTO firmware (type, version, build_id, date, ps_id)
>>>>>> VALUES (chip_type, firm_version, firm_build_id, firm_date, firm_ps_id);
>>>>>>
>>>>>> PERFORM pg_advisory_unlock(1);
>>>>>>
>>>>>>
>>>>>>
>>>>> Advisory lock is not going to work here.  You are releasing the lock
>>>>> before the transaction resolves and that leaves a window for  second
>>>>> transaction to do the 'select' and not see the data because it hasn't
>>>>> committed yet.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Unfortunately, I still get duplicated entries using advisory locks, and
>>>>>> they don't seem to lock anything at all...
>>>>>>
>>>>>> On the other hand, If I use builtin locks as follows without the
>>>>>> trigger, I get deadlocks in the server logs:
>>>>>>
>>>>>> LOCK TABLE firmware IN SHARE MODE;
>>>>>>
>>>>>>
>>>>>>
>>>>> well a sharelock certainly isn't going to work.  share blocks row
>>>>> exclusive (see http://www.postgresql.org/docs/current/static/explicit-locking.html#LOCKING-TABLES),
>>>>> so two transactions can simultaneously get a share lock and wait for
>>>>> each other to to resolve to get the exclusive lock on a row.
>>>>> 'EXCLUSIVE' would be better (although that would effectively serialize
>>>>> the transactions).
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Thanks for the tip.
>>>>
>>>> It now works fine using a SHARE UPDATE EXCLUSIVE lock in the PRE INSERT
>>>> trigger. This does not conflict with the  ROW EXCLUSIVE lock which is
>>>> implicitely taken by the INSERT statement in add_firmware() function.
>>>>
>>>> BTW I think there might be a thread safety issue in postgres, as I often
>>>> get a segfault when deadlocks occur (prior to applying the fix I just
>>>> described). For example, the following deadlock situation:
>>>>
>>>>    DETAIL:  Process 7643 waits for RowExclusiveLock on relation 21060
>>>> of database 20535; blocked by process 7593.
>>>>            Process 7593 waits for RowExclusiveLock on relation 21060 of
>>>> database 20535; blocked by process 7643.
>>>>            Process 7643: SELECT
>>>> add_firmware('0x08003800013731aa','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>>>            Process 7593: SELECT
>>>> add_firmware('0x08003800013734b0','hca','512.1792.0',0,'0920-10-06','BL_0010030001000');
>>>>    HINT:  See server log for query details.
>>>>    QUERY:  INSERT INTO firmware (type, version, build_id, date, ps_id)
>>>>               VALUES (chip_type, firm_version, firm_build_id,
>>>> firm_date, firm_ps_id)
>>>>    CONTEXT:  PL/pgSQL function "add_firmware" line 31 at SQL statement
>>>>
>>>> Triggers:
>>>>
>>>> #0  0x0000003c46725742 in __strncpy_ssse3 () from /lib64/libc.so.6
>>>> #1  0x00007ffff5f1617f in pqParseInput3 (conn=0x7ffff0000da0) at
>>>> fe-protocol3.c:209
>>>> #2  0x00007ffff5f0cae7 in parseInput (conn=0x7ffff0000da0) at fe-exec.c:1493
>>>> #3  0x00007ffff5f0cc01 in PQgetResult (conn=0x7ffff0000da0) at
>>>> fe-exec.c:1568
>>>> #4  0x00007ffff5f0d26b in PQexecFinish (conn=0x7ffff0000da0) at
>>>> fe-exec.c:1807
>>>> #5  0x00007ffff5f0cee4 in PQexec (conn=0x7ffff0000da0,
>>>> query=0x7ffff612d340 "SELECT 1") at fe-exec.c:1648
>>>> #6  0x00007ffff612c344 in dbd_ping () from
>>>> /home/vficet/X86_64/usr/lib/dbd/libdbdpgsql.so
>>>>
>>>> Stack traces often occur in different flavours, but can always be
>>>> correlated with messages such as:
>>>>
>>>> DBI error -9: unexpected field count in "D" message
>>>> DBI error -9: message contents do not agree with length in message type
>>>> "T" server sent data ("D" message) without prior row description ("T"
>>>> message)
>>>>
>>>> I'll try to set up a minimalist reproducer and see what can be done for
>>>> that.
>>>> Note that this happens with both postgresql 8.4.7 and 9.0.4.
>>>>
>>>>
>>> is the segfault happening in the client?
>>>
>> Yes, it's on the client side.
>>
>>> it looks like you have a
>>> thread safety issue either in dbd or in your code...
>>>
>>>
>> OK, I'll try to set up a minimalist reproducer. Is there a postgresql
>> non regression test to check for thread safety on the client side ? What
>> do the postgresql developpers use for that ?
>>
>
>
> well, you mainly need to make sure libpq is built for thread safety
> (ldd libpq and check for libpthread) and that the code using libpq is
> not accessing the same connection from > 1 thread at one time -- it's
> up to you to synchronize access to a connection among threads.
>
Yes you were right, this bug was caused by several threads accessing the
same connection in some error handling parts of our code, hence the
segfault.

Thanks for your help in this matter,

Cheers,

Vincent
> merlin
>
>