Thread: Duplicate key violation

Duplicate key violation

From
Brian Wipf
Date:
I got a duplicate key violation when the following query was performed:

INSERT INTO category_product_visible (category_id, product_id)
        SELECT     cp.category_id, cp.product_id
        FROM     category_product cp
        WHERE     cp.product_id = $1 AND
            not exists (
                select     'x'
                from     category_product_visible cpv
                where     cpv.product_id = cp.product_id and
                    cpv.category_id = cp.category_id
            );

This is despite the fact the insert is written to only insert rows
that do not already exist. The second time the same query was run it
went through okay. This makes me think there is some kind of race
condition, which I didn't think was possible with PostgreSQL's MVCC
implementation. I'm unable to duplicate the problem now and the error
only occurred once in weeks of use. This is on PostgreSQL 8.2.1
running on openSUSE Linux 10.2. Slony-I 1.2.6 is being used for
replication to a single slave database.

I'll monitor the problem and if it recurs, I'll rebuild the primary
key index. Perhaps the category_product_visible_pkey index was/is
corrupted in some way.

Brian Wipf
<brian@clickspace.com>


The exact error was:
select process_pending_changes(); FAILED!!! Message: ERROR: duplicate
key violates unique constraint "category_product_visible_pkey"
CONTEXT: SQL statement "INSERT INTO category_product_visible
(category_id, product_id) SELECT cp.category_id, cp.product_id FROM
category_product cp WHERE cp.product_id = $1 AND not exists ( select
'x from category_product_visible cpv where cpv.product_id =
cp.product_id an cpv.category_id = cp.category_id);" PL/pgSQL
function "insert_cpv" line 3 at SQL statement PL/pgSQL function
"process_mp_change" line 15 at assignment PL/pgSQL function
"process_pending_changes" line 13 at assignment

The insert_cpv(...) function and table definitions follow. I can
provide any other information required.

CREATE FUNCTION insert_cpv(
    my_product_id     int
) RETURNS boolean AS $$
DECLARE
BEGIN
    INSERT INTO category_product_visible (category_id, product_id)
        SELECT     cp.category_id, cp.product_id
        FROM     category_product cp
        WHERE     cp.product_id = $1 AND
            not exists (
                select     'x'
                from     category_product_visible cpv
                where     cpv.product_id = cp.product_id and
                    cpv.category_id = cp.category_id
            );
    return found;
END;
$$ LANGUAGE plpgSQL;

\d category_product
   Table "public.category_product"
    Column    |  Type   | Modifiers
-------------+---------+-----------
category_id | integer | not null
product_id  | integer | not null
Indexes:
     "x_category_product_pk" PRIMARY KEY, btree (category_id,
product_id)
     "x_category_product__category_id_fk_idx" btree (category_id)
     "x_category_product__product_id_fk_idx" btree (product_id)
Foreign-key constraints:
     "x_category_product_category_fk" FOREIGN KEY (category_id)
REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
     "x_category_product_product_fk" FOREIGN KEY (product_id)
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED
Triggers:
     _ssprod_replication_cluster_logtrigger_17 AFTER INSERT OR DELETE
OR UPDATE ON category_product FOR EACH ROW EXECUTE PROCEDURE
_ssprod_replication_cluster.logtrigger('_ssprod_replication_cluster',
'17', 'kk')
     category_product_trigger BEFORE INSERT OR DELETE ON
category_product FOR EACH ROW EXECUTE PROCEDURE
category_product_trigger()

\d category_product_visible
          Table "public.category_product_visible"
        Column        |          Type          | Modifiers
---------------------+------------------------+-----------
category_id         | integer                | not null
product_id          | integer                | not null
Indexes:
     "category_product_visible_pkey" PRIMARY KEY, btree (category_id,
product_id)
     "category_product_visible__product_id_fk_idx" btree (product_id)
Triggers:
     _ssprod_replication_cluster_logtrigger_18 AFTER INSERT OR DELETE
OR UPDATE ON category_product_visible FOR EACH ROW EXECUTE PROCEDURE
_ssprod_replication_cluster.logtrigger('_ssprod_replication_cluster',
'18', 'kvvvvvkvvvvvv')


Re: Duplicate key violation

From
Tom Lane
Date:
Brian Wipf <brian@clickspace.com> writes:
> I got a duplicate key violation when the following query was performed:
> INSERT INTO category_product_visible (category_id, product_id)
>         SELECT     cp.category_id, cp.product_id
>         FROM     category_product cp
>         WHERE     cp.product_id = $1 AND
>             not exists (
>                 select     'x'
>                 from     category_product_visible cpv
>                 where     cpv.product_id = cp.product_id and
>                     cpv.category_id = cp.category_id
>             );

> This is despite the fact the insert is written to only insert rows
> that do not already exist. The second time the same query was run it
> went through okay. This makes me think there is some kind of race
> condition, which I didn't think was possible with PostgreSQL's MVCC
> implementation.

If you're doing more than one of these concurrently, then of course
there's a race condition: the NOT EXISTS is testing for nonexistence
as of the query snapshot.  If two sessions do this concurrently then
they'll try to insert the same rows and one of them is gonna fail.

            regards, tom lane

Re: Duplicate key violation

From
"Adam Rich"
Date:
Sounds like you'll either need an explicit "LOCK TABLE"
command, set your transaction isolation to serializable,
or use advisory locking.

http://www.postgresql.org/docs/8.2/interactive/explicit-locking.html#LOC
KING-TABLES
http://www.postgresql.org/docs/8.2/interactive/transaction-iso.html#XACT
-SERIALIZABLE
http://www.postgresql.org/docs/8.2/interactive/functions-admin.html#FUNC
TIONS-ADVISORY-LOCKS



-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Thursday, January 25, 2007 6:21 PM
To: Brian Wipf
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Duplicate key violation


Brian Wipf <brian@clickspace.com> writes:
> I got a duplicate key violation when the following query was
performed:
> INSERT INTO category_product_visible (category_id, product_id)
>         SELECT     cp.category_id, cp.product_id
>         FROM     category_product cp
>         WHERE     cp.product_id = $1 AND
>             not exists (
>                 select     'x'
>                 from     category_product_visible cpv
>                 where     cpv.product_id = cp.product_id
and
>                     cpv.category_id = cp.category_id
>             );

> This is despite the fact the insert is written to only insert rows
> that do not already exist. The second time the same query was run it
> went through okay. This makes me think there is some kind of race
> condition, which I didn't think was possible with PostgreSQL's MVCC
> implementation.

If you're doing more than one of these concurrently, then of course
there's a race condition: the NOT EXISTS is testing for nonexistence
as of the query snapshot.  If two sessions do this concurrently then
they'll try to insert the same rows and one of them is gonna fail.

            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


Re: Duplicate key violation

From
"Joris Dobbelsteen"
Date:
>-----Original Message-----
>From: pgsql-general-owner@postgresql.org
>[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Wipf
>Sent: donderdag 25 januari 2007 22:42
>To: pgsql-general@postgresql.org
>Subject: [GENERAL] Duplicate key violation
>
>I got a duplicate key violation when the following query was performed:
>
>INSERT INTO category_product_visible (category_id, product_id)
>        SELECT     cp.category_id, cp.product_id
>        FROM     category_product cp
>        WHERE     cp.product_id = $1 AND
>            not exists (
>                select     'x'
>                from     category_product_visible cpv
>                where     cpv.product_id =
>cp.product_id and
>                    cpv.category_id = cp.category_id
>            );
>
>This is despite the fact the insert is written to only insert
>rows that do not already exist. The second time the same query
>was run it went through okay. This makes me think there is
>some kind of race condition, which I didn't think was possible
>with PostgreSQL's MVCC implementation. I'm unable to duplicate
>the problem now and the error only occurred once in weeks of
>use. This is on PostgreSQL 8.2.1 running on openSUSE Linux
>10.2. Slony-I 1.2.6 is being used for replication to a single
>slave database.
>
[snip]

This section is relevant:
http://www.postgresql.org/docs/8.2/interactive/transaction-iso.html

In the default isolation level "Read commited" you are protected against
"dirty reads".
You are not protected against "nonrepeatable reads" and "phantom reads".

In fact if you start a transaction now, others are not prevented from
inserting records. This can result in a situation where you did not find
the record, since someone else has just instead it after your
transaction was started.

This is not a race condition, but a side-effect.

- Joris