Thread: On duplicate ignore

On duplicate ignore

From: "Gnanakumar"
Hi,

Ours is a web-based application.  We're trying to implement ON DUPLICATE
IGNORE for one of our application tables, named EMAILLIST.  After a quick
Google search, I found the following "easy & convenient" single SQL
statement to use:

INSERT INTO EMAILLIST (EMAIL)
       SELECT 'john@example.net'
       WHERE NOT EXISTS (SELECT 1 FROM EMAILLIST WHERE EMAIL = 'john@example.net');

My question is: with single-threaded INSERTs, this will *definitely* work.
Since ours is a web-based application, will it also work in a concurrent,
multi-threaded environment?  In other words, will EMAIL remain truly unique
when INSERT calls are concurrent?

Regards,
Gnanam



Re: On duplicate ignore

From: Atul Goel
Just create a unique index on the EMAIL column and handle the error if it comes
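
Something along these lines (the index name here is just an example):

CREATE UNIQUE INDEX emaillist_email_key ON EMAILLIST (EMAIL);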

Thanks,
Atul Goel


Re: On duplicate ignore

From: "Gnanakumar"
> Just create a unique index on the EMAIL column and handle the error if it comes

Thanks for your suggestion.  Of course, I do understand that this could be
enforced/imposed at the database-level at any time.  But I'm trying to find
out whether this could be solved at the application layer itself.  Any
thoughts/ideas?



Re: On duplicate ignore

From: Dmitriy Igrishin
Hey Gnanakumar,

2012/1/18 Gnanakumar <gnanam@zoniac.com>
>> Just create a unique index on the EMAIL column and handle the error if it comes
>
> Thanks for your suggestion.  Of course, I do understand that this could be
> enforced/imposed at the database-level at any time.  But I'm trying to find
> out whether this could be solved at the application layer itself.  Any
> thoughts/ideas?

Exactly: at the application level you just need to ignore
the unique constraint violation error reported by the backend.
You may also wrap the INSERT statement in a PL/pgSQL
function or in a DO statement and catch the exception
generated by the backend.
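
For example, a minimal sketch, assuming the unique index on EMAIL is
already in place:

DO $$
BEGIN
    INSERT INTO EMAILLIST (EMAIL) VALUES ('john@example.net');
EXCEPTION
    WHEN unique_violation THEN
        NULL;  -- it's a duplicate: ignore it
END;
$$;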

// Dmitriy.


Re: On duplicate ignore

From: Atul Goel
There is no way you can ensure this at the application level.  Think of an
uncommitted row and another session inserting the same value at the same
moment in time.


Thanks,
Atul Goel



Re: On duplicate ignore

From: Florian Weimer
* Gnanakumar:

>> Just create a unique index on the EMAIL column and handle the error if it comes
>
> Thanks for your suggestion.  Of course, I do understand that this could be
> enforced/imposed at the database-level at any time.  But I'm trying to find
> out whether this could be solved at the application layer itself.  Any
> thoughts/ideas?

If you use serializable transactions in PostgreSQL 9.1, you can
implement such constraints in the application without additional
locking.  However, with concurrent writes and without an index, the rate
of detected serialization violations and resulting transaction aborts
will be high.
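
A sketch of what I mean; the client has to watch for SQLSTATE 40001
(serialization_failure) and retry the whole transaction:

BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO EMAILLIST (EMAIL)
       SELECT 'john@example.net'
       WHERE NOT EXISTS (SELECT 1 FROM EMAILLIST WHERE EMAIL = 'john@example.net');
COMMIT;
-- on SQLSTATE 40001, roll back and re-run the transaction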

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

Re: On duplicate ignore

From: Scott Marlowe
On Thu, Jan 19, 2012 at 7:54 AM, Florian Weimer <fweimer@bfk.de> wrote:
> * Gnanakumar:
>
>>> Just create a unique index on the EMAIL column and handle the error if it comes
>>
>> Thanks for your suggestion.  Of course, I do understand that this could be
>> enforced/imposed at the database-level at any time.  But I'm trying to find
>> out whether this could be solved at the application layer itself.  Any
>> thoughts/ideas?
>
> If you use serializable transactions in PostgreSQL 9.1, you can
> implement such constraints in the application without additional
> locking.  However, with concurrent writes and without an index, the rate
> of detected serialization violations and resulting transaction aborts
> will be high.

No, you sadly can't.  PostgreSQL doesn't yet support proper predicate
locking to allow the application to be sure that the OP's original
statement, and ones like it, don't have a race condition.  A unique
index is the only way to be sure.

Re: On duplicate ignore

From: Scott Marlowe
On Thu, Jan 19, 2012 at 9:49 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Jan 19, 2012 at 7:54 AM, Florian Weimer <fweimer@bfk.de> wrote:
>> * Gnanakumar:
>>
>>>> Just create a unique index on the EMAIL column and handle the error if it comes
>>>
>>> Thanks for your suggestion.  Of course, I do understand that this could be
>>> enforced/imposed at the database-level at any time.  But I'm trying to find
>>> out whether this could be solved at the application layer itself.  Any
>>> thoughts/ideas?
>>
>> If you use serializable transactions in PostgreSQL 9.1, you can
>> implement such constraints in the application without additional
>> locking.  However, with concurrent writes and without an index, the rate
>> of detected serialization violations and resulting transaction aborts
>> will be high.
>
> No, you sadly can't.  PostgreSQL doesn't yet support proper predicate
> locking to allow the application to be sure that the OP's original
> statement, and ones like it, don't have a race condition.  A unique
> index is the only way to be sure.

Wait, did 9.1 implement proper predicate locking to allow this?  If so
I apologize for being out of the loop on the new versions.

Re: On duplicate ignore

From: Florian Weimer
* Scott Marlowe:

> On Thu, Jan 19, 2012 at 7:54 AM, Florian Weimer <fweimer@bfk.de> wrote:
>> * Gnanakumar:
>>
>>>> Just create a unique index on the EMAIL column and handle the error if it comes
>>>
>>> Thanks for your suggestion.  Of course, I do understand that this could be
>>> enforced/imposed at the database-level at any time.  But I'm trying to find
>>> out whether this could be solved at the application layer itself.  Any
>>> thoughts/ideas?
>>
>> If you use serializable transactions in PostgreSQL 9.1, you can
>> implement such constraints in the application without additional
>> locking.  However, with concurrent writes and without an index, the rate
>> of detected serialization violations and resulting transaction aborts
>> will be high.
>
> No, you sadly can't.  PostgreSQL doesn't yet support proper predicate
> locking to allow the application to be sure that the OP's original
> statement, and ones like it, don't have a race condition.  A unique
> index is the only way to be sure.

Huh?  This was one of the major new features in PostgreSQL 9.1.

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

Re: On duplicate ignore

From: Lincoln Yeoh
At 10:54 PM 1/19/2012, Florian Weimer wrote:
>* Gnanakumar:
>
> >> Just create a unique index on the EMAIL column and handle the error if it comes
> >
> > Thanks for your suggestion.  Of course, I do understand that this could be
> > enforced/imposed at the database-level at any time.  But I'm trying to find
> > out whether this could be solved at the application layer itself.  Any
> > thoughts/ideas?
>
>If you use serializable transactions in PostgreSQL 9.1, you can
>implement such constraints in the application without additional
>locking.  However, with concurrent writes and without an index, the rate
>of detected serialization violations and resulting transaction aborts
>will be high.

Would writing application-side code to handle those transaction
aborts in 9.1 be much easier than writing code to handle transaction
aborts/DB exceptions due to unique constraint violations? These
transaction aborts have to be handled differently (e.g. retried for X
seconds/Y tries) from other sorts of transaction aborts (not retried).

Otherwise I don't see the benefit of this feature for this scenario.
Unless of course you get significantly better performance by not
having a unique constraint.

If insert performance is not an issue and code simplicity is
preferred, one could lock the table (with an exclusive lock mode),
then do the selects and inserts, that way your code can assume that
any transaction aborts are due to actual problems rather than
concurrency. Which often means less code to write :).
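
E.g. something like this (a sketch; EXCLUSIVE mode still lets other
sessions run plain SELECTs, but blocks their INSERTs until we commit):

BEGIN;
LOCK TABLE EMAILLIST IN EXCLUSIVE MODE;
INSERT INTO EMAILLIST (EMAIL)
       SELECT 'john@example.net'
       WHERE NOT EXISTS (SELECT 1 FROM EMAILLIST WHERE EMAIL = 'john@example.net');
COMMIT;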

Regards,
Link.





Re: On duplicate ignore

From: Florian Weimer
* Lincoln Yeoh:

>>If you use serializable transactions in PostgreSQL 9.1, you can
>>implement such constraints in the application without additional
>>locking.  However, with concurrent writes and without an index, the rate
>>of detected serialization violations and resulting transaction aborts
>>will be high.
>
> Would writing application-side code to handle those transaction aborts
> in 9.1 be much easier than writing code to handle transaction
> aborts/DB exceptions due to unique constraint violations? These
> transaction aborts have to be handled differently (e.g. retried for X
> seconds/Y tries) from other sorts of transaction aborts (not retried).

There's a separate error code, so it's easier to deal with in theory.
However, I don't think that's sufficient justification for removing the
unique constraints.

> Otherwise I don't see the benefit of this feature for this
> scenario. Unless of course you get significantly better performance by
> not having a unique constraint.

Performance is worse.

> If insert performance is not an issue and code simplicity is
> preferred, one could lock the table (with an exclusive lock mode),
> then do the selects and inserts, that way your code can assume that
> any transaction aborts are due to actual problems rather than
> concurrency. Which often means less code to write :).

Choosing the right lock is a bit tricky because you usually want to
block INSERTs only.  Explicit locks on a hash of the unique column,
using pg_advisory_xact_lock, are often an alternative.
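
For example, a sketch using hashtext() as the hash; the lock is taken
only by sessions inserting the same address and is released
automatically at commit:

BEGIN;
SELECT pg_advisory_xact_lock(hashtext('john@example.net'));
INSERT INTO EMAILLIST (EMAIL)
       SELECT 'john@example.net'
       WHERE NOT EXISTS (SELECT 1 FROM EMAILLIST WHERE EMAIL = 'john@example.net');
COMMIT;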

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

Re: On duplicate ignore

From: Lincoln Yeoh
At 04:27 PM 1/20/2012, Florian Weimer wrote:
>* Lincoln Yeoh:
>
> >>If you use serializable transactions in PostgreSQL 9.1, you can
> >>implement such constraints in the application without additional
> >>locking.  However, with concurrent writes and without an index, the rate
> >>of detected serialization violations and resulting transaction aborts
> >>will be high.
> >
> > Would writing application-side code to handle those transaction aborts
> > in 9.1 be much easier than writing code to handle transaction
> > aborts/DB exceptions due to unique constraint violations? These
> > transaction aborts have to be handled differently (e.g. retried for X
> > seconds/Y tries) from other sorts of transaction aborts (not retried).
>
>There's a separate error code, so it's easier to deal with in theory.

Is there a simple way to get postgresql to retry a transaction, or
does the application have to actually reissue all the necessary
statements again?

I'd personally prefer to use locking and selects to avoid transaction
aborts whether due to unique constraint violations or due to
serialization violations.

But I'm lazy ;).

Regards,
Link.


Re: On duplicate ignore

From: Florian Weimer
* Lincoln Yeoh:

> Is there a simple way to get postgresql to retry a transaction, or
> does the application have to actually reissue all the necessary
> statements again?

The application has to re-run the transaction, which might result in the
execution of different statements.  In the INSERT-or-UPDATE case, the
new attempt will have to use an UPDATE instead of an INSERT, so replaying
the statements verbatim will not work.
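
As an illustration, the classic PL/pgSQL INSERT-or-UPDATE loop (a sketch
against a hypothetical EMAIL_STATS table, using the unique-constraint
variant rather than serializable retry) has to switch between the two
statements:

CREATE FUNCTION upsert_hit(e TEXT) RETURNS VOID AS $$
BEGIN
    LOOP
        -- first try the UPDATE; done if the row already exists
        UPDATE EMAIL_STATS SET HITS = HITS + 1 WHERE EMAIL = e;
        IF found THEN
            RETURN;
        END IF;
        -- not there yet: try to INSERT; a concurrent session may get
        -- there first, in which case we loop back and UPDATE instead
        BEGIN
            INSERT INTO EMAIL_STATS (EMAIL, HITS) VALUES (e, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- fall through and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;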

> I'd personally prefer to use locking and selects to avoid transaction
> aborts whether due to unique constraint violations or due to
> serialization violations.

Once you address the restart issue, transactional code is simpler and
easier to check for correctness.

Restarting transactions has other benefits, too.  For instance, you can
restart your PostgreSQL server process, and your applications will just
keep running.

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99