Thread: duplicate key violates unique constraint

duplicate key violates unique constraint

From
"Spiegelberg, Greg"
Date:
List,
 
We're using 8.2.1 in RedHat ES 4 here and I have a simple two column table: data_keys1.
 
                         Table "public.data_keys1"
 Column |  Type  |                        Modifiers
--------+--------+---------------------------------------------------------
 id     | bigint | not null default nextval('data_keys1_id_seq'::regclass)
 key1   | text   |
Indexes:
    "data_keys1_pkey" PRIMARY KEY, btree (id)
I also have a C program using libpq that populates this table via a PREPARE'd statement within a transaction.  Recently, while this table was being loaded, it encountered an error I hadn't seen before and is a bit confusing.
 
2007-05-09 09:27:07 EDT [22853] : LOG:  statement: EXECUTE insertKey1('vgdisplay');
2007-05-09 09:27:07 EDT [22853] : DETAIL:  prepare: PREPARE insertKey1(text) AS INSERT INTO public.data_keys1 (key1) values ($1);
2007-05-09 09:27:07 EDT [22853] : ERROR:  duplicate key violates unique constraint "data_keys1_pkey"
2007-05-09 09:27:07 EDT [22853] : STATEMENT:  EXECUTE insertKey1('vgdisplay');
 
As you can see, it's only providing the key1 column and the sequence is providing the value for the column with the constraint.  How can this be happening?  There were no other transactions, commits, individual inserts happenning at the same time or within several seconds of this one.
 
I'd like to explore sequences a bit more b/c as they are implemented in PostgreSQL is a little confusing.  Normally, I'd start another thread but it may have some bearing here.
 
 
Sequences...
 
Now, I don't know if this just hasn't been tested or is a documented feature (a.k.a. bug) but something does not seem right here.  In this test case I'm able to get the same sequence ID's via two psql connections to the same database on the same sequence.
 
Connect to "db" in two different psql sessions (I'll prefix them below with 1: and 2:) and in one create the table
 
1: db=# CREATE TABLE t1 ( s serial, i int);
1: db=# \d t1
                         Table "public.t1"
 Column |  Type   |                   Modifiers
--------+---------+------------------------------------------------
 s      | integer | not null default nextval('t1_s_seq'::regclass)
 i      | integer |

1: db=# SELECT * FROM t1_s_seq;
 sequence_name | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
---------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 t1_s_seq      |         12 |            1 | 9223372036854775807 |         1 |           1 |      32 | f         | t
(1 row)
 
So, cache on t1_s_seq is set to 1.  Not sure why the max_value is so high when the column was specified as a 'serial' not a 'serial8' but perhaps I should check the code out and submit a patch for that.  I digress.
 
Check out the value for t1_s_seq on connection #1.
 
1: db=# select nextval('t1_s_seq');
 nextval
---------
       1
(1 row)
 
And check the value for t1_s_seq on connection #2.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
So far, so good.  Now start a transaction on connection #1, advance t1_s_seq by 10 but don't commit;
 
1: db=# begin;
BEGIN
1: db=# select setval('t1_s_seq', currval('t1_s_seq')+10);
 setval
--------
     11
(1 row)
 
And check the current value for t1_s_seq on connection #2.
 
2: db=# select currval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
 
That's expected since the transaction on connection #1 hasn't been commited.  Now commit the transaction on connection #1 and check it's current value.
 
1: db=# commit;
COMMIT
1: db=# select currval('t1_s_seq');
 currval
---------
      11
(1 row)
 
Again, expected.  Now let's check the current value on connection #2 again.
 
2: db=# select currval('t1_s_seq');
 currval
---------
       2
(1 row)
 
This is where I take issue with the output.  I'm not sure what benefit cache value has as the client should, in this case, consulted with the backend as too the value of the sequence.
 
Sequences are suppose to be unique but in this case it seems that may not always be the case.  Sequences have some kind of odd relationship (no pun intended) with transactions in that they are in some cases in sync regardless of the connection or query and in other situations, such as above, are out of sync.
 
Just for fun, select on connection #2 the nextval of the sequence.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
      12
(1 row)
 
*boogle*
 
What have I done wrong here?  Does it have any bearing on my unique constraint error?  I wouldn't think so but I haven't found any other possible explanation.
 
TIA,
Greg
 
--
 Greg Spiegelberg
 Manager, Product Development
 ISOdx Solutions, a division of Cranel, Inc.
 614.318.4314, office
 614.431.8388, fax
 
 
 

Re: duplicate key violates unique constraint

From
Carol Walter
Date:
Greg,

Since Postgres allows you to insert a value into a field that is designated as a sequence number, is it possible that one was entered that the sequence tried to recreate?

And this is more than a question than an answer, but I thought that sequences had to designated as data type "serial."

Carol Walter

On May 10, 2007, at 9:38 AM, Spiegelberg, Greg wrote:

List,
 
We're using 8.2.1 in RedHat ES 4 here and I have a simple two column table: data_keys1.
 
                         Table "public.data_keys1"
 Column |  Type  |                        Modifiers
--------+--------+---------------------------------------------------------
 id     | bigint | not null default nextval('data_keys1_id_seq'::regclass)
 key1   | text   |
Indexes:
    "data_keys1_pkey" PRIMARY KEY, btree (id)
I also have a C program using libpq that populates this table via a PREPARE'd statement within a transaction.  Recently, while this table was being loaded, it encountered an error I hadn't seen before and is a bit confusing.
 
2007-05-09 09:27:07 EDT [22853] : LOG:  statement: EXECUTE insertKey1('vgdisplay');
2007-05-09 09:27:07 EDT [22853] : DETAIL:  prepare: PREPARE insertKey1(text) AS INSERT INTO public.data_keys1 (key1) values ($1);
2007-05-09 09:27:07 EDT [22853] : ERROR:  duplicate key violates unique constraint "data_keys1_pkey"
2007-05-09 09:27:07 EDT [22853] : STATEMENT:  EXECUTE insertKey1('vgdisplay');
 
As you can see, it's only providing the key1 column and the sequence is providing the value for the column with the constraint.  How can this be happening?  There were no other transactions, commits, individual inserts happenning at the same time or within several seconds of this one.
 
I'd like to explore sequences a bit more b/c as they are implemented in PostgreSQL is a little confusing.  Normally, I'd start another thread but it may have some bearing here.
 
 
Sequences...
 
Now, I don't know if this just hasn't been tested or is a documented feature (a.k.a. bug) but something does not seem right here.  In this test case I'm able to get the same sequence ID's via two psql connections to the same database on the same sequence.
 
Connect to "db" in two different psql sessions (I'll prefix them below with 1: and 2:) and in one create the table
 
1: db=# CREATE TABLE t1 ( s serial, i int);
1: db=# \d t1
                         Table "public.t1"
 Column |  Type   |                   Modifiers
--------+---------+------------------------------------------------
 s      | integer | not null default nextval('t1_s_seq'::regclass)
 i      | integer |

1: db=# SELECT * FROM t1_s_seq;
 sequence_name | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
---------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 t1_s_seq      |         12 |            1 | 9223372036854775807 |         1 |           1 |      32 | f         | t
(1 row)
 
So, cache on t1_s_seq is set to 1.  Not sure why the max_value is so high when the column was specified as a 'serial' not a 'serial8' but perhaps I should check the code out and submit a patch for that.  I digress.
 
Check out the value for t1_s_seq on connection #1.
 
1: db=# select nextval('t1_s_seq');
 nextval
---------
       1
(1 row)
 
And check the value for t1_s_seq on connection #2.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
So far, so good.  Now start a transaction on connection #1, advance t1_s_seq by 10 but don't commit;
 
1: db=# begin;
BEGIN
1: db=# select setval('t1_s_seq', currval('t1_s_seq')+10);
 setval
--------
     11
(1 row)
 
And check the current value for t1_s_seq on connection #2.
 
2: db=# select currval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
 
That's expected since the transaction on connection #1 hasn't been commited.  Now commit the transaction on connection #1 and check it's current value.
 
1: db=# commit;
COMMIT
1: db=# select currval('t1_s_seq');
 currval
---------
      11
(1 row)
 
Again, expected.  Now let's check the current value on connection #2 again.
 
2: db=# select currval('t1_s_seq');
 currval
---------
       2
(1 row)
 
This is where I take issue with the output.  I'm not sure what benefit cache value has as the client should, in this case, consulted with the backend as too the value of the sequence.
 
Sequences are suppose to be unique but in this case it seems that may not always be the case.  Sequences have some kind of odd relationship (no pun intended) with transactions in that they are in some cases in sync regardless of the connection or query and in other situations, such as above, are out of sync.
 
Just for fun, select on connection #2 the nextval of the sequence.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
      12
(1 row)
 
*boogle*
 
What have I done wrong here?  Does it have any bearing on my unique constraint error?  I wouldn't think so but I haven't found any other possible explanation.
 
TIA,
Greg
 
--
 Greg Spiegelberg
 Manager, Product Development
 ISOdx Solutions, a division of Cranel, Inc.
 614.318.4314, office
 614.431.8388, fax
 
 
 

Re: duplicate key violates unique constraint

From
Tom Lane
Date:
"Spiegelberg, Greg" <gspiegelberg@isodxsolutions.com> writes:
> As you can see, it's only providing the key1 column and the sequence is
> providing the value for the column with the constraint.  How can this be
> happening?

Perhaps at some point you manually inserted an id value past the
then-current sequence value?

> Now, I don't know if this just hasn't been tested or is a documented
> feature (a.k.a. bug) but something does not seem right here.  In this
> test case I'm able to get the same sequence ID's via two psql
> connections to the same database on the same sequence.

No, you've misunderstood currval().  That gives the value most recently
obtained by nextval() within your own session; it is not affected by any
subsequent manipulation of the sequence.

            regards, tom lane

Re: duplicate key violates unique constraint

From
"Mikko Partio"
Date:


On 5/10/07, Spiegelberg, Greg <gspiegelberg@isodxsolutions.com> wrote:
List,

 
Check out the value for t1_s_seq on connection #1.
 
1: db=# select nextval('t1_s_seq');
 nextval
---------
       1
(1 row)
 
And check the value for t1_s_seq on connection #2.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
So far, so good.  Now start a transaction on connection #1, advance t1_s_seq by 10 but don't commit;
 
1: db=# begin;
BEGIN
1: db=# select setval('t1_s_seq', currval('t1_s_seq')+10);
 setval
--------
     11
(1 row)
 
And check the current value for t1_s_seq on connection #2.
 
2: db=# select currval('t1_s_seq');
 nextval
---------
       2
(1 row)
 
 
That's expected since the transaction on connection #1 hasn't been commited.  Now commit the transaction on connection #1 and check it's current value.
 
1: db=# commit;
COMMIT
1: db=# select currval('t1_s_seq');
 currval
---------
      11
(1 row)
 
Again, expected.  Now let's check the current value on connection #2 again.
 
2: db=# select currval('t1_s_seq');
 currval
---------
       2
(1 row)
 
This is where I take issue with the output.  I'm not sure what benefit cache value has as the client should, in this case, consulted with the backend as too the value of the sequence.
 
Sequences are suppose to be unique but in this case it seems that may not always be the case.  Sequences have some kind of odd relationship (no pun intended) with transactions in that they are in some cases in sync regardless of the connection or query and in other situations, such as above, are out of sync.
 
Just for fun, select on connection #2 the nextval of the sequence.
 
2: db=# select nextval('t1_s_seq');
 nextval
---------
      12
(1 row)
 
*boogle*
 
What have I done wrong here?  Does it have any bearing on my unique constraint error?  I wouldn't think so but I haven't found any other possible explanation.
 
TIA,
Greg
 


The documentation says:

currval

Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Notice that because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did.


So currval is session specific. So I'd say this is expected behaviour.

Regards

MP

Re: duplicate key violates unique constraint

From
Jim Nasby
Date:
On May 10, 2007, at 9:01 AM, Carol Walter wrote:
> And this is more than a question than an answer, but I thought that
> sequences had to designated as data type "serial."

Nope... sequences are actually completely un-related to tables. You
can define a sequence that's not referenced by any table. The serial
data type is essentially a 'macro' that:

Creates a sequence
Sets the default value for the field to be the nextval() of that
sequence
Sets the field to be NOT NULL
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: duplicate key violates unique constraint

From
Carol Walter
Date:
Yes, I've seen "serial" used in other places than the primary key,
but I was referring to Greg's output.  The data type for the sequence
is listed as "bigint".  My thought was that the data type was listed
as "serial" for sequences.  I just looked at my database and I was
wrong about that.  It does list the data type as an integer when you
"describe" the table.

Carol

On May 10, 2007, at 5:17 PM, Jim Nasby wrote:

> On May 10, 2007, at 9:01 AM, Carol Walter wrote:
>> And this is more than a question than an answer, but I thought
>> that sequences had to designated as data type "serial."
>
> Nope... sequences are actually completely un-related to tables. You
> can define a sequence that's not referenced by any table. The
> serial data type is essentially a 'macro' that:
>
> Creates a sequence
> Sets the default value for the field to be the nextval() of that
> sequence
> Sets the field to be NOT NULL
> --
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
>
>


Re: duplicate key violates unique constraint

From
"Spiegelberg, Greg"
Date:
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> "Spiegelberg, Greg" <gspiegelberg@isodxsolutions.com> writes:
> > As you can see, it's only providing the key1 column and the
> sequence is
> > providing the value for the column with the constraint.
> How can this be
> > happening?
>
> Perhaps at some point you manually inserted an id value past the
> then-current sequence value?

I have to assume that though I don't see it anywhere in the logs and the
logs are turned up all the way.  The database is continually being
dropped and recreated automatically.  Why this doesn't happen every time
the same data is imported or more often is leaving me scratching my
head.


> No, you've misunderstood currval().  That gives the value
> most recently
> obtained by nextval() within your own session; it is not
> affected by any
> subsequent manipulation of the sequence.

Thanks.  That does explain it.  I had incorrectly assumed currval()
always returned what the real current value of the sequence.