Re: Duplicate key violation on upsert - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Duplicate key violation on upsert
Date
Msg-id a7aecaad-5bb5-3daf-95e8-2c815039f66c@aklaver.com
Whole thread Raw
In response to Re: Duplicate key violation on upsert  (Matt Magoffin <postgresql.org@msqr.us>)
Responses Re: Duplicate key violation on upsert
List pgsql-general
On 3/20/20 1:32 PM, Matt Magoffin wrote:
> 
>> On 21/03/2020, at 8:10 AM, Adrian Klaver <adrian.klaver@aklaver.com 
>> <mailto:adrian.klaver@aklaver.com>> wrote:
>>
>>> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same 
>>> definition as the da_datum_x_acc_idx above (it is defined on a child 
>>> table). That is, they are both essentially:
>>> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS 
>>> NOT NULL
>>> The da_datum_pkey index is what the ON CONFLICT cause refers to, so 
>>> (node_id, ts, source_id) is UNIQUE as well.
>>
>> Hmm, wonder if you are getting bit by this?:
>>
>> https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT
>>
>> "INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic” 
>> statement. This means that the command will not be allowed to affect 
>> any single existing row more than once; a cardinality violation error 
>> will be raised when this situation arises. Rows proposed for insertion 
>> should not duplicate each other in terms of attributes constrained by 
>> an arbiter index or constraint.”
> 
> I’m not sure I’m wrapping my head around this. The INSERT affects 1 row 
> as the unique values (node_id, ts, source_id) are specified in the 
> statement. Is it possible that da_datum_x_acc_idx is used as the arbiter 
> index in this situation, rather than da_datum_pkey (that I intended), 
> and you’re saying that the jdata_a column is getting updated twice, 
> first in the INSERT and second in the DO UPDATE, triggering the 
> duplicate key violation?

I was thinking more about this:

"INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, 
jdata_i, jdata_a, jdata_s, jdata_t)
    VALUES (…) ..."

from your OP. Namely whether it was:

VALUES (), (), (), ...

and if so there were values in the (),(),() that duplicated each other.

As to the second part of your response, ON CONFLICT does one of either 
INSERT or UPDATE. If:

1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then 
the INSERT proceeds.

2) If there is a conflict then an UPDATE occurs using the SET values.

Now just me working through this:

da_datum_pkey       = (node_id, ts, source_id)
da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)

If 1) from above applies then da_datum_x_acc_idx will not be tripped as 
the only way that could happen is if the node_id, ts, source_id was the 
same as an existing row and that can't be true because the PK over the 
same values passed.

If 2) from above happened then you are trying to UPDATE a row with 
matching PK values(node_id, ts, source_id). Now it is entirely possible 
that since you are not testing for constraint violation on (node_id, 
source_id, ts DESC, jdata_a) that you be doing SET jdata_a = 
EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx

> 
> — m@
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



pgsql-general by date:

Previous
From: Alastair McKinley
Date:
Subject: Re: Explain says 8 workers planned, only 1 executed
Next
From: pabloa98
Date:
Subject: Re: Could postgres12 support millions of sequences? (like 10 million)