Re: Duplicate key violation on upsert - Mailing list pgsql-general

From Matt Magoffin
Subject Re: Duplicate key violation on upsert
Date
Msg-id EB9192B4-9EAC-4376-BC8B-5D9CEF56F5FE@msqr.us
Whole thread Raw
In response to Re: Duplicate key violation on upsert  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: Duplicate key violation on upsert
List pgsql-general

On 22/03/2020, at 8:11 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

I was thinking more about this:
"INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted, jdata_i, jdata_a, jdata_s, jdata_t)
    VALUES (…) ..."
from your OP. Namely whether it was:
VALUES (), (), (), ...
and if so there were values in the (),(),() that duplicated each other.
As to the second part of your response, ON CONFLICT does one of either INSERT or UPDATE. If:
1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then the INSERT proceeds.
2) If there is a conflict then an UPDATE occurs using the SET values.
Now just me working through this:
da_datum_pkey       = (node_id, ts, source_id)
da_datum_x_acc_idx  = (node_id, source_id, ts DESC, jdata_a)
If 1) from above applies then da_datum_x_acc_idx will not be tripped as the only way that could happen is if the node_id, ts, source_id was the same as an existing row and that can't be true because the PK over the same values passed.

Well the below is complete rot. If you are UPDATEing then you are not creating a duplicate row, just overwriting a value with itself.

If 2) from above happened then you are trying to UPDATE a row with matching PK values(node_id, ts, source_id). Now it is entirely possible that since you are not testing for constraint violation on (node_id, source_id, ts DESC, jdata_a) that you be doing SET jdata_a = EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx

Sorry for the vagueness in my OP, I was trying to make it easier to read. The VALUES are for individual single column values, so a single possible row to insert/update.

So what you’ve outlined is basically what I thought should be happening. Namely, there can be only one row that will be inserted/updated. I am wondering if I should re-create the da_datum_x_acc_idx index without UNIQUE? I had it as UNIQUE to optimise the type of queries that make use of that index… but I did a little bit of testing using a non-UNIQUE index and those queries appear to execute around the same time as with the UNIQUE index. I just wasn’t sure if that would just be masking some other problem in my setup.

— m@

pgsql-general by date:

Previous
From: Alastair McKinley
Date:
Subject: Re: Explain says 8 workers planned, only 1 executed
Next
From: Adrian Klaver
Date:
Subject: Re: Could postgres12 support millions of sequences? (like 10 million)