Thread: Duplicate key error

Duplicate key error

From
yudhi s
Date:
We have a merge query as below for a partition table which is range partitioned on a truncated date column 'part_date'. And the only unique key in this table is a composite primary key on (id, part_date). And this merge queries ON condition is based on one of the columns i.e ID which is the leading column of the PK.So it means it will ensure no duplication happens for ID values, but still we are seeing "duplicate key" error. So what is the possible reason here or are we encountering any buggy behaviour here?

WITH source_data (col1, col2, col3.....col29) AS (VALUES ($1, $2::date, $3::timestamptz, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27::timestamptz, $28, $29::timestamptz)) MERGE INTO cpod.TAB1 AS target USING source_data AS source ON target.ID = source.ID WHEN MATCHED THEN UPDATE SET ....) WHEN NOT MATCHED THEN INSERT (...) VALUES (....);

Error:
ERROR: duplicate key value violates unique constraint "TAB1_pkey"
Detail: Key (ID, part_date)=(XXXXXXXXX, 2024-11-04) already exists.


Regards
Yudhi

Re: Duplicate key error

From
Achilleas Mantzios - cloud
Date:
On 11/11/24 07:40, yudhi s wrote:

> We have a merge query as below for a partition table which is range 
> partitioned on a truncated date column 'part_date'. And the only 
> unique key in this table is a composite primary key on (id, 
> part_date). And this merge queries ON condition is based on one of the 
> columns i.e ID which is the leading column of the PK.So it means it 
> will ensure no duplication happens for ID values, but still we are 
> seeing "duplicate key" error. So what is the possible reason here or 
> are we encountering any buggy behaviour here?
>
> WITH source_data (col1, col2, col3.....col29) AS (VALUES ($1, 
> $2::date, $3::timestamptz, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, 
> $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, 
> $27::timestamptz, $28, $29::timestamptz)) MERGE INTO cpod.TAB1 AS 
> target USING source_data AS source ON target.ID = source.ID WHEN 
> MATCHED THEN UPDATE SET ....) WHEN NOT MATCHED THEN INSERT (...) 
> VALUES (....);
>
> Error:
> ERROR: duplicate key value violates unique constraint "TAB1_pkey"
> Detail: Key (ID, part_date)=(XXXXXXXXX, 2024-11-04) already exists.

You have to use the whole composite unique key (including part_date) 
when matching against source_data .

If you had uniqueness on "ID" and then added "part_name" to the key, 
you'd still had uniqueness.

BUT, reversely, if you have uniqueness on the pair (ID, part_date) there 
is no guaranteed uniqueness on ID alone, hence your ERROR.

>
> Regards
> Yudhi
>



Re: Duplicate key error

From
yudhi s
Date:


On Mon, Nov 11, 2024 at 1:57 PM Achilleas Mantzios - cloud <a.mantzios@cloud.gatewaynet.com> wrote:

> WITH source_data (col1, col2, col3.....col29) AS (VALUES ($1,
> $2::date, $3::timestamptz, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13,
> $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26,
> $27::timestamptz, $28, $29::timestamptz)) MERGE INTO cpod.TAB1 AS
> target USING source_data AS source ON target.ID = source.ID WHEN
> MATCHED THEN UPDATE SET ....) WHEN NOT MATCHED THEN INSERT (...)
> VALUES (....);
>
> Error:
> ERROR: duplicate key value violates unique constraint "TAB1_pkey"
> Detail: Key (ID, part_date)=(XXXXXXXXX, 2024-11-04) already exists.

You have to use the whole composite unique key (including part_date)
when matching against source_data .

If you had uniqueness on "ID" and then added "part_name" to the key,
you'd still had uniqueness.

BUT, reversely, if you have uniqueness on the pair (ID, part_date) there
is no guaranteed uniqueness on ID alone, hence your ERROR.



Actually the table is partitioned on column part_date which is why the unique key is composite i.e. on ID and part_date. So in that case even we merge on one of the column i.e. ID which is unique in itself, if any duplicate value ID tries to be merged it should satisfy the MATCH criteria and thus it should do a UPDATE on table TAB1 rather than going for NOMATCH or INSERT. So in that case it should not error out with "duplicate key violation". Is my understanding correct here?

Re: Duplicate key error

From
Peter Geoghegan
Date:
On Mon, Nov 11, 2024 at 12:41 AM yudhi s <learnerdatabase99@gmail.com> wrote:
> So it means it will ensure no duplication happens for ID values, but still we are seeing "duplicate key" error. So
whatis the possible reason here or are we encountering any buggy behaviour here? 

MERGE doesn't actually make any promises about not getting unique
violations. Only ON CONFLICT DO UPDATE (and ON CONFLICT DO NOTHING)
make such a promise. That's the main reason why Postgres supports
both.

--
Peter Geoghegan



Re: Duplicate key error

From
yudhi s
Date:


On Tue, Nov 12, 2024 at 1:35 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Nov 11, 2024 at 12:41 AM yudhi s <learnerdatabase99@gmail.com> wrote:
> So it means it will ensure no duplication happens for ID values, but still we are seeing "duplicate key" error. So what is the possible reason here or are we encountering any buggy behaviour here?

MERGE doesn't actually make any promises about not getting unique
violations. Only ON CONFLICT DO UPDATE (and ON CONFLICT DO NOTHING)
make such a promise. That's the main reason why Postgres supports
both.


Okay. But here in this Merge statement it should first compare the ON clause which is the value of ID column and if its exists in the target table then its a MATCH which means it will do the UPDATE and if its not available in the target table then its a NOT MATCH and it will do the INSERT, so i am wondering at what exact situation it will throw duplicate key error. Also the WITH clause will only pick one record at a time and run the MERGE, so it will only merge one record at a time and then commit. Can you share your thoughts on how exactly this merge query can possibly cause the duplicate key error?

Re: Duplicate key error

From
Peter Geoghegan
Date:
On Tue, Nov 12, 2024 at 3:38 PM yudhi s <learnerdatabase99@gmail.com> wrote:
> Can you share your thoughts on how exactly this merge query can possibly cause the duplicate key error?

MERGE doesn't take any special precautions to avoid such unique
violations. ON CONFLICT does. It really is that simple.

Your app will be prone to the same sorts of errors (at READ COMMITTED
isolation level) if you do inserts conditioned on the absence of an
existing/conflict row. You'd have to retry the statement to plug the
race condition, which is how Postgres users did upserts prior to the
introduction of ON CONFLICT DO UPDATE.

--
Peter Geoghegan