Re: Duplicate Key Values - Mailing list pgsql-general
From | mark bradley |
---|---|
Subject | Re: Duplicate Key Values |
Date | |
Msg-id | SJ2PR22MB4328EAEBDB0C930CC0655149BAD52@SJ2PR22MB4328.namprd22.prod.outlook.com Whole thread Raw |
In response to | Re: Duplicate Key Values (mark bradley <markbradyju@outlook.com>) |
List | pgsql-general |
Here are some of the references from Copilot
From: mark bradley <markbradyju@outlook.com>
Sent: Friday, March 7, 2025 9:34 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Friday, March 7, 2025 9:34 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
This is what MS Copilot has to say about this apparent bug where Postgres inserts extra rows violating a primary keys uniqueness constraint:
Yes, this issue has been encountered by others. There are a few potential reasons why this might happen:
- Sequence Out of Sync: Sometimes, the sequence that generates unique values for the primary key can become out of sync, especially after a bulk import or a database restore. You can check if the sequence is out of sync and reset it if necessary.
- Index Corruption: Index corruption can occur due to various reasons, such as hardware failures or bugs in earlier versions of PostgreSQL. This can lead to duplicate primary keys being inserted.
- Table Inheritance: If you are using table inheritance, primary keys are not enforced among inherited tables. This can lead to duplicates if not handled correctly.
- Application Logic: Sometimes, the application logic might inadvertently insert duplicate records. Reviewing the application code and insert statements can help identify and resolve such issues.
To resolve the issue, you can:
- Check and reset the sequence if it's out of sync.
- Rebuild the index if it's corrupted.
Any of the first 3 could be involved. There isn't an application involved other than pgAdmin.
- Originally, the key in the node table was a sequence, but I changed it to a non-sequence.
- There is no index on the primary key node_id, and I understand there should be one.
- I didn't explicitly use Postgres inheritance but there are two tables that are subclasses of node. There are dataset nodes and processing_node [s] tables. Each is a type of node and have primary keys that are foreign keys from the node table. This key is node_id.
What to do? I hesitate to just delete my tables and start over because this error will reoccur.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/6/25 12:06, mark bradley wrote:
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
pgsql-general by date: