Thread: Duplicate Key Values
Although I did not enter them, somehow duplicate primary key values have appeared in one of my tables. I can't remove the offending rows because the key is a foreign key in another table.
What to do?
Mark Brady,
On Wed, Mar 5, 2025 at 12:36 PM mark bradley <markbradyju@outlook.com> wrote:
Although I did not enter them, somehow duplicate primary key values have appeared in one of my tables.
Is it a text/varchar column? Has the distro been upgraded "recently", or maybe streamed from an older Linux system to a newer Linux system?
I can't remove the offending rows because the key is a foreign key in another table.What to do?
Are the records completely (i.e., all columns) identical?
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
What does "and unique identifier" mean?
(Creating a Primary Key constraint automatically creates a unique no-nulls-allowed index on the column(s), so no other index is needed.)
Can you show the table definition?
Anyway, you should be able to delete one of the duplicate node_id records by specifying another column with a different value;
On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com> wrote:
Hi Ron,The key is an integer. I'm using pGAdmin4 and recently updated to the latest version.The records are not all identical, some have NULL values of the non-key for some unknown reason. Here is a screenshot. The key is node_id, which I have specified as a key and unique identifier.Mark Brady,From: Ron Johnson <ronljohnsonjr@gmail.com>
Sent: Wednesday, March 5, 2025 12:50 PM
To: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Duplicate Key ValuesOn Wed, Mar 5, 2025 at 12:36 PM mark bradley <markbradyju@outlook.com> wrote:Although I did not enter them, somehow duplicate primary key values have appeared in one of my tables.Is it a text/varchar column? Has the distro been upgraded "recently", or maybe streamed from an older Linux system to a newer Linux system?I can't remove the offending rows because the key is a foreign key in another table.What to do?Are the records completely (i.e., all columns) identical?--Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.<Redacted> lobster!
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
Attachment
On 3/5/25 15:37, Ron Johnson wrote: > What does "and unique identifier" mean? > > (Creating a Primary Key constraint automatically creates a unique > no-nulls-allowed index on the column(s), so no other index is needed.) > > Can you show the table definition? > > Anyway, you /should/ be able to delete one of the duplicate node_id > records by specifying another column with a different value; > > On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com > <mailto:markbradyju@outlook.com>> wrote: > > Hi Ron, > > The key is an integer. I'm using pGAdmin4 and recently updated to > the latest version. 1) Did you not see duplicates with the old version of pgAdmin4? 2) What do you see if you use psql? 3) Did you upgrade/move the Postgres server or the underlying OS? 4) Have you tried reindexing the node_id field? > > The records are not all identical, some have NULL values of the > non-key for some unknown reason. Here is a screenshot. The key is > node_id, which I have specified as a key and unique identifier. > > > -- Adrian Klaver adrian.klaver@aklaver.com
Here is the table definition:

And here is the error message I get when I try to delete a duplicate:

Mark Brady,
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, March 5, 2025 6:49 PM
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Wednesday, March 5, 2025 6:49 PM
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/5/25 15:37, Ron Johnson wrote:
> What does "and unique identifier" mean?
>
> (Creating a Primary Key constraint automatically creates a unique
> no-nulls-allowed index on the column(s), so no other index is needed.)
>
> Can you show the table definition?
>
> Anyway, you /should/ be able to delete one of the duplicate node_id
> records by specifying another column with a different value;
>
> On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com
> <mailto:markbradyju@outlook.com>> wrote:
>
> Hi Ron,
>
> The key is an integer. I'm using pGAdmin4 and recently updated to
> the latest version.
1) Did you not see duplicates with the old version of pgAdmin4?
2) What do you see if you use psql?
3) Did you upgrade/move the Postgres server or the underlying OS?
4) Have you tried reindexing the node_id field?
>
> The records are not all identical, some have NULL values of the
> non-key for some unknown reason. Here is a screenshot. The key is
> node_id, which I have specified as a key and unique identifier.
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
> What does "and unique identifier" mean?
>
> (Creating a Primary Key constraint automatically creates a unique
> no-nulls-allowed index on the column(s), so no other index is needed.)
>
> Can you show the table definition?
>
> Anyway, you /should/ be able to delete one of the duplicate node_id
> records by specifying another column with a different value;
>
> On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com
> <mailto:markbradyju@outlook.com>> wrote:
>
> Hi Ron,
>
> The key is an integer. I'm using pGAdmin4 and recently updated to
> the latest version.
1) Did you not see duplicates with the old version of pgAdmin4?
2) What do you see if you use psql?
3) Did you upgrade/move the Postgres server or the underlying OS?
4) Have you tried reindexing the node_id field?
>
> The records are not all identical, some have NULL values of the
> non-key for some unknown reason. Here is a screenshot. The key is
> node_id, which I have specified as a key and unique identifier.
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Attachment
I tried adding a dummy column with different values and then deleting one, as you suggested. Got the same message.
From: mark bradley <markbradyju@outlook.com>
Sent: Thursday, March 6, 2025 1:11 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 6, 2025 1:11 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Here is the table definition:

And here is the error message I get when I try to delete a duplicate:

Mark Brady,
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, March 5, 2025 6:49 PM
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Wednesday, March 5, 2025 6:49 PM
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/5/25 15:37, Ron Johnson wrote:
> What does "and unique identifier" mean?
>
> (Creating a Primary Key constraint automatically creates a unique
> no-nulls-allowed index on the column(s), so no other index is needed.)
>
> Can you show the table definition?
>
> Anyway, you /should/ be able to delete one of the duplicate node_id
> records by specifying another column with a different value;
>
> On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com
> <mailto:markbradyju@outlook.com>> wrote:
>
> Hi Ron,
>
> The key is an integer. I'm using pGAdmin4 and recently updated to
> the latest version.
1) Did you not see duplicates with the old version of pgAdmin4?
2) What do you see if you use psql?
3) Did you upgrade/move the Postgres server or the underlying OS?
4) Have you tried reindexing the node_id field?
>
> The records are not all identical, some have NULL values of the
> non-key for some unknown reason. Here is a screenshot. The key is
> node_id, which I have specified as a key and unique identifier.
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
> What does "and unique identifier" mean?
>
> (Creating a Primary Key constraint automatically creates a unique
> no-nulls-allowed index on the column(s), so no other index is needed.)
>
> Can you show the table definition?
>
> Anyway, you /should/ be able to delete one of the duplicate node_id
> records by specifying another column with a different value;
>
> On Wed, Mar 5, 2025 at 6:14 PM mark bradley <markbradyju@outlook.com
> <mailto:markbradyju@outlook.com>> wrote:
>
> Hi Ron,
>
> The key is an integer. I'm using pGAdmin4 and recently updated to
> the latest version.
1) Did you not see duplicates with the old version of pgAdmin4?
2) What do you see if you use psql?
3) Did you upgrade/move the Postgres server or the underlying OS?
4) Have you tried reindexing the node_id field?
>
> The records are not all identical, some have NULL values of the
> non-key for some unknown reason. Here is a screenshot. The key is
> node_id, which I have specified as a key and unique identifier.
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Attachment
On 3/6/25 10:11, mark bradley wrote: > Here is the table definition: > > > And here is the error message I get when I try to delete a duplicate: Please answer the following: 1) Did you not see duplicates with the old version of pgAdmin4? 2) What do you see if you use psql? 3) Did you upgrade/move the Postgres server or the underlying OS? 4) Have you tried reindexing the node_id field? > > > Mark Brady, > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ -- Adrian Klaver adrian.klaver@aklaver.com
On 3/6/25 10:36, mark bradley wrote: Reply to list also. Ccing list. > 1) Did you not see duplicates with the old version of pgAdmin4? > I did see it in my last update but have done a couple, so it should have > happened then. > > 2) What do you see if you use psql? > Here is the PSQL output: Supply information as copy and paste text. > > > 3) Did you upgrade/move the Postgres server or the underlying OS? > No > > 4) Have you tried reindexing the node_id field? > No, but I'll try that. > > > > Mark Brady, > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Thursday, March 6, 2025 1:22 PM > *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson > <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org> > *Subject:* Re: Duplicate Key Values > On 3/6/25 10:11, mark bradley wrote: >> Here is the table definition: >> >> >> And here is the error message I get when I try to delete a duplicate: > > Please answer the following: > > 1) Did you not see duplicates with the old version of pgAdmin4? > > 2) What do you see if you use psql? > > 3) Did you upgrade/move the Postgres server or the underlying OS? > > 4) Have you tried reindexing the node_id field? > >> >> >> Mark Brady, >> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>_ >> ------------------------------------------------------------------------ > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
On 3/6/25 12:06, mark bradley wrote: My mistake I forgot to Cc list on my previous post, which was: That would be an issue and also would mean it is not a PK. In psql do: \d dataset and show the results as text in your reply. Ccing list The below shows there is an index("dataset_pkey") on node_id. Note, reindexing will take a lock on the table that prevents changing data while the operation is running. See the below for more information: https://www.postgresql.org/docs/current/sql-reindex.html If the table is not to big and you can interrupt access to it then the simplest command to run would be: REINDEX TABLE dataset; > Universal Metadata Schema=# \d dataset > Table "public.dataset" > Column | Type | Collation | > Nullable | Defau > lt > ---------------------------+---------------------------+-----------+----------+------ > --- > node_id | integer | | > not null | > dataset_name | character varying(25) | | > not null | > notes | text | | > | > dataset_type | database_type | | > not null | > dataset_maturity | database_maturity_type | | > not null | > disposition | disposition_type | | > not null | > start_date | date | | > | > end_date | date | | > | > most_recent_update | date | | > | > update_periodicity | interval | | > | > system_of_record | text | | > | > point_of_contact | integer | | > not null | > dataset_url | text | | > | > classification_level | classification_level_type | | > not null | > physical_location | text | | > | > quality_control | yes_no_type | | > not null | > dataset_documentation_url | text | | > not null | > description | text | | > | > node_type | node_type | | > | > dummy | integer | | > | > Indexes: > "dataset_pkey" PRIMARY KEY, btree (node_id) > Foreign-key constraints: > "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID > "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID > Referenced by: > TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id) > REFERENCES datas > et(node_id) NOT VALID > TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey" > FOREIGN KEY (no > de_id) REFERENCES dataset(node_id) > TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey" > FOREIGN KEY (node > _id) REFERENCES dataset(node_id) NOT VALID > Inherits: node > > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Thursday, March 6, 2025 3:03 PM > *To:* mark bradley <markbradyju@outlook.com> > *Subject:* Re: Duplicate Key Values > On 3/6/25 10:51, mark bradley wrote: > Reply to list alos. > Ccing list. > >> Looks like there is no index on node_id at the moment > > That would be an issue and also would mean it is not a PK. > > In psql do: > > \d dataset > > and show the results as text in your reply. > >> >> >> Mark Brady, Ph.D. >> Deputy Chief Data Officer, TRMC >> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>_ >> ------------------------------------------------------------------------ >> *From:* Adrian Klaver <adrian.klaver@aklaver.com> >> *Sent:* Thursday, March 6, 2025 1:22 PM >> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson >> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org> >> *Subject:* Re: Duplicate Key Values >> On 3/6/25 10:11, mark bradley wrote: >>> Here is the table definition: >>> >>> >>> And here is the error message I get when I try to delete a duplicate: >> >> Please answer the following: >> >> 1) Did you not see duplicates with the old version of pgAdmin4? >> >> 2) What do you see if you use psql? >> >> 3) Did you upgrade/move the Postgres server or the underlying OS? >> >> 4) Have you tried reindexing the node_id field? >> >>> >>> >>> Mark Brady, >>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady >> <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>>_ >>> ------------------------------------------------------------------------ >> >> -- >> Adrian Klaver >> adrian.klaver@aklaver.com >> > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
This is what MS Copilot has to say about this apparent bug where Postgres inserts extra rows violating a primary keys uniqueness constraint:
Yes, this issue has been encountered by others. There are a few potential reasons why this might happen:
- Sequence Out of Sync: Sometimes, the sequence that generates unique values for the primary key can become out of sync, especially after a bulk import or a database restore. You can check if the sequence is out of sync and reset it if necessary.
- Index Corruption: Index corruption can occur due to various reasons, such as hardware failures or bugs in earlier versions of PostgreSQL. This can lead to duplicate primary keys being inserted.
- Table Inheritance: If you are using table inheritance, primary keys are not enforced among inherited tables. This can lead to duplicates if not handled correctly.
- Application Logic: Sometimes, the application logic might inadvertently insert duplicate records. Reviewing the application code and insert statements can help identify and resolve such issues.
To resolve the issue, you can:
- Check and reset the sequence if it's out of sync.
- Rebuild the index if it's corrupted.
Any of the first 3 could be involved. There isn't an application involved other than pgAdmin.
- Originally, the key in the node table was a sequence, but I changed it to a non-sequence.
- There is no index on the primary key node_id, and I understand there should be one.
- I didn't explicitly use Postgres inheritance but there are two tables that are subclasses of node. There are dataset nodes and processing_node [s] tables. Each is a type of node and have primary keys that are foreign keys from the node table. This key is node_id.
What to do? I hesitate to just delete my tables and start over because this error will reoccur.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/6/25 12:06, mark bradley wrote:
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Here are some of the references from Copilot
From: mark bradley <markbradyju@outlook.com>
Sent: Friday, March 7, 2025 9:34 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Friday, March 7, 2025 9:34 AM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
This is what MS Copilot has to say about this apparent bug where Postgres inserts extra rows violating a primary keys uniqueness constraint:
Yes, this issue has been encountered by others. There are a few potential reasons why this might happen:
- Sequence Out of Sync: Sometimes, the sequence that generates unique values for the primary key can become out of sync, especially after a bulk import or a database restore. You can check if the sequence is out of sync and reset it if necessary.
- Index Corruption: Index corruption can occur due to various reasons, such as hardware failures or bugs in earlier versions of PostgreSQL. This can lead to duplicate primary keys being inserted.
- Table Inheritance: If you are using table inheritance, primary keys are not enforced among inherited tables. This can lead to duplicates if not handled correctly.
- Application Logic: Sometimes, the application logic might inadvertently insert duplicate records. Reviewing the application code and insert statements can help identify and resolve such issues.
To resolve the issue, you can:
- Check and reset the sequence if it's out of sync.
- Rebuild the index if it's corrupted.
Any of the first 3 could be involved. There isn't an application involved other than pgAdmin.
- Originally, the key in the node table was a sequence, but I changed it to a non-sequence.
- There is no index on the primary key node_id, and I understand there should be one.
- I didn't explicitly use Postgres inheritance but there are two tables that are subclasses of node. There are dataset nodes and processing_node [s] tables. Each is a type of node and have primary keys that are foreign keys from the node table. This key is node_id.
What to do? I hesitate to just delete my tables and start over because this error will reoccur.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/6/25 12:06, mark bradley wrote:
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
My mistake I forgot to Cc list on my previous post, which was:
That would be an issue and also would mean it is not a PK.
In psql do:
\d dataset
and show the results as text in your reply.
Ccing list
The below shows there is an index("dataset_pkey") on node_id.
Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:
https://www.postgresql.org/docs/current/sql-reindex.html
If the table is not to big and you can interrupt access to it then the
simplest command to run would be:
REINDEX TABLE dataset;
> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju@outlook.com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju@outlook.com>; Ron Johnson
>> <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
On Fri, Mar 7, 2025 at 9:35 AM mark bradley <markbradyju@outlook.com> wrote:
This is what MS Copilot has to say about this apparent bug where Postgres inserts extra rows violating a primary keys uniqueness constraint:Yes, this issue has been encountered by others. There are a few potential reasons why this might happen:
Sequence Out of Sync: Sometimes, the sequence that generates unique values for the primary key can become out of sync, especially after a bulk import or a database restore. You can check if the sequence is out of sync and reset it if necessary. Index Corruption: Index corruption can occur due to various reasons, such as hardware failures or bugs in earlier versions of PostgreSQL. This can lead to duplicate primary keys being inserted. Table Inheritance: If you are using table inheritance, primary keys are not enforced among inherited tables. This can lead to duplicates if not handled correctly. Application Logic: Sometimes, the application logic might inadvertently insert duplicate records. Reviewing the application code and insert statements can help identify and resolve such issues.
This is AI gobbledygook, and can be ignored. The only real option is number 2 (index corruption).
There is no index on the primary key node_id, and I understand there should be one.
There is an index, as your table definition showed.
What to do? I hesitate to just delete my tables and start over because this error will reoccur.
The error should not reoccur. At least, a normal Postgres system will prevent this from happening in the first place. To clean it up, carefully run the below. If an error appears, or something does not look right, rollback and stop.
-- Encourage not using indexes:
set enable_indexscan = 0;
set enable_bitmapscan = 0;
set enable_indexscan = 0;
set enable_bitmapscan = 0;
set enable_indexonlyscan = 0;
-- Sanity check. This should return a number greater than 1. If not, stop.
set search_path = public;
select count(*) from dataset where node_id = 26;
-- Make a backup:
create table dataset_backup as select * from dataset;
-- Test out the process on a subset of the data:
create table test_dataset as select * from dataset where node_id < 30;
create table test_dataset_duperows_20250307 (like dataset);
begin;
set local session_replication_role = 'replica';
with goodctids as (select min(ctid) from TEST_dataset group by node_id)
, mydelete as (delete from TEST_dataset where not exists (select 1 from goodctids where min=ctid)
returning *)
insert into test_dataset_duperows_20250307 select * from mydelete;
reset session_replication_role;
commit;
create table dataset_backup as select * from dataset;
-- Test out the process on a subset of the data:
create table test_dataset as select * from dataset where node_id < 30;
create table test_dataset_duperows_20250307 (like dataset);
begin;
set local session_replication_role = 'replica';
with goodctids as (select min(ctid) from TEST_dataset group by node_id)
, mydelete as (delete from TEST_dataset where not exists (select 1 from goodctids where min=ctid)
returning *)
insert into test_dataset_duperows_20250307 select * from mydelete;
reset session_replication_role;
commit;
-- STOP HERE and examine the test_dataset and test_dataset_duperows_20250307 tables
--
-- If ZERO rows were deleted, then you should no go further,
--
-- If ZERO rows were deleted, then you should no go further,
-- as some of the underlying assumptions must be wrong.
-- Do the real table:
create table dataset_duperows_20250307 (like dataset);
begin;
set local session_replication_role = 'replica';
with goodctids as (select min(ctid) from dataset group by node_id)
, mydelete as (delete from dataset where not exists (select 1 from goodctids where min=ctid)
returning *)
insert into dataset_duperows_20250307 select * from mydelete;
reset session_replication_role;
commit;
-- Do the real table:
create table dataset_duperows_20250307 (like dataset);
begin;
set local session_replication_role = 'replica';
with goodctids as (select min(ctid) from dataset group by node_id)
, mydelete as (delete from dataset where not exists (select 1 from goodctids where min=ctid)
returning *)
insert into dataset_duperows_20250307 select * from mydelete;
reset session_replication_role;
commit;
-- Rebuild the index
reindex index concurrently dataset_pkey;
-- Put things back from good measure:
reset enable_indexscan;
reset enable_bitmapscan;
reindex index concurrently dataset_pkey;
-- Put things back from good measure:
reset enable_indexscan;
reset enable_bitmapscan;
reset enable_indexonlyscan;
drop table test_dataset;
drop table test_dataset_duperows;
drop table test_dataset;
drop table test_dataset_duperows;
Given the issues, I would keep the dataset_backup table around for a while. And make sure your backups are running and up to date.
You might also want to reindex all the tables in your database, to see if any other issues are lurking.
Check your Postgres logs closely to see if anything else unusual has appeared.
Look over your OS logs to see if there are clues as to how the corruption happened. Maybe you recently upgraded your OS?
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On 3/7/25 06:34, mark bradley wrote: > This is what MS Copilot has to say about this apparent bug where > Postgres inserts extra rows violating a primary keys uniqueness constraint: > What to do? I hesitate to just delete my tables and start over because > this error will reoccur. Let the AI solve it. > > Best regards, > Mark Brady -- Adrian Klaver adrian.klaver@aklaver.com
Wouldn't that be nice 🙂
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Friday, March 7, 2025 10:55 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Friday, March 7, 2025 10:55 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/7/25 06:34, mark bradley wrote:
> This is what MS Copilot has to say about this apparent bug where
> Postgres inserts extra rows violating a primary keys uniqueness constraint:
> What to do? I hesitate to just delete my tables and start over because
> this error will reoccur.
Let the AI solve it.
>
> Best regards,
> Mark Brady
--
Adrian Klaver
adrian.klaver@aklaver.com
> This is what MS Copilot has to say about this apparent bug where
> Postgres inserts extra rows violating a primary keys uniqueness constraint:
> What to do? I hesitate to just delete my tables and start over because
> this error will reoccur.
Let the AI solve it.
>
> Best regards,
> Mark Brady
--
Adrian Klaver
adrian.klaver@aklaver.com
On 3/7/25 11:47, mark bradley wrote: > Wouldn't that be nice 🙂 No, because you would end up with a mess. My AI rant: AI is neither artificial or intelligent. It is human code that pattern matches and then throws the matches against the wall and hope something sticks. It is left to the human to clean up. At any rate, in this post: https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com You where provided an answer, which is further extended here: https://www.postgresql.org/message-id/CAKAnmm%2BBBBaXGN2xPHhXywkwb72UWzinWu2wQ5WadcMw3_57rQ%40mail.gmail.com > > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Friday, March 7, 2025 10:55 AM > *To:* mark bradley <markbradyju@outlook.com> > *Cc:* pgsql-general <pgsql-general@postgresql.org> > *Subject:* Re: Duplicate Key Values > On 3/7/25 06:34, mark bradley wrote: >> This is what MS Copilot has to say about this apparent bug where >> Postgres inserts extra rows violating a primary keys uniqueness constraint: > > > >> What to do? I hesitate to just delete my tables and start over because >> this error will reoccur. > > Let the AI solve it. > >> >> Best regards, >> Mark Brady > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
An "interesting" effect of reindexing is that all the records that were dups in the nodes table were deleted, both copies.
Also, all rows having node_id as a foreign key in other tables were deleted, which means all rows in these tables were deleted.
Fortunately these are not huge tables. I will reenter the data, make a backup, and then try your further extended suggestions.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Friday, March 7, 2025 3:25 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Friday, March 7, 2025 3:25 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/7/25 11:47, mark bradley wrote:
> Wouldn't that be nice 🙂
No, because you would end up with a mess.
My AI rant:
AI is neither artificial or intelligent. It is human code that pattern
matches and then throws the matches against the wall and hope something
sticks. It is left to the human to clean up.
At any rate, in this post:
https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com
You where provided an answer, which is further extended here:
https://www.postgresql.org/message-id/CAKAnmm%2BBBBaXGN2xPHhXywkwb72UWzinWu2wQ5WadcMw3_57rQ%40mail.gmail.com
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Friday, March 7, 2025 10:55 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/7/25 06:34, mark bradley wrote:
>> This is what MS Copilot has to say about this apparent bug where
>> Postgres inserts extra rows violating a primary keys uniqueness constraint:
>
>
>
>> What to do? I hesitate to just delete my tables and start over because
>> this error will reoccur.
>
> Let the AI solve it.
>
>>
>> Best regards,
>> Mark Brady
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
> Wouldn't that be nice 🙂
No, because you would end up with a mess.
My AI rant:
AI is neither artificial or intelligent. It is human code that pattern
matches and then throws the matches against the wall and hope something
sticks. It is left to the human to clean up.
At any rate, in this post:
https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com
You where provided an answer, which is further extended here:
https://www.postgresql.org/message-id/CAKAnmm%2BBBBaXGN2xPHhXywkwb72UWzinWu2wQ5WadcMw3_57rQ%40mail.gmail.com
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Friday, March 7, 2025 10:55 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/7/25 06:34, mark bradley wrote:
>> This is what MS Copilot has to say about this apparent bug where
>> Postgres inserts extra rows violating a primary keys uniqueness constraint:
>
>
>
>> What to do? I hesitate to just delete my tables and start over because
>> this error will reoccur.
>
> Let the AI solve it.
>
>>
>> Best regards,
>> Mark Brady
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Mar 11, 2025 at 10:29 AM mark bradley <markbradyju@outlook.com> wrote:
An "interesting" effect of reindexing is that all the records that were dups in the nodes table were deleted, both copies.
Er...that's not just interesting, but alarming - if true. Can you show the steps you took?
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
Do you currently take regular backups?
We'd be glad to show you how to take regular logical backups.
On Tue, Mar 11, 2025 at 10:29 AM mark bradley <markbradyju@outlook.com> wrote:
An "interesting" effect of reindexing is that all the records that were dups in the nodes table were deleted, both copies.Also, all rows having node_id as a foreign key in other tables were deleted, which means all rows in these tables were deleted.Fortunately these are not huge tables. I will reenter the data, make a backup, and then try your further extended suggestions.From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Friday, March 7, 2025 3:25 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key ValuesOn 3/7/25 11:47, mark bradley wrote:
> Wouldn't that be nice 🙂
No, because you would end up with a mess.
My AI rant:
AI is neither artificial or intelligent. It is human code that pattern
matches and then throws the matches against the wall and hope something
sticks. It is left to the human to clean up.
At any rate, in this post:
https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com
You where provided an answer, which is further extended here:
https://www.postgresql.org/message-id/CAKAnmm%2BBBBaXGN2xPHhXywkwb72UWzinWu2wQ5WadcMw3_57rQ%40mail.gmail.com
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Friday, March 7, 2025 10:55 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/7/25 06:34, mark bradley wrote:
>> This is what MS Copilot has to say about this apparent bug where
>> Postgres inserts extra rows violating a primary keys uniqueness constraint:
>
>
>
>> What to do? I hesitate to just delete my tables and start over because
>> this error will reoccur.
>
> Let the AI solve it.
>
>>
>> Best regards,
>> Mark Brady
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 3/11/25 07:28, mark bradley wrote: > An "interesting" effect of reindexing is that all the records that were > dups in the nodes table were deleted, both copies. I am trying to understand above. Was there at least one row of each node_id left? > > Also, all rows having node_id as a foreign key in other tables were > deleted, which means all rows in these tables were deleted. > > Fortunately these are not huge tables. I will reenter the data, make a > backup, and then try your further extended suggestions. > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ -- Adrian Klaver adrian.klaver@aklaver.com
The rows that were preserved in the nodes table were the ones that were not dups originally.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Tuesday, March 11, 2025 10:56 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Tuesday, March 11, 2025 10:56 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/11/25 07:28, mark bradley wrote:
> An "interesting" effect of reindexing is that all the records that were
> dups in the nodes table were deleted, both copies.
I am trying to understand above.
Was there at least one row of each node_id left?
>
> Also, all rows having node_id as a foreign key in other tables were
> deleted, which means all rows in these tables were deleted.
>
> Fortunately these are not huge tables. I will reenter the data, make a
> backup, and then try your further extended suggestions.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
--
Adrian Klaver
adrian.klaver@aklaver.com
> An "interesting" effect of reindexing is that all the records that were
> dups in the nodes table were deleted, both copies.
I am trying to understand above.
Was there at least one row of each node_id left?
>
> Also, all rows having node_id as a foreign key in other tables were
> deleted, which means all rows in these tables were deleted.
>
> Fortunately these are not huge tables. I will reenter the data, make a
> backup, and then try your further extended suggestions.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
--
Adrian Klaver
adrian.klaver@aklaver.com
On 3/11/25 08:05, mark bradley wrote: > The rows that were preserved in the nodes table were the ones that were > not dups originally. 1) To be specific: a) If there where two or more rows with a node_id, after the reindexing was there only one left? b) Or for any node_ids that where duplicated did reindexing eliminate all rows with that node_id. 2) Per post from Greg Sabino Mullane, you need to show us the steps you took to reindex the table. > > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Tuesday, March 11, 2025 10:56 AM > *To:* mark bradley <markbradyju@outlook.com> > *Cc:* pgsql-general <pgsql-general@postgresql.org> > *Subject:* Re: Duplicate Key Values > On 3/11/25 07:28, mark bradley wrote: >> An "interesting" effect of reindexing is that all the records that were >> dups in the nodes table were deleted, both copies. > > I am trying to understand above. > > Was there at least one row of each node_id left? > >> >> Also, all rows having node_id as a foreign key in other tables were >> deleted, which means all rows in these tables were deleted. >> >> Fortunately these are not huge tables. I will reenter the data, make a >> backup, and then try your further extended suggestions. >> >> Best regards, >> Mark Brady >> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>_ > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
b
REINDEX TABLE node;
Also reindexed table with node_id as a foreign key in the same way.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Tuesday, March 11, 2025 11:12 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Tuesday, March 11, 2025 11:12 AM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/11/25 08:05, mark bradley wrote:
> The rows that were preserved in the nodes table were the ones that were
> not dups originally.
1) To be specific:
a) If there where two or more rows with a node_id, after the reindexing
was there only one left?
b) Or for any node_ids that where duplicated did reindexing eliminate
all rows with that node_id.
2) Per post from Greg Sabino Mullane, you need to show us the steps you
took to reindex the table.
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Tuesday, March 11, 2025 10:56 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/11/25 07:28, mark bradley wrote:
>> An "interesting" effect of reindexing is that all the records that were
>> dups in the nodes table were deleted, both copies.
>
> I am trying to understand above.
>
> Was there at least one row of each node_id left?
>
>>
>> Also, all rows having node_id as a foreign key in other tables were
>> deleted, which means all rows in these tables were deleted.
>>
>> Fortunately these are not huge tables. I will reenter the data, make a
>> backup, and then try your further extended suggestions.
>>
>> Best regards,
>> Mark Brady
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
> The rows that were preserved in the nodes table were the ones that were
> not dups originally.
1) To be specific:
a) If there where two or more rows with a node_id, after the reindexing
was there only one left?
b) Or for any node_ids that where duplicated did reindexing eliminate
all rows with that node_id.
2) Per post from Greg Sabino Mullane, you need to show us the steps you
took to reindex the table.
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Tuesday, March 11, 2025 10:56 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/11/25 07:28, mark bradley wrote:
>> An "interesting" effect of reindexing is that all the records that were
>> dups in the nodes table were deleted, both copies.
>
> I am trying to understand above.
>
> Was there at least one row of each node_id left?
>
>>
>> Also, all rows having node_id as a foreign key in other tables were
>> deleted, which means all rows in these tables were deleted.
>>
>> Fortunately these are not huge tables. I will reenter the data, make a
>> backup, and then try your further extended suggestions.
>>
>> Best regards,
>> Mark Brady
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
A reindex is not going to remove rows from the table, so we need to see how you came to the conclusion that it did.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On 3/11/25 08:16, mark bradley wrote: A) 1) Please do not top post. Either inline or bottom post. It makes things like the below easier to follow. I wrote the post you responded to and it took me a bit to catch the b as a reference to: > b > b) Or for any node_ids that where duplicated did reindexing eliminate all rows with that node_id. B) From a previous post of yours: " 1. Originally, the key in the node table was a sequence, but I changed it to a non-sequence. 2. There is no index on the primary key node_id, and I understand there should be one. 3. I didn't explicitly use Postgres inheritance but there are two tables that are subclasses of node. There are dataset nodes and processing_node [s] tables. Each is a type of node and have primary keys that are foreign keys from the node table. This key is node_id. " As to point 2, from this message: https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com there is an index on node_id as it is the Primary Key. Why do you think there is not? Can you elaborate more on point 3. Are you calling the Foreign Key relationships subclassing? C) > REINDEX TABLE node; > > Also reindexed table with node_id as a foreign key in the same way. Is there anything in Postgres log at the time you did the above that showed it did more then a REINDEX? > > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Tuesday, March 11, 2025 11:12 AM > *To:* mark bradley <markbradyju@outlook.com> > *Cc:* pgsql-general <pgsql-general@postgresql.org> > *Subject:* Re: Duplicate Key Values > On 3/11/25 08:05, mark bradley wrote: >> The rows that were preserved in the nodes table were the ones that were >> not dups originally. > > 1) To be specific: > > a) If there where two or more rows with a node_id, after the reindexing > was there only one left? > > b) Or for any node_ids that where duplicated did reindexing eliminate > all rows with that node_id. > > > 2) Per post from Greg Sabino Mullane, you need to show us the steps you > took to reindex the table. > >> >> >> Best regards, >> Mark Brady >> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>_ >> ------------------------------------------------------------------------ >> *From:* Adrian Klaver <adrian.klaver@aklaver.com> >> *Sent:* Tuesday, March 11, 2025 10:56 AM >> *To:* mark bradley <markbradyju@outlook.com> >> *Cc:* pgsql-general <pgsql-general@postgresql.org> >> *Subject:* Re: Duplicate Key Values >> On 3/11/25 07:28, mark bradley wrote: >>> An "interesting" effect of reindexing is that all the records that were >>> dups in the nodes table were deleted, both copies. >> >> I am trying to understand above. >> >> Was there at least one row of each node_id left? >> >>> >>> Also, all rows having node_id as a foreign key in other tables were >>> deleted, which means all rows in these tables were deleted. >>> >>> Fortunately these are not huge tables. I will reenter the data, make a >>> backup, and then try your further extended suggestions. >>> >>> Best regards, >>> Mark Brady >>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady >> <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>>_ >> >> >> -- >> Adrian Klaver >> adrian.klaver@aklaver.com >> > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
there is an index on node_id as it is the Primary Key.
Why do you think there is not?
My mistake, I misread the output from \d dataset
Can you elaborate more on point 3.
Are you calling the Foreign Key relationships subclassing?
Although I did not explicitly use Postgres to declare inheritance, logically speaking table dataset and processing _node inherit or are subclasses of node because they are subclasses of node in a dataflow diagram.
In terms of keys, this is accomplished by having the node_id key in the node table appear as a foreign key and as a primary key in both the dataset and processing_node tables.
Is there anything in Postgres log at the time you did the above that
showed it did more then a REINDEX?
Not that I can tell.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Tuesday, March 11, 2025 12:00 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Tuesday, March 11, 2025 12:00 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/11/25 08:16, mark bradley wrote:
A)
1) Please do not top post. Either inline or bottom post. It makes
things like the below easier to follow. I wrote the post you responded
to and it took me a bit to catch the b as a reference to:
> b
>
b) Or for any node_ids that where duplicated did reindexing eliminate
all rows with that node_id.
B)
From a previous post of yours:
"
1.
Originally, the key in the node table was a sequence, but I changed it
to a non-sequence.
2.
There is no index on the primary key node_id, and I understand there
should be one.
3.
I didn't explicitly use Postgres inheritance but there are two tables
that are subclasses of node. There are dataset nodes and
processing_node [s] tables. Each is a type of node and have primary
keys that are foreign keys from the node table. This key is node_id.
"
As to point 2, from this message:
https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com
there is an index on node_id as it is the Primary Key.
Why do you think there is not?
Can you elaborate more on point 3.
Are you calling the Foreign Key relationships subclassing?
C)
> REINDEX TABLE node;
>
> Also reindexed table with node_id as a foreign key in the same way.
Is there anything in Postgres log at the time you did the above that
showed it did more then a REINDEX?
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Tuesday, March 11, 2025 11:12 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/11/25 08:05, mark bradley wrote:
>> The rows that were preserved in the nodes table were the ones that were
>> not dups originally.
>
> 1) To be specific:
>
> a) If there where two or more rows with a node_id, after the reindexing
> was there only one left?
>
> b) Or for any node_ids that where duplicated did reindexing eliminate
> all rows with that node_id.
>
>
> 2) Per post from Greg Sabino Mullane, you need to show us the steps you
> took to reindex the table.
>
>>
>>
>> Best regards,
>> Mark Brady
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Tuesday, March 11, 2025 10:56 AM
>> *To:* mark bradley <markbradyju@outlook.com>
>> *Cc:* pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/11/25 07:28, mark bradley wrote:
>>> An "interesting" effect of reindexing is that all the records that were
>>> dups in the nodes table were deleted, both copies.
>>
>> I am trying to understand above.
>>
>> Was there at least one row of each node_id left?
>>
>>>
>>> Also, all rows having node_id as a foreign key in other tables were
>>> deleted, which means all rows in these tables were deleted.
>>>
>>> Fortunately these are not huge tables. I will reenter the data, make a
>>> backup, and then try your further extended suggestions.
>>>
>>> Best regards,
>>> Mark Brady
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
A)
1) Please do not top post. Either inline or bottom post. It makes
things like the below easier to follow. I wrote the post you responded
to and it took me a bit to catch the b as a reference to:
> b
>
b) Or for any node_ids that where duplicated did reindexing eliminate
all rows with that node_id.
B)
From a previous post of yours:
"
1.
Originally, the key in the node table was a sequence, but I changed it
to a non-sequence.
2.
There is no index on the primary key node_id, and I understand there
should be one.
3.
I didn't explicitly use Postgres inheritance but there are two tables
that are subclasses of node. There are dataset nodes and
processing_node [s] tables. Each is a type of node and have primary
keys that are foreign keys from the node table. This key is node_id.
"
As to point 2, from this message:
https://www.postgresql.org/message-id/75b33741-ee99-4524-b63a-edad21c1266d%40aklaver.com
there is an index on node_id as it is the Primary Key.
Why do you think there is not?
Can you elaborate more on point 3.
Are you calling the Foreign Key relationships subclassing?
C)
> REINDEX TABLE node;
>
> Also reindexed table with node_id as a foreign key in the same way.
Is there anything in Postgres log at the time you did the above that
showed it did more then a REINDEX?
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Tuesday, March 11, 2025 11:12 AM
> *To:* mark bradley <markbradyju@outlook.com>
> *Cc:* pgsql-general <pgsql-general@postgresql.org>
> *Subject:* Re: Duplicate Key Values
> On 3/11/25 08:05, mark bradley wrote:
>> The rows that were preserved in the nodes table were the ones that were
>> not dups originally.
>
> 1) To be specific:
>
> a) If there where two or more rows with a node_id, after the reindexing
> was there only one left?
>
> b) Or for any node_ids that where duplicated did reindexing eliminate
> all rows with that node_id.
>
>
> 2) Per post from Greg Sabino Mullane, you need to show us the steps you
> took to reindex the table.
>
>>
>>
>> Best regards,
>> Mark Brady
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Tuesday, March 11, 2025 10:56 AM
>> *To:* mark bradley <markbradyju@outlook.com>
>> *Cc:* pgsql-general <pgsql-general@postgresql.org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/11/25 07:28, mark bradley wrote:
>>> An "interesting" effect of reindexing is that all the records that were
>>> dups in the nodes table were deleted, both copies.
>>
>> I am trying to understand above.
>>
>> Was there at least one row of each node_id left?
>>
>>>
>>> Also, all rows having node_id as a foreign key in other tables were
>>> deleted, which means all rows in these tables were deleted.
>>>
>>> Fortunately these are not huge tables. I will reenter the data, make a
>>> backup, and then try your further extended suggestions.
>>>
>>> Best regards,
>>> Mark Brady
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
--
Adrian Klaver
adrian.klaver@aklaver.com
On 3/11/25 11:52, mark bradley wrote: > > > there is an index on node_id as it is the Primary Key. > > Why do you think there is not? > > My mistake, I misread the output from \d dataset > > > > Can you elaborate more on point 3. > > Are you calling the Foreign Key relationships subclassing? > > > Although I did not explicitly use Postgres to declare inheritance, > logically speaking table /dataset/ and /processing _node/ inherit or are > subclasses of /node/ because they are subclasses of /node/ in a dataflow > diagram. > > > In terms of keys, this is accomplished by having the /node_id/ key in > the /node/ table appear as a foreign key and as a primary key in both > the /dataset/ and /processing_node/ tables. You will need to show the schema definitions for: node dataset processing_node Best to do using psql \d <table_name> Also in from previous \d dataset there where NOT VALID FK definitions. Did you ever run VALIDATE CONSTRAINT against them? > > > > Is there anything in Postgres log at the time you did the above that > showed it did more then a REINDEX? > > > Not that I can tell. > > > > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ -- Adrian Klaver adrian.klaver@aklaver.com
It happened again. Now there are no sequences (although there once was).
Sequence of data reentry:
- I reentered the node_ids in table node as a primary key.
- I reentered the datasets, in table dataset.
- Node_id was already a PK in dataset.
- I set node_id to also be a foreign key in dataset.
- I went back to look at node and see that duplicate key values appeared.


Processing nodes are unaffected because I didn't add any data to the processing_node table.
Next, I'm going to create a simple database from scratch and see if I can duplicate this behavior.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Tuesday, March 11, 2025 3:37 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Tuesday, March 11, 2025 3:37 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/11/25 11:52, mark bradley wrote:
>
>
> there is an index on node_id as it is the Primary Key.
>
> Why do you think there is not?
>
> My mistake, I misread the output from \d dataset
>
>
>
> Can you elaborate more on point 3.
>
> Are you calling the Foreign Key relationships subclassing?
>
>
> Although I did not explicitly use Postgres to declare inheritance,
> logically speaking table /dataset/ and /processing _node/ inherit or are
> subclasses of /node/ because they are subclasses of /node/ in a dataflow
> diagram.
>
>
> In terms of keys, this is accomplished by having the /node_id/ key in
> the /node/ table appear as a foreign key and as a primary key in both
> the /dataset/ and /processing_node/ tables.
You will need to show the schema definitions for:
node
dataset
processing_node
Best to do using psql \d <table_name>
Also in from previous \d dataset there where NOT VALID FK definitions.
Did you ever run VALIDATE CONSTRAINT against them?
>
>
>
> Is there anything in Postgres log at the time you did the above that
> showed it did more then a REINDEX?
>
>
> Not that I can tell.
>
>
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
--
Adrian Klaver
adrian.klaver@aklaver.com
>
>
> there is an index on node_id as it is the Primary Key.
>
> Why do you think there is not?
>
> My mistake, I misread the output from \d dataset
>
>
>
> Can you elaborate more on point 3.
>
> Are you calling the Foreign Key relationships subclassing?
>
>
> Although I did not explicitly use Postgres to declare inheritance,
> logically speaking table /dataset/ and /processing _node/ inherit or are
> subclasses of /node/ because they are subclasses of /node/ in a dataflow
> diagram.
>
>
> In terms of keys, this is accomplished by having the /node_id/ key in
> the /node/ table appear as a foreign key and as a primary key in both
> the /dataset/ and /processing_node/ tables.
You will need to show the schema definitions for:
node
dataset
processing_node
Best to do using psql \d <table_name>
Also in from previous \d dataset there where NOT VALID FK definitions.
Did you ever run VALIDATE CONSTRAINT against them?
>
>
>
> Is there anything in Postgres log at the time you did the above that
> showed it did more then a REINDEX?
>
>
> Not that I can tell.
>
>
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
--
Adrian Klaver
adrian.klaver@aklaver.com
Attachment
On 3/11/25 12:55, mark bradley wrote: > It happened again. Now there are no sequences (although there once was). Read my previous post and provide the information requested. Also: 1) Postgres version. 2) Where did you get Postgres from? > > Sequence of data reentry: > > 1. > I reentered the /node_id/s in table node as a primary key. > 2. > I reentered the datasets, in table /dataset/. > 3. > /Node_id/ was already a PK in dataset. > 4. > _I set _/_node_id_/_ to also be a foreign key in _/_dataset_/_._ > 5. > I went back to look at /node/ and see that duplicate key values > appeared. > > > > > Processing nodes are unaffected because I didn't add any data to the > /processing_node/ table. > > Next, I'm going to create a simple database from scratch and see if I > can duplicate this behavior. > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ > *From:* Adrian Klaver <adrian.klaver@aklaver.com> > *Sent:* Tuesday, March 11, 2025 3:37 PM > *To:* mark bradley <markbradyju@outlook.com> > *Cc:* pgsql-general <pgsql-general@postgresql.org> > *Subject:* Re: Duplicate Key Values > On 3/11/25 11:52, mark bradley wrote: >> >> >> there is an index on node_id as it is the Primary Key. >> >> Why do you think there is not? >> >> My mistake, I misread the output from \d dataset >> >> >> >> Can you elaborate more on point 3. >> >> Are you calling the Foreign Key relationships subclassing? >> >> >> Although I did not explicitly use Postgres to declare inheritance, >> logically speaking table /dataset/ and /processing _node/ inherit or are >> subclasses of /node/ because they are subclasses of /node/ in a dataflow >> diagram. >> >> >> In terms of keys, this is accomplished by having the /node_id/ key in >> the /node/ table appear as a foreign key and as a primary key in both >> the /dataset/ and /processing_node/ tables. > > You will need to show the schema definitions for: > > node > dataset > processing_node > > Best to do using psql \d <table_name> > > Also in from previous \d dataset there where NOT VALID FK definitions. > > Did you ever run VALIDATE CONSTRAINT against them? > >> >> >> >> Is there anything in Postgres log at the time you did the above that >> showed it did more then a REINDEX? >> >> >> Not that I can tell. >> >> >> >> >> Best regards, >> Mark Brady >> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady > <https://amazon.com/author/markjbrady>>_ > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
On 3/11/25 13:24, Adrian Klaver wrote: > On 3/11/25 12:55, mark bradley wrote: >> It happened again. Now there are no sequences (although there once was). > > Read my previous post and provide the information requested. > Mark sent me the below, which answers some of the questions, namely there is inheritance going on: Universal Metadata Schema=# \d node Table "public.node" Column | Type | Collation | Nullable | Default -----------+-----------+-----------+----------+--------- node_id | integer | | not null | node_type | node_type | | not null | Indexes: "node_pkey" PRIMARY KEY, btree (node_id) "node_id" UNIQUE CONSTRAINT, btree (node_id) INCLUDE (node_id) Referenced by: TABLE "user_role" CONSTRAINT "a" FOREIGN KEY (node_id) REFERENCES node(node_i d) NOT VALID TABLE "dataset" CONSTRAINT "node_id" FOREIGN KEY (node_id) REFERENCES node(no de_id) NOT VALID Number of child tables: 2 (Use \d+ to list them.) Universal Metadata Schema=# \d dataset Table "public.dataset" Column | Type | Collation | Nullable | Default ---------------------------+---------------------------+-----------+----------+-- ------- node_id | integer | | not null | dataset_name | character varying(25) | | not null | notes | text | | | dataset_type | database_type | | | dataset_maturity | database_maturity_type | | | disposition | disposition_type | | | start_date | date | | | end_date | date | | | most_recent_update | date | | | update_periodicity | interval | | | system_of_record | text | | | point_of_contact | integer | | | dataset_url | text | | | classification_level | classification_level_type | | | physical_location | text | | | quality_control | yes_no_type | | | dataset_documentation_url | text | | | description | text | | | node_type | node_type | | | Indexes: "dataset_pkey" PRIMARY KEY, btree (node_id) Foreign-key constraints: "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID Referenced by: TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id) REFERENCES d ataset(node_id) NOT VALID TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey" FOREIGN KEY ( node_id) REFERENCES dataset(node_id) NOT VALID Inherits: node Universal Metadata Schema=# \d processing_node Table "public.processing_node" Column | Type | Collation | Nullable | Default -----------------------+-----------------------+-----------+----------+--------- node_id | integer | | not null | processing_node_name | character varying(25) | | | description | text | | | notes | text | | | point_of_contact | integer | | not null | is_a_user_application | yes_no_type | | not null | node_type | node_type | | | Indexes: "processing_node_pkey" PRIMARY KEY, btree (node_id) Foreign-key constraints: "processing_node_point_of_contact_fkey" FOREIGN KEY (point_of_contact) REFERE NCES poc(poc_id) Referenced by: TABLE "system_processing_node" CONSTRAINT "system_processing_node_processing_ node_id_fkey" FOREIGN KEY (processing_node_id) REFERENCES processing_node(node_id ) NOT VALID Inherits: node Universal Metadata Schema=# ALTER TABLE node VALID ATE CONSTRAINT node_id; ERROR: constraint "node_id" of relation "node" is not a foreign key or check constraint Universal Metadata Schema=# ALTER TABLE dataset VA LIDATE CONSTRAINTnode_id; ERROR: syntax error at or near "CONSTRAINTnode_id " LINE 1: ALTER TABLE dataset VALIDATE CONSTRAINTnod e_id; ^ > Did you ever run VALIDATE CONSTRAINT against them? Here is the run Universal Metadata Schema=# ALTER TABLE node VALID ATE CONSTRAINT node_id; ERROR: constraint "node_id" of relation "node" is not a foreign key or check constraint Universal Metadata Schema=# Universal Metadata Schema=# ALTER TABLE dataset VA LIDATE CONSTRAINT node_id; ALTER TABLE Universal Metadata Schema=# -- Adrian Klaver adrian.klaver@aklaver.com
On 3/12/25 08:46, Adrian Klaver wrote: > On 3/11/25 13:24, Adrian Klaver wrote: >> On 3/11/25 12:55, mark bradley wrote: >>> It happened again. Now there are no sequences (although there once >>> was). >> >> Read my previous post and provide the information requested. >> > > Mark sent me the below, which answers some of the questions, namely > there is inheritance going on: Mark, to illustrate: create table node (node_id integer primary key, fld1 varchar); create table node_1 (node_id integer primary key, node_1_fld boolean) inherits ( node); NOTICE: merging column "node_id" with inherited definition insert into node values (1, 'dog'); insert into node_1 values (1, 'cat', 'f'); select * from node; node_id | fld1 ---------+------ 1 | dog 1 | cat This is explained here: https://www.postgresql.org/docs/current/sql-createtable.html INHERITS ( parent_table [, ... ] ) "... , and by default the data of the child table is included in scans of the parent(s)." This explains why you see duplicates of node_id. Though if you try to enter a duplicate value in to a particular table you get: insert into node_1 values (1, 'test', 't'); ERROR: duplicate key value violates unique constraint "node_1_pkey" DETAIL: Key (node_id)=(1) already exists. This still does not explain why REINDEX TABLE node; caused data to disappear? > > Did you ever run VALIDATE CONSTRAINT against them? > Here is the run As error notes VALIDATE CONSTRAINT only works on FK and check constraints. You would need to run against the FK constraints that where marked NOT VALID e.g "dataset" on the dataset table. Honestly, I think you need rework your data model. Not sure what the inheritance is getting you. Seems simpler to just have the node table not be inherited and just use FK relationships back to it. > > Universal Metadata Schema=# ALTER TABLE node VALID > ATE CONSTRAINT node_id; > ERROR: constraint "node_id" of relation "node" is > not a foreign key or check constraint > Universal Metadata Schema=# > > Universal Metadata Schema=# ALTER TABLE dataset VA > LIDATE CONSTRAINT node_id; > ALTER TABLE > Universal Metadata Schema=# > > > > > -- Adrian Klaver adrian.klaver@aklaver.com
>Mark, to illustrate:
>create table node (node_id integer primary key, fld1 varchar);
>create table node_1 (node_id integer primary key, node_1_fld boolean)
>inherits ( node);
>NOTICE: merging column "node_id" with inherited definition
>insert into node values (1, 'dog');
>insert into node_1 values (1, 'cat', 'f');
>select * from node;
> node_id | fld1
>---------+------
> 1 | dog
> 1 | cat
>create table node (node_id integer primary key, fld1 varchar);
>create table node_1 (node_id integer primary key, node_1_fld boolean)
>inherits ( node);
>NOTICE: merging column "node_id" with inherited definition
>insert into node values (1, 'dog');
>insert into node_1 values (1, 'cat', 'f');
>select * from node;
> node_id | fld1
>---------+------
> 1 | dog
> 1 | cat
That would make sense except that I never explicitly use the inherits option in the node_1 (my dataset) table. Postgres seems to be assuming that.
Also, the second column in node and in dataset are two different columns. However, Postgres insists on the node_type attribute being included (last column) in table dataset and won't let me delete it. This is redundant because every dataset is a dataset type of node.


...

The reason for the current design is that I also have a dataflow table as shown below.
A dataflow record describes the flow of data between two nodes. Now, if there are 3 types of node: dataset, processing, and user, then there are 9 types of dataflow. Hence, 9 tables are needed to represent the dataflows instead of 1. In the below, source_id and destination_id are both node_ids. If I want to know if a node is a dataset, processing node, or a user, I just look that up in the nodes table.

So, I think the crux of the problem is that Postgres assumes that inheritance is declared when it is not.
More answers to your questions coming.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, March 12, 2025 12:01 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Wednesday, March 12, 2025 12:01 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/12/25 08:46, Adrian Klaver wrote:
> On 3/11/25 13:24, Adrian Klaver wrote:
>> On 3/11/25 12:55, mark bradley wrote:
>>> It happened again. Now there are no sequences (although there once
>>> was).
>>
>> Read my previous post and provide the information requested.
>>
>
> Mark sent me the below, which answers some of the questions, namely
> there is inheritance going on:
Mark, to illustrate:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
NOTICE: merging column "node_id" with inherited definition
insert into node values (1, 'dog');
insert into node_1 values (1, 'cat', 'f');
select * from node;
node_id | fld1
---------+------
1 | dog
1 | cat
This is explained here:
https://www.postgresql.org/docs/current/sql-createtable.html
INHERITS ( parent_table [, ... ] )
"... , and by default the data of the child table is included in scans
of the parent(s)."
This explains why you see duplicates of node_id.
Though if you try to enter a duplicate value in to a particular table
you get:
insert into node_1 values (1, 'test', 't');
ERROR: duplicate key value violates unique constraint "node_1_pkey"
DETAIL: Key (node_id)=(1) already exists.
This still does not explain why REINDEX TABLE node; caused data to
disappear?
> > Did you ever run VALIDATE CONSTRAINT against them?
> Here is the run
As error notes VALIDATE CONSTRAINT only works on FK and check
constraints. You would need to run against the FK constraints that where
marked NOT VALID e.g "dataset" on the dataset table.
Honestly, I think you need rework your data model. Not sure what the
inheritance is getting you. Seems simpler to just have the node table
not be inherited and just use FK relationships back to it.
>
> Universal Metadata Schema=# ALTER TABLE node VALID
> ATE CONSTRAINT node_id;
> ERROR: constraint "node_id" of relation "node" is
> not a foreign key or check constraint
> Universal Metadata Schema=#
>
> Universal Metadata Schema=# ALTER TABLE dataset VA
> LIDATE CONSTRAINT node_id;
> ALTER TABLE
> Universal Metadata Schema=#
>
>
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
> On 3/11/25 13:24, Adrian Klaver wrote:
>> On 3/11/25 12:55, mark bradley wrote:
>>> It happened again. Now there are no sequences (although there once
>>> was).
>>
>> Read my previous post and provide the information requested.
>>
>
> Mark sent me the below, which answers some of the questions, namely
> there is inheritance going on:
Mark, to illustrate:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
NOTICE: merging column "node_id" with inherited definition
insert into node values (1, 'dog');
insert into node_1 values (1, 'cat', 'f');
select * from node;
node_id | fld1
---------+------
1 | dog
1 | cat
This is explained here:
https://www.postgresql.org/docs/current/sql-createtable.html
INHERITS ( parent_table [, ... ] )
"... , and by default the data of the child table is included in scans
of the parent(s)."
This explains why you see duplicates of node_id.
Though if you try to enter a duplicate value in to a particular table
you get:
insert into node_1 values (1, 'test', 't');
ERROR: duplicate key value violates unique constraint "node_1_pkey"
DETAIL: Key (node_id)=(1) already exists.
This still does not explain why REINDEX TABLE node; caused data to
disappear?
> > Did you ever run VALIDATE CONSTRAINT against them?
> Here is the run
As error notes VALIDATE CONSTRAINT only works on FK and check
constraints. You would need to run against the FK constraints that where
marked NOT VALID e.g "dataset" on the dataset table.
Honestly, I think you need rework your data model. Not sure what the
inheritance is getting you. Seems simpler to just have the node table
not be inherited and just use FK relationships back to it.
>
> Universal Metadata Schema=# ALTER TABLE node VALID
> ATE CONSTRAINT node_id;
> ERROR: constraint "node_id" of relation "node" is
> not a foreign key or check constraint
> Universal Metadata Schema=#
>
> Universal Metadata Schema=# ALTER TABLE dataset VA
> LIDATE CONSTRAINT node_id;
> ALTER TABLE
> Universal Metadata Schema=#
>
>
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Attachment
Postgresql does not assume / default to inheritance. In text-mode clients where you type in "raw" SQL, you have to explicitly add an explicit "INHERITS <parent_table>" clause to the "CREATE TABLE foo" statement.
Are you creating the tables via PgAdmin point-and-click?
On Thu, Mar 13, 2025 at 10:56 AM mark bradley <markbradyju@outlook.com> wrote:
>Mark, to illustrate:
>create table node (node_id integer primary key, fld1 varchar);
>create table node_1 (node_id integer primary key, node_1_fld boolean)
>inherits ( node);
>NOTICE: merging column "node_id" with inherited definition
>insert into node values (1, 'dog');
>insert into node_1 values (1, 'cat', 'f');
>select * from node;
> node_id | fld1
>---------+------
> 1 | dog
> 1 | catThat would make sense except that I never explicitly use the inherits option in the node_1 (my dataset) table. Postgres seems to be assuming that.Also, the second column in node and in dataset are two different columns. However, Postgres insists on the node_type attribute being included (last column) in table dataset and won't let me delete it. This is redundant because every dataset is a dataset type of node....The reason for the current design is that I also have a dataflow table as shown below.A dataflow record describes the flow of data between two nodes. Now, if there are 3 types of node: dataset, processing, and user, then there are 9 types of dataflow. Hence, 9 tables are needed to represent the dataflows instead of 1. In the below, source_id and destination_id are both node_ids. If I want to know if a node is a dataset, processing node, or a user, I just look that up in the nodes table.So, I think the crux of the problem is that Postgres assumes that inheritance is declared when it is not.More answers to your questions coming.From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, March 12, 2025 12:01 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key ValuesOn 3/12/25 08:46, Adrian Klaver wrote:
> On 3/11/25 13:24, Adrian Klaver wrote:
>> On 3/11/25 12:55, mark bradley wrote:
>>> It happened again. Now there are no sequences (although there once
>>> was).
>>
>> Read my previous post and provide the information requested.
>>
>
> Mark sent me the below, which answers some of the questions, namely
> there is inheritance going on:
Mark, to illustrate:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
NOTICE: merging column "node_id" with inherited definition
insert into node values (1, 'dog');
insert into node_1 values (1, 'cat', 'f');
select * from node;
node_id | fld1
---------+------
1 | dog
1 | cat
This is explained here:
https://www.postgresql.org/docs/current/sql-createtable.html
INHERITS ( parent_table [, ... ] )
"... , and by default the data of the child table is included in scans
of the parent(s)."
This explains why you see duplicates of node_id.
Though if you try to enter a duplicate value in to a particular table
you get:
insert into node_1 values (1, 'test', 't');
ERROR: duplicate key value violates unique constraint "node_1_pkey"
DETAIL: Key (node_id)=(1) already exists.
This still does not explain why REINDEX TABLE node; caused data to
disappear?
> > Did you ever run VALIDATE CONSTRAINT against them?
> Here is the run
As error notes VALIDATE CONSTRAINT only works on FK and check
constraints. You would need to run against the FK constraints that where
marked NOT VALID e.g "dataset" on the dataset table.
Honestly, I think you need rework your data model. Not sure what the
inheritance is getting you. Seems simpler to just have the node table
not be inherited and just use FK relationships back to it.
>
> Universal Metadata Schema=# ALTER TABLE node VALID
> ATE CONSTRAINT node_id;
> ERROR: constraint "node_id" of relation "node" is
> not a foreign key or check constraint
> Universal Metadata Schema=#
>
> Universal Metadata Schema=# ALTER TABLE dataset VA
> LIDATE CONSTRAINT node_id;
> ALTER TABLE
> Universal Metadata Schema=#
>
>
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
Attachment
On 3/13/25 07:56, mark bradley wrote: > >Mark, to illustrate: > > >create table node (node_id integer primary key, fld1 varchar); > >create table node_1 (node_id integer primary key, node_1_fld boolean) > >inherits ( node); > >NOTICE: merging column "node_id" with inherited definition > > >insert into node values (1, 'dog'); > >insert into node_1 values (1, 'cat', 'f'); > > >select * from node; > > node_id | fld1 > >---------+------ > > 1 | dog > > 1 | cat > > That would make sense except that I never explicitly use the > *inherits* option in the *node_1* (my *dataset*) table. Postgres seems > to be assuming that. 1) Well it is there from \d node: Number of child tables: 2 (Use \d+ to list them.) and from \d dataset and \d processing_node Inherits: node Also it explains the behavior. 2) Postgres does not assume that, it was done explicitly by some command. > > Also, the second column in *node* and in *dataset* are two different > columns. However, Postgres insists on the *node_type* attribute being > included (last column) in table *dataset *and won't let me delete > it.**This is redundant because every dataset is a dataset type of node. That is what inheritance does: https://www.postgresql.org/docs/current/sql-createtable.html " The optional INHERITS clause specifies a list of tables from which the new table automatically inherits all columns. Parent tables can be plain tables or foreign tables. Use of INHERITS creates a persistent relationship between the new child table and its parent table(s). Schema modifications to the parent(s) normally propagate to children as well, and by default the data of the child table is included in scans of the parent(s). If the same column name exists in more than one parent table, an error is reported unless the data types of the columns match in each of the parent tables. If there is no conflict, then the duplicate columns are merged to form a single column in the new table. If the column name list of the new table contains a column name that is also inherited, the data type must likewise match the inherited column(s), and the column definitions are merged into one. If the new table explicitly specifies a default value for the column, this default overrides any defaults from inherited declarations of the column. Otherwise, any parents that specify default values for the column must all specify the same default, or an error will be reported. CHECK constraints are merged in essentially the same way as columns: if multiple parent tables and/or the new table definition contain identically-named CHECK constraints, these constraints must all have the same check expression, or an error will be reported. Constraints having the same name and expression will be merged into one copy. A constraint marked NO INHERIT in a parent will not be considered. Notice that an unnamed CHECK constraint in the new table will never be merged, since a unique name will always be chosen for it. Column STORAGE settings are also copied from parent tables. If a column in the parent table is an identity column, that property is not inherited. A column in the child table can be declared identity column if desired. " > So, I think the crux of the problem is that Postgres assumes that > inheritance is declared when it is not. No it does not, as the screenshot shows. There is an explicit setting for 'Inherited from table(s)' > > More answers to your questions coming. > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ -- Adrian Klaver adrian.klaver@aklaver.com
On 3/13/25 07:56, mark bradley wrote: > >Mark, to illustrate: > So, I think the crux of the problem is that Postgres assumes that > inheritance is declared when it is not. Follow up. From pgAdmin end: https://www.pgadmin.org/docs/pgadmin4/9.1/table_dialog.html "Use the drop-down listbox next to Inherited from table(s) to specify any parent table(s); the table will inherit columns from the selected parent table(s)." This is an explicit action. > > More answers to your questions coming. > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > -- Adrian Klaver adrian.klaver@aklaver.com
>Postgresql does not assume / default to inheritance. In text-mode clients where you type >in "raw" SQL, you have to explicitly add an explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo" statement.
>Are you creating the tables via PgAdmin point-and-click?
I am using PgAdmin 4 v9.1.
I think the problem may also be related to the fact that I had node_id and node_type were in both tables from an earlier design and Postgres would not let me delete node_type from the dataset table.
As an experiment, I created a simple version of the same tables from scratch without node_type in the dataset table. So far, no dups are appearing.
From: Ron Johnson <ronljohnsonjr@gmail.com>
Sent: Thursday, March 13, 2025 11:05 AM
To: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 13, 2025 11:05 AM
To: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Postgresql does not assume / default to inheritance. In text-mode clients where you type in "raw" SQL, you have to explicitly add an explicit "INHERITS <parent_table>" clause to the "CREATE TABLE foo" statement.
Are you creating the tables via PgAdmin point-and-click?
On Thu, Mar 13, 2025 at 10:56 AM mark bradley <markbradyju@outlook.com> wrote:
>Mark, to illustrate:
>create table node (node_id integer primary key, fld1 varchar);
>create table node_1 (node_id integer primary key, node_1_fld boolean)
>inherits ( node);
>NOTICE: merging column "node_id" with inherited definition
>insert into node values (1, 'dog');
>insert into node_1 values (1, 'cat', 'f');
>select * from node;
> node_id | fld1
>---------+------
> 1 | dog
> 1 | catThat would make sense except that I never explicitly use the inherits option in the node_1 (my dataset) table. Postgres seems to be assuming that.Also, the second column in node and in dataset are two different columns. However, Postgres insists on the node_type attribute being included (last column) in table dataset and won't let me delete it. This is redundant because every dataset is a dataset type of node....The reason for the current design is that I also have a dataflow table as shown below.A dataflow record describes the flow of data between two nodes. Now, if there are 3 types of node: dataset, processing, and user, then there are 9 types of dataflow. Hence, 9 tables are needed to represent the dataflows instead of 1. In the below, source_id and destination_id are both node_ids. If I want to know if a node is a dataset, processing node, or a user, I just look that up in the nodes table.So, I think the crux of the problem is that Postgres assumes that inheritance is declared when it is not.More answers to your questions coming.From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, March 12, 2025 12:01 PM
To: mark bradley <markbradyju@outlook.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key ValuesOn 3/12/25 08:46, Adrian Klaver wrote:
> On 3/11/25 13:24, Adrian Klaver wrote:
>> On 3/11/25 12:55, mark bradley wrote:
>>> It happened again. Now there are no sequences (although there once
>>> was).
>>
>> Read my previous post and provide the information requested.
>>
>
> Mark sent me the below, which answers some of the questions, namely
> there is inheritance going on:
Mark, to illustrate:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
NOTICE: merging column "node_id" with inherited definition
insert into node values (1, 'dog');
insert into node_1 values (1, 'cat', 'f');
select * from node;
node_id | fld1
---------+------
1 | dog
1 | cat
This is explained here:
https://www.postgresql.org/docs/current/sql-createtable.html
INHERITS ( parent_table [, ... ] )
"... , and by default the data of the child table is included in scans
of the parent(s)."
This explains why you see duplicates of node_id.
Though if you try to enter a duplicate value in to a particular table
you get:
insert into node_1 values (1, 'test', 't');
ERROR: duplicate key value violates unique constraint "node_1_pkey"
DETAIL: Key (node_id)=(1) already exists.
This still does not explain why REINDEX TABLE node; caused data to
disappear?
> > Did you ever run VALIDATE CONSTRAINT against them?
> Here is the run
As error notes VALIDATE CONSTRAINT only works on FK and check
constraints. You would need to run against the FK constraints that where
marked NOT VALID e.g "dataset" on the dataset table.
Honestly, I think you need rework your data model. Not sure what the
inheritance is getting you. Seems simpler to just have the node table
not be inherited and just use FK relationships back to it.
>
> Universal Metadata Schema=# ALTER TABLE node VALID
> ATE CONSTRAINT node_id;
> ERROR: constraint "node_id" of relation "node" is
> not a foreign key or check constraint
> Universal Metadata Schema=#
>
> Universal Metadata Schema=# ALTER TABLE dataset VA
> LIDATE CONSTRAINT node_id;
> ALTER TABLE
> Universal Metadata Schema=#
>
>
>
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
Attachment
On 3/13/25 08:56, mark bradley wrote: > >Postgresql does not assume / default to inheritance. In text-mode > clients where you type >in "raw" SQL, you have to explicitly add an > explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo" > statement. > > >Are you creating the tables via PgAdmin point-and-click? > > I am using PgAdmin 4 v9.1. > > I think the problem may also be related to the fact that I had > *node_id* and *node_type *were in both tables from an earlier design and > Postgres would not let me delete* node_type* from the* dataset* table. Because it was inherited: create table node (node_id integer primary key, fld1 varchar); create table node_1 (node_id integer primary key, node_1_fld boolean) inherits ( node); alter table node_1 drop column fld1; ERROR: cannot drop inherited column "fld1" > > As an experiment, I created a simple version of the same tables from > scratch without *node_type* in the *dataset* table. So far, no dups are > appearing. I'm assuming that by 'simple version' you mean no inheritance. > > Best regards, > Mark Brady > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ -- Adrian Klaver adrian.klaver@aklaver.com
> I'm assuming that by 'simple version' you mean no inheritance.
Inheritance was not specified by me in either case. By simple, I mean that I used fewer columns in the test version, like so:


Where node_id is a foreign key in dataset, and node_type is not.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 13, 2025 12:05 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 13, 2025 12:05 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/13/25 08:56, mark bradley wrote:
> >Postgresql does not assume / default to inheritance. In text-mode
> clients where you type >in "raw" SQL, you have to explicitly add an
> explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo"
> statement.
>
> >Are you creating the tables via PgAdmin point-and-click?
>
> I am using PgAdmin 4 v9.1.
>
> I think the problem may also be related to the fact that I had
> *node_id* and *node_type *were in both tables from an earlier design and
> Postgres would not let me delete* node_type* from the* dataset* table.
Because it was inherited:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
alter table node_1 drop column fld1;
ERROR: cannot drop inherited column "fld1"
>
> As an experiment, I created a simple version of the same tables from
> scratch without *node_type* in the *dataset* table. So far, no dups are
> appearing.
I'm assuming that by 'simple version' you mean no inheritance.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
--
Adrian Klaver
adrian.klaver@aklaver.com
> >Postgresql does not assume / default to inheritance. In text-mode
> clients where you type >in "raw" SQL, you have to explicitly add an
> explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo"
> statement.
>
> >Are you creating the tables via PgAdmin point-and-click?
>
> I am using PgAdmin 4 v9.1.
>
> I think the problem may also be related to the fact that I had
> *node_id* and *node_type *were in both tables from an earlier design and
> Postgres would not let me delete* node_type* from the* dataset* table.
Because it was inherited:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
alter table node_1 drop column fld1;
ERROR: cannot drop inherited column "fld1"
>
> As an experiment, I created a simple version of the same tables from
> scratch without *node_type* in the *dataset* table. So far, no dups are
> appearing.
I'm assuming that by 'simple version' you mean no inheritance.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
--
Adrian Klaver
adrian.klaver@aklaver.com
Attachment
> I'm assuming that by 'simple version' you mean no inheritance.
Anyway, inheritance can be undone via
ALTER TABLE dataset NO INHERIT node;
Now, there are no dups and hopefully it will stay that way.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 13, 2025 12:05 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 13, 2025 12:05 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/13/25 08:56, mark bradley wrote:
> >Postgresql does not assume / default to inheritance. In text-mode
> clients where you type >in "raw" SQL, you have to explicitly add an
> explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo"
> statement.
>
> >Are you creating the tables via PgAdmin point-and-click?
>
> I am using PgAdmin 4 v9.1.
>
> I think the problem may also be related to the fact that I had
> *node_id* and *node_type *were in both tables from an earlier design and
> Postgres would not let me delete* node_type* from the* dataset* table.
Because it was inherited:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
alter table node_1 drop column fld1;
ERROR: cannot drop inherited column "fld1"
>
> As an experiment, I created a simple version of the same tables from
> scratch without *node_type* in the *dataset* table. So far, no dups are
> appearing.
I'm assuming that by 'simple version' you mean no inheritance.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
--
Adrian Klaver
adrian.klaver@aklaver.com
> >Postgresql does not assume / default to inheritance. In text-mode
> clients where you type >in "raw" SQL, you have to explicitly add an
> explicit "INHERITS <parent_table>" clause to the >"CREATE TABLE foo"
> statement.
>
> >Are you creating the tables via PgAdmin point-and-click?
>
> I am using PgAdmin 4 v9.1.
>
> I think the problem may also be related to the fact that I had
> *node_id* and *node_type *were in both tables from an earlier design and
> Postgres would not let me delete* node_type* from the* dataset* table.
Because it was inherited:
create table node (node_id integer primary key, fld1 varchar);
create table node_1 (node_id integer primary key, node_1_fld boolean)
inherits ( node);
alter table node_1 drop column fld1;
ERROR: cannot drop inherited column "fld1"
>
> As an experiment, I created a simple version of the same tables from
> scratch without *node_type* in the *dataset* table. So far, no dups are
> appearing.
I'm assuming that by 'simple version' you mean no inheritance.
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
--
Adrian Klaver
adrian.klaver@aklaver.com
On 3/13/25 09:25, mark bradley wrote: > > > I'm assuming that by 'simple version' you mean no inheritance. > > Inheritance was not specified by me in either case. By simple, I mean Unless someone else is working on this code, it was done by you. Pretty sure it had to do with from this message: https://www.postgresql.org/message-id/SJ2PR22MB4328CEB1B47FC1AC4A996CB3BAD12%40SJ2PR22MB4328.namprd22.prod.outlook.com "Although I did not explicitly use Postgres to declare inheritance, logically speaking table dataset and processing _node inherit or are subclasses of node because they are subclasses of node in a dataflow diagram." I'm guessing you actually did do the subclassing(inheritance) as it was an option presented in the pgAdmin4 CREATE TABLE screen. Anyway at this point the problem has been identified and a solution devised. > that I used fewer columns in the test version, like so: > > > > Where *node_id* is a foreign key in dataset, and *node_type* is not. > > Best regards, > Mark Brady -- Adrian Klaver adrian.klaver@aklaver.com
Adrian & Ron,
Thank you for your assistance.
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, March 13, 2025 1:03 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
Sent: Thursday, March 13, 2025 1:03 PM
To: mark bradley <markbradyju@outlook.com>; Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Duplicate Key Values
On 3/13/25 09:25, mark bradley wrote:
>
> > I'm assuming that by 'simple version' you mean no inheritance.
>
> Inheritance was not specified by me in either case. By simple, I mean
Unless someone else is working on this code, it was done by you.
Pretty sure it had to do with from this message:
https://www.postgresql.org/message-id/SJ2PR22MB4328CEB1B47FC1AC4A996CB3BAD12%40SJ2PR22MB4328.namprd22.prod.outlook.com
"Although I did not explicitly use Postgres to declare inheritance,
logically speaking table dataset and processing _node inherit or are
subclasses of node because they are subclasses of node in a dataflow
diagram."
I'm guessing you actually did do the subclassing(inheritance) as it was
an option presented in the pgAdmin4 CREATE TABLE screen.
Anyway at this point the problem has been identified and a solution devised.
> that I used fewer columns in the test version, like so:
>
>
>
> Where *node_id* is a foreign key in dataset, and *node_type* is not.
>
> Best regards,
> Mark Brady
--
Adrian Klaver
adrian.klaver@aklaver.com
>
> > I'm assuming that by 'simple version' you mean no inheritance.
>
> Inheritance was not specified by me in either case. By simple, I mean
Unless someone else is working on this code, it was done by you.
Pretty sure it had to do with from this message:
https://www.postgresql.org/message-id/SJ2PR22MB4328CEB1B47FC1AC4A996CB3BAD12%40SJ2PR22MB4328.namprd22.prod.outlook.com
"Although I did not explicitly use Postgres to declare inheritance,
logically speaking table dataset and processing _node inherit or are
subclasses of node because they are subclasses of node in a dataflow
diagram."
I'm guessing you actually did do the subclassing(inheritance) as it was
an option presented in the pgAdmin4 CREATE TABLE screen.
Anyway at this point the problem has been identified and a solution devised.
> that I used fewer columns in the test version, like so:
>
>
>
> Where *node_id* is a foreign key in dataset, and *node_type* is not.
>
> Best regards,
> Mark Brady
--
Adrian Klaver
adrian.klaver@aklaver.com