Thread: Rows violating Foreign key constraint exists

Rows violating Foreign key constraint exists

From
Nandakumar M
Date:
Hi,

I am using PG version 10.5.

Saw a table where we have foreign key defined but few thousand rows
violate the foreign key constraint.

I understand that one possibility of this happening is if we had
manually disabled the triggers that do FK integrity checks and re
enabled them afterwards. Is there any way to confirm if this has
happened -- does PG internally maintain something like an audit
history of DDL changes?

I am not sure if this is due to some mistake in our end or if there
are any known issues in PG 10.5 that would cause this.

I tried the ALTER TABLE ... VALIDATE CONSTRAINT for this particular FK
and it doesn't report any errors. The documentation pretty clearly
mentions that 'VALIDATE CONSTRAINT' is used only to check those
constraints created with 'NOT VALID' clause.

It might be useful to have an option to check integrity of any FK (not
just ones created with NOT VALID clause). Please let me know if there
is already any way to do this.

Also, is there any way to make sure the FK checking trigger can never
be disabled (so that such a case will never arise)?

How do I proceed from here - Do I just delete the inconsistent rows or
is there something more I have to do?

Thanks for your help.

Regards,
Nanda



Re: Rows violating Foreign key constraint exists

From
Laurenz Albe
Date:
On Thu, 2019-11-28 at 18:55 +0530, Nandakumar M wrote:
> I am using PG version 10.5.
> 
> Saw a table where we have foreign key defined but few thousand rows
> violate the foreign key constraint.
> 
> I understand that one possibility of this happening is if we had
> manually disabled the triggers that do FK integrity checks and re
> enabled them afterwards. Is there any way to confirm if this has
> happened -- does PG internally maintain something like an audit
> history of DDL changes?
> 
> I am not sure if this is due to some mistake in our end or if there
> are any known issues in PG 10.5 that would cause this.
> 
> I tried the ALTER TABLE ... VALIDATE CONSTRAINT for this particular FK
> and it doesn't report any errors. The documentation pretty clearly
> mentions that 'VALIDATE CONSTRAINT' is used only to check those
> constraints created with 'NOT VALID' clause.
> 
> It might be useful to have an option to check integrity of any FK (not
> just ones created with NOT VALID clause). Please let me know if there
> is already any way to do this.
> 
> Also, is there any way to make sure the FK checking trigger can never
> be disabled (so that such a case will never arise)?
> 
> How do I proceed from here - Do I just delete the inconsistent rows or
> is there something more I have to do?
> 
> Thanks for your help.

It could be that somebody disabled the triggers, but that would have to
be a superuser.  And I hope that people randomly disabling system triggers
on tables don't have superuser access to your database.

There is no way to ascertain that that did happen.
It could be in the log if you have "log_statement = 'ddl'" and keep old
logs around.

The other option is that you are suppering from data corruption, perhaps
because of a software bug, but most likely because of hardware problems.

If you don't know better, assume the worst.

I would test the hardware for problems.
Once you are sure the hardware is fine, manually fix the corruption
by deleting rows that violate the constraint.
Then create a new PostgreSQL cluster with "initdb", dump the original
database with "pg_dumpall" and restore it to the new cluster.
That should get rid of all data corruption.

Make sure you upgrade to 10.11.

Yours,
Laurenz Albe

-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Rows violating Foreign key constraint exists

From
Adrian Klaver
Date:
On 11/28/19 5:25 AM, Nandakumar M wrote:
> Hi,
> 
> I am using PG version 10.5.
> 
> Saw a table where we have foreign key defined but few thousand rows
> violate the foreign key constraint.

Have you verified that the FK is not in the parent table and is just not 
some index error/corruption?


> It might be useful to have an option to check integrity of any FK (not
> just ones created with NOT VALID clause). Please let me know if there
> is already any way to do this.

Assuming no corruption, how about?:

select distinct on(fk_field) fk_field from child_table where 
child_table.fk_field not in (select parent_field from parent_table);

> 
> Also, is there any way to make sure the FK checking trigger can never
> be disabled (so that such a case will never arise)?

Not sure that can happen as it is baked into existing code. For instance:

https://www.postgresql.org/docs/11/app-pgdump.html

"
--disable-triggers

     This option is relevant only when creating a data-only dump. It 
instructs pg_dump to include commands to temporarily disable triggers on 
the target tables while the data is reloaded. Use this if you have 
referential integrity checks or other triggers on the tables that you do 
not want to invoke during data reload.

...

"
> 
> How do I proceed from here - Do I just delete the inconsistent rows or
> is there something more I have to do?
> 
> Thanks for your help.
> 
> Regards,
> Nanda
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Rows violating Foreign key constraint exists

From
Nandakumar M
Date:
Hi,

> It could be that somebody disabled the triggers, but that would have to
> be a superuser.  And I hope that people randomly disabling system triggers
> on tables don't have superuser access to your database.

It is unlikely that this happened. So I am assuming corruption.
But I am able to query both the referred and referring table
successfully without any 'missing chunk' or similar errors that
usually indicate corruption.
Is it possible that corruption might cause data loss like this without
any errors?

> The other option is that you are suppering from data corruption, perhaps
> because of a software bug, but most likely because of hardware problems.
>
> If you don't know better, assume the worst.
>
> I would test the hardware for problems.
> Once you are sure the hardware is fine, manually fix the corruption
> by deleting rows that violate the constraint.

How do I test for hardware problems? Are there any tools for this? I
am running PG on windows machine.

Thanks.

Regards,
Nanda



Re: Rows violating Foreign key constraint exists

From
Nandakumar M
Date:
Hi,


> Have you verified that the FK is not in the parent table and is just not
> some index error/corruption?

Yes.

> >
> > Also, is there any way to make sure the FK checking trigger can never
> > be disabled (so that such a case will never arise)?
>
> Not sure that can happen as it is baked into existing code. For instance:
>
> https://www.postgresql.org/docs/11/app-pgdump.html
>
> "
> --disable-triggers
>
>      This option is relevant only when creating a data-only dump. It
> instructs pg_dump to include commands to temporarily disable triggers on
> the target tables while the data is reloaded. Use this if you have
> referential integrity checks or other triggers on the tables that you do
> not want to invoke during data reload.

Found this thread which discusses the same topic as here.
https://www.postgresql.org/message-id/20190715160926.GA17140%40alvherre.pgsql

PG already allows a new FK to be created with ADD CONSTRAINT ... NOT
VALID clause which can be validated later using ALTER TABLE ...
VALIDATE CONSTRAINT.
I guess what we are looking for here is the same but for existing FKs.

i.e Something like

`ALTER TABLE distributors ALTER CONSTRAINT distfk NOT VALID;`
`ALTER TABLE distributors VALIDATE CONSTRAINT distfk;`


Regards,
Nanda



Re: Rows violating Foreign key constraint exists

From
Nandakumar M
Date:
Hi,

> PG already allows a new FK to be created with ADD CONSTRAINT ... NOT
> VALID clause which can be validated later using ALTER TABLE ...
> VALIDATE CONSTRAINT.
> I guess what we are looking for here is the same but for existing FKs.
>
> i.e Something like
>
> `ALTER TABLE distributors ALTER CONSTRAINT distfk NOT VALID;`
> `ALTER TABLE distributors VALIDATE CONSTRAINT distfk;`
>

I was wrong about this. ADD CONSTRAINT ... NOT VALID just postpones
integrity existing on existing data.
There is no equivalent for that when altering an existing FK.

Maybe, DISABLE ALL TRIGGERS can mark the FKs as NOT VALID. ALTER TABLE
... VALIDATE CONSTRAINT would now report inconsistencies if any.
Also, VALIDATE CONSTRAINT would have to verify that the trigger
implementing FK is enabled.

Hope/unsure if this would not break backwards compatibility.

Regards,
Nanda



Re: Rows violating Foreign key constraint exists

From
Tom Lane
Date:
Nandakumar M <m.nanda92@gmail.com> writes:
>> It could be that somebody disabled the triggers, but that would have to
>> be a superuser.  And I hope that people randomly disabling system triggers
>> on tables don't have superuser access to your database.

> It is unlikely that this happened. So I am assuming corruption.
> But I am able to query both the referred and referring table
> successfully without any 'missing chunk' or similar errors that
> usually indicate corruption.
> Is it possible that corruption might cause data loss like this without
> any errors?

The most likely "corruption" explanation is something wrong with the
indexes on the referenced and/or referencing column, causing rows to
not be found when referential actions should have found them.  Random
querying of the tables wouldn't necessarily expose that --- you'd need
to be sure that your queries use the questionable indexes, and maybe
even search for some of the specific rows that seem mis-indexed.

            regards, tom lane



Re: Rows violating Foreign key constraint exists

From
Peter Geoghegan
Date:
On Fri, Nov 29, 2019 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The most likely "corruption" explanation is something wrong with the
> indexes on the referenced and/or referencing column, causing rows to
> not be found when referential actions should have found them.  Random
> querying of the tables wouldn't necessarily expose that --- you'd need
> to be sure that your queries use the questionable indexes, and maybe
> even search for some of the specific rows that seem mis-indexed.

Or try using contrib/amcheck, which is available in Postgres 10.
Perhaps try the query here, modified to verify all B-Tree indexes (not
just those indexes in the pg_catalog schema):

https://www.postgresql.org/docs/10/amcheck.html

--
Peter Geoghegan



Re: Rows violating Foreign key constraint exists

From
Nandakumar M
Date:
Hi,

Sorry about the delay in getting back with the results.

>
> On Fri, Nov 29, 2019 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The most likely "corruption" explanation is something wrong with the
> > indexes on the referenced and/or referencing column, causing rows to
> > not be found when referential actions should have found them.  Random
> > querying of the tables wouldn't necessarily expose that --- you'd need
> > to be sure that your queries use the questionable indexes, and maybe
> > even search for some of the specific rows that seem mis-indexed.

This indeed is the problem. Select query with criteria on FK column
did not return any rows. However, after I disabled indexscan, bitmap
indexscan and tried the same query this time sequential scan was
performed by PG and it returned 80 rows.

> Or try using contrib/amcheck, which is available in Postgres 10.
> Perhaps try the query here, modified to verify all B-Tree indexes (not
> just those indexes in the pg_catalog schema):
>
> https://www.postgresql.org/docs/10/amcheck.html
>
> --

I tried amcheck query on all indexes in the database and it did not
raise any errors.

How do I identify such corruption exists in the database? Will
enabling page checksum be of help here?

Thanks.

Regards,
Nanda