Thread: Duplicate primary keys

Duplicate primary keys

From
"Jason Tan Boon Teck"
Date:
I was performing a reindex yesterday when it failed stating a duplicate primary key violation. I looked into the database and found 3 tables with duplicate rows.

How reliable is PostgreSQL in *preventing* rows duplicate primary keys to be inserted? I am not using any table inheritance and the db is ver 8.1.1 from Debian Etch. The db is used by a PHP app.


TIA.

--
Jason Tan Boon Teck

Re: Duplicate primary keys

From
Tom Lane
Date:
"Jason Tan Boon Teck" <tanboonteck@gmail.com> writes:
> How reliable is PostgreSQL in *preventing* rows duplicate primary keys to be
> inserted?

When we find such a bug, we fix it ...

> I am not using any table inheritance and the db is ver 8.1.1

... but those fixes don't help people who are running ancient releases.
There are multiple data-corruption bugs fixed in the 8.1 branch since
8.1.1, including at least one that might explain your problem:

2007-03-14 14:48  tgl

    * src/backend/commands/: vacuum.c (REL7_3_STABLE), vacuum.c
    (REL7_4_STABLE), vacuum.c (REL8_1_STABLE), vacuum.c
    (REL8_0_STABLE), vacuum.c (REL8_2_STABLE), vacuum.c: Fix a
    longstanding bug in VACUUM FULL's handling of update chains.  The
    code did not expect that a DEAD tuple could follow a RECENTLY_DEAD
    tuple in an update chain, but because the OldestXmin rule for
    determining deadness is a simplification of reality, it is possible
    for this situation to occur (implying that the RECENTLY_DEAD tuple
    is in fact dead to all observers, but this patch does not attempt
    to exploit that).  The code would follow a chain forward all the
    way, but then stop before a DEAD tuple when backing up, meaning
    that not all of the chain got moved.  This could lead to copying
    the chain multiple times (resulting in duplicate copies of the live
    tuple at its end), or leaving dangling index entries behind (which,
    aside from generating warnings from later vacuums, creates a risk
    of wrong query results or bogus duplicate-key errors once the heap
    slot the index entry points to is repopulated).

    The fix is to recheck HeapTupleSatisfiesVacuum while following a
    chain forward, and to stop if a DEAD tuple is reached.    Each
    contiguous group of RECENTLY_DEAD tuples will therefore be copied
    as a separate chain.  The patch also adds a couple of extra sanity
    checks to verify correct behavior.

    Per report and test case from Pavan Deolasee.

The current release in that branch is 8.1.11.  I recommend updating
and reindexing.

            regards, tom lane

Re: Duplicate primary keys

From
"Jason Tan Boon Teck"
Date:
I will make an upgrade on my next opportunity.

Thanks, Tom.

jason

On Jan 18, 2008 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Jason Tan Boon Teck" < tanboonteck@gmail.com> writes:
> How reliable is PostgreSQL in *preventing* rows duplicate primary keys to be
> inserted?

When we find such a bug, we fix it ...

> I am not using any table inheritance and the db is ver 8.1.1

... but those fixes don't help people who are running ancient releases.
There are multiple data-corruption bugs fixed in the 8.1 branch since
8.1.1, including at least one that might explain your problem:

2007-03-14 14:48  tgl

       * src/backend/commands/: vacuum.c (REL7_3_STABLE), vacuum.c
       (REL7_4_STABLE), vacuum.c (REL8_1_STABLE), vacuum.c
       (REL8_0_STABLE), vacuum.c (REL8_2_STABLE), vacuum.c: Fix a
       longstanding bug in VACUUM FULL's handling of update chains.  The
       code did not expect that a DEAD tuple could follow a RECENTLY_DEAD
       tuple in an update chain, but because the OldestXmin rule for
       determining deadness is a simplification of reality, it is possible
       for this situation to occur (implying that the RECENTLY_DEAD tuple
       is in fact dead to all observers, but this patch does not attempt
       to exploit that).  The code would follow a chain forward all the
       way, but then stop before a DEAD tuple when backing up, meaning
       that not all of the chain got moved.  This could lead to copying
       the chain multiple times (resulting in duplicate copies of the live
       tuple at its end), or leaving dangling index entries behind (which,
       aside from generating warnings from later vacuums, creates a risk
       of wrong query results or bogus duplicate-key errors once the heap
       slot the index entry points to is repopulated).

       The fix is to recheck HeapTupleSatisfiesVacuum while following a
       chain forward, and to stop if a DEAD tuple is reached.  Each
       contiguous group of RECENTLY_DEAD tuples will therefore be copied
       as a separate chain.  The patch also adds a couple of extra sanity
       checks to verify correct behavior.

       Per report and test case from Pavan Deolasee.

The current release in that branch is 8.1.11.  I recommend updating
and reindexing.

                       regards, tom lane



--
Jason Tan Boon Teck