Thread: Duplicate primary keys/rows

Duplicate primary keys/rows

From
CSN
Date:
This is weird. I set up a table with a serial id field
and created a primary key on it. Then I imported data.
Running an app against it, I got periodic errors
stating "duplicate key violates unique constraint
"pkey_table1." Looking through the table (with
phppgadmin), there are duplicate rows:

id|f1|f2|f3|f4
585|c|a|e|f
586|a|b|c|d
586|a|b|c|d
587|g|e|r|z

However:

select * from table1 where id=586;
586|a|b|c|d

Yet:
select * from table1 where id>=585 and id<=587;
585|c|a|e|f
586|a|b|c|d
586|a|b|c|d
587|g|e|r|z

Wow, how is this possible? I'm using PG 8.0.3 on
Windows XP. This computer has been crashing repeatedly
lately, if that could be blamed (bad memory? hard
disk? I haven't quite figured out why.) Using
phppgadmin, I was able to delete one of the duplicate
rows (there are several) - don't know how it does that
- maybe using OIDs?

CSN



__________________________________
Yahoo! Music Unlimited
Access over 1 million songs. Try it free.
http://music.yahoo.com/unlimited/

Re: Duplicate primary keys/rows

From
Michael Fuhr
Date:
On Sun, Oct 09, 2005 at 12:46:51PM -0700, CSN wrote:
> select * from table1 where id=586;
> 586|a|b|c|d

Do you get different results from the following queries?

SET enable_seqscan TO on;
SET enable_indexscan TO off;
SELECT * FROM table1 WHERE id = 586;

SET enable_seqscan TO off;
SET enable_indexscan TO on;
SELECT * FROM table1 WHERE id = 586;

> Yet:
> select * from table1 where id>=585 and id<=587;
> 585|c|a|e|f
> 586|a|b|c|d
> 586|a|b|c|d
> 587|g|e|r|z

What's the output of the following query?

RESET enable_seqscan;
RESET enable_indexscan;

SELECT oid, ctid, xmin, cmin, xmax, cmax, *
FROM table1
WHERE id >= 585 AND id <= 587;

If you get the error 'column "oid" does not exist' then you've
created the table without oids, so just omit oid from the select
list:

SELECT ctid, xmin, cmin, xmax, cmax, *
FROM table1
WHERE id >= 585 AND id <= 587;

> Wow, how is this possible? I'm using PG 8.0.3 on
> Windows XP. This computer has been crashing repeatedly
> lately, if that could be blamed (bad memory? hard
> disk? I haven't quite figured out why.)

Faulty hardware is one possibile explanation.

--
Michael Fuhr

Re: Duplicate primary keys/rows

From
CSN
Date:
--- Michael Fuhr <mike@fuhr.org> wrote:

> On Sun, Oct 09, 2005 at 12:46:51PM -0700, CSN wrote:
> > select * from table1 where id=586;
> > 586|a|b|c|d
>
> Do you get different results from the following
> queries?
>
> SET enable_seqscan TO on;
> SET enable_indexscan TO off;
> SELECT * FROM table1 WHERE id = 586;

This returns 2 rows.

> SET enable_seqscan TO off;
> SET enable_indexscan TO on;
> SELECT * FROM table1 WHERE id = 586;

This returns 1 row.

> > Yet:
> > select * from table1 where id>=585 and id<=587;
> > 585|c|a|e|f
> > 586|a|b|c|d
> > 586|a|b|c|d
> > 587|g|e|r|z
>
> What's the output of the following query?
>
> RESET enable_seqscan;
> RESET enable_indexscan;
>
> SELECT oid, ctid, xmin, cmin, xmax, cmax, *
> FROM table1
> WHERE id >= 585 AND id <= 587;

  oid   |   ctid    |  xmin   | cmin |  xmax   | cmax
| id
--------+-----------+---------+------+---------+------+-----
 125465 | (3143,78) | 1664385 |    0 | 1664386 |    2
| 984
 125466 | (2745,50) | 1481020 |    0 | 1682425 |    2
| 985
 125466 | (2672,11) | 1445346 |    0 | 1481020 |    0
| 985
 125467 | (3159,28) | 1671875 |    0 | 1671876 |    2
| 986

(I'm using a different duplicate row - 985. I deleted
586's duplicate.)

Is this a problem with the index? Would rebuilding
them fix this problem? I'm still curious why this
happened, and somewhat troubled that something like
this can happen.

Thanks for your help,
CSN


> If you get the error 'column "oid" does not exist'
> then you've
> created the table without oids, so just omit oid
> from the select
> list:
>
> SELECT ctid, xmin, cmin, xmax, cmax, *
> FROM table1
> WHERE id >= 585 AND id <= 587;
>
> > Wow, how is this possible? I'm using PG 8.0.3 on
> > Windows XP. This computer has been crashing
> repeatedly
> > lately, if that could be blamed (bad memory? hard
> > disk? I haven't quite figured out why.)
>
> Faulty hardware is one possibile explanation.
>
> --
> Michael Fuhr
>





__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

Re: Duplicate primary keys/rows

From
Tom Lane
Date:
CSN <cool_screen_name90001@yahoo.com> writes:
>   oid   |   ctid    |  xmin   | cmin |  xmax   | cmax | id
> --------+-----------+---------+------+---------+------+-----
>  125466 | (2672,11) | 1445346 |    0 | 1481020 |    0 | 985
>  125466 | (2745,50) | 1481020 |    0 | 1682425 |    2 | 985

Hmm.  The fact that the dup rows have the same OID indicates pretty
strongly that they are actually two versions of the same row, and
not two independently inserted rows.  Furthermore we can see that xact
1481020 deleted the first version and inserted the second (note I took
the liberty of rearranging your output to make the rows appear in
chronological order).

So the index hasn't screwed up, exactly; the problem is that both rows
appear as good at the same time.  But why?

It's really highly annoying that we can't see the contents of the
infomasks for the rows.  Would you be willing to grab a copy of
pg_filedump and dump out these two data pages so we can see the
complete tuple headers?

(If you don't have a compiler then you'd need to find a precompiled
copy of pg_filedump for Windows.  I don't know if anyone's made one
available.)

Given that you say the machine has been crashing, my bet is that a crash
caused the loss of pg_clog status for xid 1481020 at a time when
2745,50's xmin had been marked committed good, but 2672,11's xmax had
not been similarly marked.  We have sufficient defenses against this
sort of thing *if the disk drive does not lie about write complete*.
(Unfortunately the vast majority of el-cheapo PCs are configured to lie
with abandon, which means that we can't guarantee data consistency
across power failures on such hardware.)  It'd be nice to get direct
confirmation of that theory though.

            regards, tom lane

Re: Duplicate primary keys/rows

From
CSN
Date:
I don't have a compiler on this machine. If somebody
can point me to a copy of pg_filedump for Windows (I
didn't see any using Google) I'd be happy to use it.
Or perhaps I could compile it under cygwin.

The hard drive is a Western Digital 200GB JD (SATA),
if that can be used to determine how badly it lies. ;)

Thanks,
CSN


--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> CSN <cool_screen_name90001@yahoo.com> writes:
> >   oid   |   ctid    |  xmin   | cmin |  xmax   |
> cmax | id
> >
>
--------+-----------+---------+------+---------+------+-----
> >  125466 | (2672,11) | 1445346 |    0 | 1481020 |
>  0 | 985
> >  125466 | (2745,50) | 1481020 |    0 | 1682425 |
>  2 | 985
>
> Hmm.  The fact that the dup rows have the same OID
> indicates pretty
> strongly that they are actually two versions of the
> same row, and
> not two independently inserted rows.  Furthermore we
> can see that xact
> 1481020 deleted the first version and inserted the
> second (note I took
> the liberty of rearranging your output to make the
> rows appear in
> chronological order).
>
> So the index hasn't screwed up, exactly; the problem
> is that both rows
> appear as good at the same time.  But why?
>
> It's really highly annoying that we can't see the
> contents of the
> infomasks for the rows.  Would you be willing to
> grab a copy of
> pg_filedump and dump out these two data pages so we
> can see the
> complete tuple headers?
>
> (If you don't have a compiler then you'd need to
> find a precompiled
> copy of pg_filedump for Windows.  I don't know if
> anyone's made one
> available.)
>
> Given that you say the machine has been crashing, my
> bet is that a crash
> caused the loss of pg_clog status for xid 1481020 at
> a time when
> 2745,50's xmin had been marked committed good, but
> 2672,11's xmax had
> not been similarly marked.  We have sufficient
> defenses against this
> sort of thing *if the disk drive does not lie about
> write complete*.
> (Unfortunately the vast majority of el-cheapo PCs
> are configured to lie
> with abandon, which means that we can't guarantee
> data consistency
> across power failures on such hardware.)  It'd be
> nice to get direct
> confirmation of that theory though.
>
>             regards, tom lane
>




__________________________________
Yahoo! Music Unlimited
Access over 1 million songs. Try it free.
http://music.yahoo.com/unlimited/

Re: Duplicate primary keys/rows

From
Michael Fuhr
Date:
On Mon, Oct 10, 2005 at 04:28:57PM -0400, Tom Lane wrote:
> It's really highly annoying that we can't see the contents of the
> infomasks for the rows.

Any particular reason there isn't an infomask system column?

--
Michael Fuhr

Re: Duplicate primary keys/rows

From
Scott Marlowe
Date:
On Mon, 2005-10-10 at 15:51, CSN wrote:
> I don't have a compiler on this machine. If somebody
> can point me to a copy of pg_filedump for Windows (I
> didn't see any using Google) I'd be happy to use it.
> Or perhaps I could compile it under cygwin.
>
> The hard drive is a Western Digital 200GB JD (SATA),
> if that can be used to determine how badly it lies. ;)

The general rule is that ATA (parallel or serial) drives lie.  You can
turn off the write cache, which will make it slower, but then it should
be reliable during a power loss.

Or, put it all on a big UPS and hope the power supply never goes out.

Re: Duplicate primary keys/rows

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> On Mon, Oct 10, 2005 at 04:28:57PM -0400, Tom Lane wrote:
>> It's really highly annoying that we can't see the contents of the
>> infomasks for the rows.

> Any particular reason there isn't an infomask system column?

(a) inertia

(b) lack of desire to add an additional pg_attribute row per table.

I recall having proposed that we stop storing explicit pg_attribute
entries for system columns, which would make this sort of change easier
to make, and would save a pretty considerable amount of space in
pg_attribute too.  (In the present regression database, about 45% of the
rows in pg_attribute are for system columns; that might be overly high
for real-world DBs though.)  But people were a bit worried about what
might break.

            regards, tom lane

Re: Duplicate primary keys/rows

From
Martijn van Oosterhout
Date:
On Mon, Oct 10, 2005 at 05:28:17PM -0400, Tom Lane wrote:
> I recall having proposed that we stop storing explicit pg_attribute
> entries for system columns, which would make this sort of change easier
> to make, and would save a pretty considerable amount of space in
> pg_attribute too.  (In the present regression database, about 45% of the
> rows in pg_attribute are for system columns; that might be overly high
> for real-world DBs though.)  But people were a bit worried about what
> might break.

In catalog/heap.c there already is a SystemAttributeByName() to do the
legwork. Seems to me all you'd need to do is check just as you're about
to fail on "attribute not found".

If you actually look at scanRTEForColumn() in parser/parse_relation.c
it actually checks to see if a column name could be a system column
name, *before* looking it up in the catalog. Remove the catalog test
(except for OID obviously) and it'd sail right through. It'd be
interesting to see what happened...

Given that internally, they're referred to by number, it might not be
so bad. As usual, external clients might get confused if they're not
there...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Duplicate primary keys/rows

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Mon, Oct 10, 2005 at 05:28:17PM -0400, Tom Lane wrote:
>> But people were a bit worried about what
>> might break.

> In catalog/heap.c there already is a SystemAttributeByName() to do the
> legwork. Seems to me all you'd need to do is check just as you're about
> to fail on "attribute not found".

We could certainly make it work as far as the backend is concerned.
The issue is whether there is any client code out there that will fail
if these entries are no longer present in pg_attribute.

            regards, tom lane

Re: Duplicate primary keys/rows

From
CSN
Date:
I don't know if I'm going to get a copy of
pg_filedump. What's the best way to fix this - dump
then restore?

CSN


--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> CSN <cool_screen_name90001@yahoo.com> writes:
> >   oid   |   ctid    |  xmin   | cmin |  xmax   |
> cmax | id
> >
>
--------+-----------+---------+------+---------+------+-----
> >  125466 | (2672,11) | 1445346 |    0 | 1481020 |
>  0 | 985
> >  125466 | (2745,50) | 1481020 |    0 | 1682425 |
>  2 | 985
>
> Hmm.  The fact that the dup rows have the same OID
> indicates pretty
> strongly that they are actually two versions of the
> same row, and
> not two independently inserted rows.  Furthermore we
> can see that xact
> 1481020 deleted the first version and inserted the
> second (note I took
> the liberty of rearranging your output to make the
> rows appear in
> chronological order).
>
> So the index hasn't screwed up, exactly; the problem
> is that both rows
> appear as good at the same time.  But why?
>
> It's really highly annoying that we can't see the
> contents of the
> infomasks for the rows.  Would you be willing to
> grab a copy of
> pg_filedump and dump out these two data pages so we
> can see the
> complete tuple headers?
>
> (If you don't have a compiler then you'd need to
> find a precompiled
> copy of pg_filedump for Windows.  I don't know if
> anyone's made one
> available.)
>
> Given that you say the machine has been crashing, my
> bet is that a crash
> caused the loss of pg_clog status for xid 1481020 at
> a time when
> 2745,50's xmin had been marked committed good, but
> 2672,11's xmax had
> not been similarly marked.  We have sufficient
> defenses against this
> sort of thing *if the disk drive does not lie about
> write complete*.
> (Unfortunately the vast majority of el-cheapo PCs
> are configured to lie
> with abandon, which means that we can't guarantee
> data consistency
> across power failures on such hardware.)  It'd be
> nice to get direct
> confirmation of that theory though.
>
>             regards, tom lane
>





__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com