Duplicate key existant/index visibility bug in 9.3.3 - Mailing list pgsql-bugs

From Erik Jones
Subject Duplicate key existant/index visibility bug in 9.3.3
Date
Msg-id D43A52C1-79DD-4621-B7D2-B21BF66DB2C2@engineyard.com
Whole thread Raw
Responses Re: Duplicate key existant/index visibility bug in 9.3.3  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Greetings,

Today I had a client report an issue restoring a dump made from their =
production db on a testing server wherein there existed two rows for a =
table with identical primary key values.  Upon investigation I found =
this to be true, but neither is visible via index usage:

(Server version: 9.3.3)

db=3D# explain select xmin, xmax, cmin, cmax, ctid, id, created_at from =
tags where id =3D 42982;
                               QUERY PLAN                               =20=

=
-------------------------------------------------------------------------
 Index Scan using groups_pkey on tags  (cost=3D0.29..8.31 rows=3D1 =
width=3D34)
   Index Cond: (id =3D 42982)
(2 rows)

db=3D# select xmin, xmax, cmin, cmax, ctid, id, created_at from tags =
where id =3D 42982;
 xmin | xmax | cmin | cmax | ctid | id | created_at=20
------+------+------+------+------+----+------------
(0 rows)

db=3D# set enable_bitmapscan =3D off;
SET
db=3D# set enable_indexscan =3D off;
SET
db=3D# explain select xmin, xmax, cmin, cmax, ctid, id, created_at from =
tags where id =3D 42982;
                       QUERY PLAN                      =20
--------------------------------------------------------
 Seq Scan on tags  (cost=3D0.00..1541.94 rows=3D1 width=3D34)
   Filter: (id =3D 42982)
(2 rows)

db=3D# select xmin, xmax, cmin, cmax, ctid, id, created_at from tags =
where id =3D 42982;
  xmin   |  xmax   | cmin | cmax |   ctid   |  id   |         created_at =
       =20
=
---------+---------+------+------+----------+-------+---------------------=
-------
 1195647 | 1195647 |    2 |    2 | (677,51) | 42982 | 2015-01-23 =
00:12:19.498942
 1195648 | 1195647 |    1 |    1 | (677,52) | 42982 | 2015-01-23 =
00:12:19.498942
(2 rows)

There are other columns on the table and all values in the two entries =
are identical.  Note that another table has a column that reference this =
table's primary key via a foreign key and there are multiple referencing =
rows there.  The last even has the same xmin as the first above:

db=3D# select xmin, xmax, cmin, cmax, ctid, id, created_at, tag_id from =
tagged_items where tag_id =3D 42982 order by xmin::text::bigint desc;
  xmin   | xmax | cmin | cmax |   ctid    |   id   |         created_at  =
       | tag_id=20
=
---------+------+------+------+-----------+--------+----------------------=
------+--------
 1195647 |    0 |    0 |    0 | (361,52)  | 130562 | 2015-01-23 =
01:14:55.809218 |  42982
 1195628 |    0 |    0 |    0 | (360,89)  | 130557 | 2015-01-23 =
01:14:55.391223 |  42982
 1195619 |    0 |    0 |    0 | (361,44)  | 130548 | 2015-01-23 =
01:14:55.252926 |  42982
 1195575 |    0 |    0 |    0 | (360,72)  | 130529 | 2015-01-23 =
01:14:54.66755  |  42982
 1195536 |    0 |    0 |    0 | (360,60)  | 130505 | 2015-01-23 =
01:14:53.841223 |  42982
 1195528 |    0 |    0 |    0 | (361,22)  | 130497 | 2015-01-23 =
01:14:53.725746 |  42982
 1195490 |    0 |    0 |    0 | (361,16)  | 130487 | 2015-01-23 =
01:14:53.21153  |  42982
 1195489 |    0 |    0 |    0 | (360,50)  | 130486 | 2015-01-23 =
01:14:53.197715 |  42982
 1195470 |    0 |    0 |    0 | (361,10)  | 130477 | 2015-01-23 =
01:14:52.896855 |  42982
 1195402 |    0 |    0 |    0 | (360,28)  | 130444 | 2015-01-23 =
01:14:52.020715 |  42982
<snip>

I'm assuming that the fact that the xmin values of the rest of the rows =
are from before those in the original table is indicative of update =
traffic since the original tags row was added and that the source of the =
problem is perhaps an update gone wrong, but let me know if I'm way off =
there.=20

So, a few questions:

1. Is there an existing bug that's been fixed by 9.3.5 that would =
account for this.  I scanned the 9.3.4 and 9.3.5 release notes and while =
the first fix listed in the release notes for 9.3.4 =
(http://www.postgresql.org/docs/9.3/static/release-9-3-4.html) was what =
gave me the idea to disable index usage to see the rows this server has =
been up and running as a master since September so neither the standby =
nor the crash recovery setup from that bug's description applies.

2. Is there any other info that I can dig up that could be helpful here?

3. Would manually deleting one of those rows (likely using the ctid to =
specify which) be safe and make the remaining one visible?  If so, I'm =
assuming I should delete the second given the xmin of the first =
referencing row in the other table?

Note that I've already confirmed that there are no other duplicate rows =
in this or other tables, at least with no clausing unique ids, via =
restore tests on a test server.  So, I know I can fix this case with a =
dump/restore but if I can get away with simply deleting one of the =
offending tuples that would be ideal (i.e. no downtime for the client).=

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #12644: Planner fails to use available index with anything other than default operator
Next
From: Tom Lane
Date:
Subject: Re: Duplicate key existant/index visibility bug in 9.3.3