Thread: BUG #4913: Row missing from primary key index

BUG #4913: Row missing from primary key index

From
"Mathieu De Zutter"
Date:
The following bug has been logged online:

Bug reference:      4913
Logged by:          Mathieu De Zutter
Email address:      mathieu@dezutter.org
PostgreSQL version: 8.3.7
Operating system:   Debian Lenny
Description:        Row missing from primary key index
Details:

I have a table log_event with a primary key on an integer 'id', called
log_event_pkey.

The tables contains a duplicate for id = 15723018. The duplicate (note that
besides the id, all data differs) doesn't seem to be known by the index at
all.

Example:

shs=# select id, event_timestamp from log_event where id = 15723018;
    id    |      event_timestamp
----------+----------------------------
 15723018 | 2009-05-09 13:47:33.441668
(1 row)

But with indexscan and bitmapscan off, I get:

shs=# select id, event_timestamp from log_event where id = 15723018;
    id    |      event_timestamp
----------+----------------------------
 15723018 | 2009-05-09 13:47:48.68558
 15723018 | 2009-05-09 13:47:33.441668
(2 rows)


Table definition:
shs=# \d log_event
                                        Table "public.log_event"
     Column      |            Type             |
Modifiers
-----------------+-----------------------------+----------------------------
----------------------------
 id              | bigint                      | not null default
nextval('log_event_id_seq'::regclass)
 user_id         | integer                     |
 ip              | inet                        | not null
 action_id       | integer                     | not null
 object1_id      | integer                     |
 object2_id      | integer                     |
 event_timestamp | timestamp without time zone | not null
Indexes:
    "log_event_pkey" PRIMARY KEY, btree (id)
    "log_event_action_id_idx" btree (action_id)
    "log_event_timestamp_idx" btree (event_timestamp)
    "log_event_user_id_idx" btree (user_id)
Foreign-key constraints:
    "log_event_action_id_fkey" FOREIGN KEY (action_id) REFERENCES
config.log_action(id)


In this table definition you can also see that 'id' is generated by a
sequence, which makes it even stranger to contain a dupe.

I have made a copy of the complete pg cluster, so I can debug without
interfering with my production db.

Re: BUG #4913: Row missing from primary key index

From
Andrew Gierth
Date:
>>>>> "Mathieu" == "Mathieu De Zutter" <mathieu@dezutter.org> writes:

 Mathieu> I have a table log_event with a primary key on an integer
 Mathieu> 'id', called log_event_pkey.

 Mathieu> The tables contains a duplicate for id = 15723018. The
 Mathieu> duplicate (note that besides the id, all data differs)
 Mathieu> doesn't seem to be known by the index at all.
[snip]

This was first reported on IRC and I spent a little time working with
the OP trying to dig up info to suggest a cause; here is the relevent
data (all provided by the OP at my request). I have not been able to
suggest a possible cause based on this.

shs=# select ctid,xmin,xmax,cmin,cmax,* from log_event where id = 15723018;
    ctid    | xmin  | xmax | cmin | cmax |    id    | user_id |      ip       | action_id | object1_id | object2_id |
  event_timestamp        

------------+-------+------+------+------+----------+---------+---------------+-----------+------------+------------+----------------------------
 (79844,13) | 16410 |    0 |    0 |    0 | 15723018 |       0 | 79.65.40.251  |         4 |      10145 |            |
2009-05-0913:47:48.68558 
 (79870,70) | 16479 |    0 |    0 |    0 | 15723018 |       1 | 80.200.87.228 |         1 |            |            |
2009-05-0913:47:33.441668 
(2 rows)

Notice that the two rows seem entirely independent (different xmin).
The OP stated that his app generally does single-row inserts (with
some exceptions not relevent here); however, we found a nearby row
which shares the xmin:

 (79870,16)  | 16410 |    0 |    0 |    0 | 15722964 |       1 | 80.200.87.228   |         3 |        720 |
|2009-05-09 13:41:53.099589 

(that row shows up correctly)

Using pg_filedump on the primary key index did not show any index
entries at all pointing to the (79844,13) row.

Hexdump of the 79844 page did not reveal anything out of the ordinary;
this is tuple (79844,13) starting at offset 1cc8:

00001cc0  00 00 00 00 00 00 00 00  1a 40 00 00 00 00 00 00  |.........@......|
00001cd0  00 00 00 00 01 00 e4 37  0d 00 07 00 03 09 18 5f  |.......7......._|
00001ce0  0a ea ef 00 00 00 00 00  00 00 00 00 0f 02 20 4f  |.............. O|
00001cf0  41 28 fb 00 04 00 00 00  a1 27 00 00 00 00 00 00  |A(.......'......|

(full page at http://pastebin.com/m28036405)

There are additional rows in the table which also show similar issues
(rows found by seqscan but missing in indexscan). These also have xids
which are "suspicious" (apparently not in the expected sequence or dups
of other xids). e.g.:

these rows don't show up in the index:

 (79854,50)  | 16401 |    0 |    0 |    0 | 15722931 |       0 | 90.39.81.150    |         6 |            |
|2009-05-09 13:42:06.278105 
 (79849,6)   | 16406 |    0 |    0 |    0 | 15722932 |       0 | 78.150.166.244  |         5 |        940 |
|2009-05-09 13:45:49.012712 
 (79857,39)  | 16408 |    0 |    0 |    0 | 15722933 |       0 | 64.17.2.164     |         2 |      22435 |
|2009-05-09 13:46:03.1664 

but these do:

 (79870,13)  | 16406 |    0 |    0 |    0 | 15722961 |       1 | 80.200.87.228   |        71 |      29005 |
|2009-05-09 13:41:50.190479 
 (79870,14)  | 16408 |    0 |    0 |    0 | 15722962 |       0 | 80.136.54.145   |         2 |      92765 |
|2009-05-09 13:41:50.439442 

This seems to be associated with some odd jumps in xid, time and ID;
this set of rows is ordered by id:

 (79835,21)  | 16394 |    0 |    0 |    0 | 15722927 |       0 | 85.225.253.43   |         1 |            |
|2009-05-09 13:39:43.943032 
 (79869,12)  | 16396 |    0 |    0 |    0 | 15722928 |       0 | 66.235.124.132  |         4 |       1663 |
|2009-05-09 13:40:37.048477 
 (79844,12)  | 16398 |    0 |    0 |    0 | 15722929 |       0 | 64.17.2.164     |         2 |      22324 |
|2009-05-09 13:40:58.781981 
 (79841,9)   | 16399 |    0 |    0 |    0 | 15722930 |       0 | 78.137.163.133  |         2 |       4950 |
|2009-05-09 13:41:32.126199 
 (79854,50)  | 16401 |    0 |    0 |    0 | 15722931 |       0 | 90.39.81.150    |         6 |            |
|2009-05-09 13:42:06.278105 
 (79849,6)   | 16406 |    0 |    0 |    0 | 15722932 |       0 | 78.150.166.244  |         5 |        940 |
|2009-05-09 13:45:49.012712 
 (79857,39)  | 16408 |    0 |    0 |    0 | 15722933 |       0 | 64.17.2.164     |         2 |      22435 |
|2009-05-09 13:46:03.1664 

(above 7 rows are not in index)

 (79870,3)   | 16393 |    0 |    0 |    0 | 15722951 |       0 | 77.202.22.228   |        56 |          0 |
|2009-05-09 13:39:48.460145 
 (79870,4)   | 16394 |    0 |    0 |    0 | 15722952 |       0 | 77.202.22.228   |         3 |      37349 |
|2009-05-09 13:40:09.031336 
 (79870,5)   | 16395 |    0 |    0 |    0 | 15722953 |       0 | 77.202.22.228   |        56 |          0 |
|2009-05-09 13:40:23.072695 

(above 3 rows are)

(full list at http://pastebin.com/m16600dc8 - that list is from a seqscan,
so includes rows missing from the index)

--
Andrew (irc:RhodiumToad)

Re: BUG #4913: Row missing from primary key index

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> This was first reported on IRC and I spent a little time working with
> the OP trying to dig up info to suggest a cause; here is the relevent
> data (all provided by the OP at my request). I have not been able to
> suggest a possible cause based on this.

> shs=# select ctid,xmin,xmax,cmin,cmax,* from log_event where id = 15723018;
>     ctid    | xmin  | xmax | cmin | cmax |    id    | user_id |      ip       | action_id | object1_id | object2_id |
    event_timestamp        
>
------------+-------+------+------+------+----------+---------+---------------+-----------+------------+------------+----------------------------
>  (79844,13) | 16410 |    0 |    0 |    0 | 15723018 |       0 | 79.65.40.251  |         4 |      10145 |            |
2009-05-0913:47:48.68558 
>  (79870,70) | 16479 |    0 |    0 |    0 | 15723018 |       1 | 80.200.87.228 |         1 |            |            |
2009-05-0913:47:33.441668 
> (2 rows)

> Notice that the two rows seem entirely independent (different xmin).
> The OP stated that his app generally does single-row inserts (with
> some exceptions not relevent here); however, we found a nearby row
> which shares the xmin:

>  (79870,16)  | 16410 |    0 |    0 |    0 | 15722964 |       1 | 80.200.87.228   |         3 |        720 |
| 2009-05-09 13:41:53.099589 

How is the timestamp column generated?  I'm wondering what we can deduce
from the fact that the timestamps are all different.  It's evidently not
now().

One thing that seems odd is that the xids are kinda small.  Did the
system just recently have a wraparound event?

            regards, tom lane

Re: BUG #4913: Row missing from primary key index

From
Alvaro Herrera
Date:
Andrew Gierth wrote:
> >>>>> "Mathieu" == "Mathieu De Zutter" <mathieu@dezutter.org> writes:
>
>  Mathieu> I have a table log_event with a primary key on an integer
>  Mathieu> 'id', called log_event_pkey.
>
>  Mathieu> The tables contains a duplicate for id = 15723018. The
>  Mathieu> duplicate (note that besides the id, all data differs)
>  Mathieu> doesn't seem to be known by the index at all.
> [snip]
>
> This was first reported on IRC and I spent a little time working with
> the OP trying to dig up info to suggest a cause; here is the relevent
> data (all provided by the OP at my request). I have not been able to
> suggest a possible cause based on this.

How are the event_timestamps generated?  If these come from the
transaction timestamp, it is even more weird that tuples seemingly
inserted by the same xmin have differing timestamps.

FWIW I think I've seen similar cases before.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: BUG #4913: Row missing from primary key index

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> Notice that the two rows seem entirely independent (different
 >> xmin).  The OP stated that his app generally does single-row
 >> inserts (with some exceptions not relevent here); however, we
 >> found a nearby row which shares the xmin:

 Tom> How is the timestamp column generated?  I'm wondering what we
 Tom> can deduce from the fact that the timestamps are all different.
 Tom> It's evidently not now().

(answering this one since the OP has probably gone for the night)

My understanding is that it is now(), but the OP should be able to
give a definitive answer. (Yes, this does raise some questions about
why it appears to have gone backwards at some points.)

 Tom> One thing that seems odd is that the xids are kinda small.  Did
 Tom> the system just recently have a wraparound event?

The system was recently dump/restored from a different box. The
failing rows are all new inserts since the restore.

--
Andrew (irc:RhodiumToad)

Re: BUG #4913: Row missing from primary key index

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>  Tom> One thing that seems odd is that the xids are kinda small.  Did
>  Tom> the system just recently have a wraparound event?

> The system was recently dump/restored from a different box. The
> failing rows are all new inserts since the restore.

So the table has been insert-only so far?  I was just thinking that HOT
bugs seemed like a probable explanation, but that idea goes out the
window if there have been no UPDATEs.

            regards, tom lane

Re: BUG #4913: Row missing from primary key index

From
Alvaro Herrera
Date:
Mathieu De Zutter wrote:

> I have a table log_event with a primary key on an integer 'id', called
> log_event_pkey.
>
> The tables contains a duplicate for id = 15723018. The duplicate (note that
> besides the id, all data differs) doesn't seem to be known by the index at
> all.

What do the INSERT lines look like?  Is it a trigger, an insert called
directly by the application?  How is the sequence involved -- lastval(),
nextval(), does the code just leave the column out for the default to fire?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: BUG #4913: Row missing from primary key index

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> The system was recently dump/restored from a different box. The
 >> failing rows are all new inserts since the restore.

 Tom> So the table has been insert-only so far?  I was just thinking
 Tom> that HOT bugs seemed like a probable explanation, but that idea
 Tom> goes out the window if there have been no UPDATEs.

No UPDATEs (and there are no HOT flags set on any tuple I looked at).
There may have been DELETEs.

--
Andrew (irc:RhodiumToad)

Re: BUG #4913: Row missing from primary key index

From
Mathieu De Zutter
Date:
On Fri, Jul 10, 2009 at 2:02 AM, Alvaro Herrera
<alvherre@commandprompt.com>wrote:

> What do the INSERT lines look like?  Is it a trigger, an insert called
> directly by the application?  How is the sequence involved -- lastval(),
> nextval(), does the code just leave the column out for the default to fire?
>
>

* This is the INSERT query, called from PHP/Apache

INSERT INTO log_event (user_id, ip, action_id, object1_id, object2_id,
event_timestamp)
VALUES ($1, $2, $3, $4, $5, NOW())

So the timestamp is generated by NOW(), and the pkey is generated by using
the default value (nextval).

* There are no UPDATE queries.

* There a few DELETE queries that are lauched daily (called from PHP/cron)

DELETE FROM log_event WHERE action_id = $1 AND event_timestamp < NOW() - '1
month'::interval

(some events are not worthy of archiving individually, we just keep totals)


Regards,
Mathieu

Re: BUG #4913: Row missing from primary key index

From
Tom Lane
Date:
Mathieu De Zutter <mathieu@dezutter.org> writes:
> On Fri, Jul 10, 2009 at 2:02 AM, Alvaro Herrera
> <alvherre@commandprompt.com>wrote:
>> What do the INSERT lines look like?

> * This is the INSERT query, called from PHP/Apache

> INSERT INTO log_event (user_id, ip, action_id, object1_id, object2_id,
> event_timestamp)
> VALUES ($1, $2, $3, $4, $5, NOW())

> So the timestamp is generated by NOW(), and the pkey is generated by using
> the default value (nextval).

The timestamps are invariably generated by NOW()?  That seems impossible
given that Andrew's report shows different timestamps for the same
insert XID.  But actually, you have to believe about six impossible
things in order to credit that Andrew's report represents the behavior
of a normally-functioning Postgres system.

What I am thinking is that we are looking at the results of multiple
disconnected backends hacking on the same database files --- ie, there's
more than one active shared memory block.  That would explain the
apparent re-use of XIDs (they're being handed out separately in each
shmem block) as well as disappearance of index entries (a write from
either shmem block would wipe out recent changes in that page from the
other one).  This could be checked by looking at the output of "ipcs -m"
(run this as root to be sure you get everything).

            regards, tom lane

Re: BUG #4913: Row missing from primary key index

From
Mathieu De Zutter
Date:
On Fri, Jul 10, 2009 at 5:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>   This could be checked by looking at the output of "ipcs -m"
> (run this as root to be sure you get everything).
>


 $ sudo  ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

0x0052e2c1 1081344    postgres  600        30384128   21

0x00000000 425985     postgres  600        18440      34

0x00000000 458754     postgres  600        16384      34

0x00000000 491523     postgres  600        768        34

0x00000000 524292     postgres  600        12         34


Note that I stopped/started postgresql last night.

Regards,
Mathieu

Re: BUG #4913: Row missing from primary key index

From
Tom Lane
Date:
Mathieu De Zutter <mathieu@dezutter.org> writes:
>  $ sudo  ipcs -m
> ------ Shared Memory Segments --------
> key        shmid      owner      perms      bytes      nattch     status
> 0x0052e2c1 1081344    postgres  600        30384128   21
> 0x00000000 425985     postgres  600        18440      34
> 0x00000000 458754     postgres  600        16384      34
> 0x00000000 491523     postgres  600        768        34
> 0x00000000 524292     postgres  600        12         34

Hmm, I wonder what those last four are?  Surely they are too small
to be Postgres shmem segments, but if they're something else like X11
segments then why do we only see ones belong to user postgres?  I think
it's probably irrelevant to the problem, but just wondering ...

> Note that I stopped/started postgresql last night.

The evidence may be gone then.  I still think that disconnected backends
are the most plausible explanation, but you might have killed everything
off.

As far as anyone knows, it's impossible to get into the
disconnected-backends state unless (a) you manually remove the
postmaster.pid file that provides the interlock against it, or
(b) you're trying to run multiple copies of Postgres on different
machines that're sharing an NFS or SAN mount of the same database files.
I assume you'd have mentioned it if you were trying (b), but I wonder
whether you have an init script that takes it on itself to remove the
postmaster.pid file?  It's generally considered irresponsible for an
init script to do that, but I'm not sure every distro has gotten the word.

            regards, tom lane

Re: BUG #4913: Row missing from primary key index

From
Mathieu De Zutter
Date:
On Fri, Jul 10, 2009 at 6:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> As far as anyone knows, it's impossible to get into the
> disconnected-backends state unless (a) you manually remove the
> postmaster.pid file that provides the interlock against it, or
> (b) you're trying to run multiple copies of Postgres on different
> machines that're sharing an NFS or SAN mount of the same database files.
> I assume you'd have mentioned it if you were trying (b), but I wonder
> whether you have an init script that takes it on itself to remove the
> postmaster.pid file?  It's generally considered irresponsible for an
> init script to do that, but I'm not sure every distro has gotten the word.
>

(a) I can't remember doing that, but who knows I got frustrated at some
point in time and took a wrong shortcut.
(b) I've only got local RAID storage.

I've checked the debian init script, and the only thing related to the pid
file I found was creating the underlying directory /var/run/postgresql

Regards,
Mathieu