Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related - Mailing list pgsql-bugs

From Andrew Gierth
Subject Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date
Msg-id 874ot7q7kd.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to BUG #4929: Corrupted pg_class, possibly truncate/rollback related  ("Robert Treat" <xzilla@users.sourceforge.net>)
List pgsql-bugs
>>>>> "Robert" == "Robert Treat" <xzilla@users.sourceforge.net> writes:

 Robert> Bug reference:      4929
 Robert> Logged by:          Robert Treat
 Robert> Email address:      xzilla@users.sourceforge.net
 Robert> PostgreSQL version: 8.3.1

pe2=# select xmin, xmax, cmin, cmax, ctid, oid, relnamespace, relname,
 reltype, relowner, relfilenode, relpages, reltuples from pg_class  where oid = 23708;
     xmin    |    xmax    | cmin | cmax |   ctid   |  oid  | relnamespace | relname  | reltype | relowner | relfilenode
|relpages |  reltuples   

------------+------------+------+------+----------+-------+--------------+----------+---------+----------+-------------+----------+-------------
  3291061347 |          0 |    6 |    6 | (1118,2) | 23708 |        23681 | prooln_m |   23710 |       10 |      654963
|  114055 | 7.42746e+06 
           2 | 3291061347 |    6 |    6 | (23,39)  | 23708 |        23681 | prooln_m |   23710 |       10 |      181519
|  104401 |  6.5017e+06 

 Robert> So, clearly this is bad. It seems we've gotten some level of
 Robert> corruption on disk. The most perculiar bits of information
 Robert> around this system are that we tend to have long running
 Robert> vacuum jobs (multiple days), and we recently did a truncate +
 Robert> rollback within a transaction on the table in question. I've
 Robert> also noticed that the file on disk for the 181519 row is not
 Robert> actually there. Oh, and this does run on lvm, though we
 Robert> haven't used the lvm feature set for a long time. So, worth
 Robert> investigating?

I did some analysis on this at Robert's request on IRC. Here are hexdumps
of the offending tuples:

(1118,2):

00001e60  63 9c 29 c4 00 00 00 00  06 00 00 00 00 00 5e 04  |c.)...........^.|
00001e70  02 00 1b 00 0b 29 20 ff  ff ff 03 00 9c 5c 00 00  |.....) ......\..|
00001e80  70 72 6f 6f 6c 6e 5f 6d  00 00 00 00 00 00 00 00  |prooln_m........|
00001e90  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001ec0  81 5c 00 00 9e 5c 00 00  0a 00 00 00 00 00 00 00  |.\...\..........|
00001ed0  73 fe 09 00 00 00 00 00  87 bd 01 00 02 ab e2 4a  |s..............J|
00001ee0  00 00 00 00 00 00 00 00  01 00 72 00 0f 00 00 00  |..........r.....|
00001ef0  00 00 00 00 00 00 00 00  00 01 00 00 42 9c 29 c4  |............B.).|
00001f00  5b 01 00 00 00 00 00 00  00 09 04 00 00 02 00 00  |[...............|
00001f10  00 01 00 00 00 0a 00 00  00 0a 00 00 00 6f 00 00  |.............o..|
00001f20  00 15 5a 02 00 0a 00 00  00 02 00 00 00 00 00 00  |..Z.............|

Everything above looks about as we expect.

(1118,1): this is an earlier version of the tuple, correctly marked dead,
presumably resulting from the truncate/rollback referred to above:

00001f30  19 55 23 c4 00 00 00 00  06 00 00 00 00 00 5e 04  |.U#...........^.|
00001f40  01 00 1b 00 0b 2a 20 ff  ff ff 03 00 9c 5c 00 00  |.....* ......\..|
00001f50  70 72 6f 6f 6c 6e 5f 6d  00 00 00 00 00 00 00 00  |prooln_m........|
00001f60  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001f90  81 5c 00 00 9e 5c 00 00  0a 00 00 00 00 00 00 00  |.\...\..........|
00001fa0  72 fe 09 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
00001fb0  00 00 00 00 00 00 00 00  01 00 72 00 0f 00 00 00  |..........r.....|
00001fc0  00 00 00 00 00 00 00 00  00 01 00 00 19 55 23 c4  |.............U#.|
00001fd0  5b 01 00 00 00 00 00 00  00 09 04 00 00 02 00 00  |[...............|
00001fe0  00 01 00 00 00 0a 00 00  00 0a 00 00 00 6f 00 00  |.............o..|
00001ff0  00 15 5a 02 00 0a 00 00  00 02 00 00 00 00 00 00  |..Z.............|

Again, no obvious surprises.

This is (23,39) which is the broken one; note HEAP_XMAX_INVALID is
set, despite the fact that this is a transaction that committed (as
evidenced by the removal of the old relfilenode) so the row is showing
up incorrectly to queries. Note also that HEAP_ONLY_TUPLE is set (but
there are no other versions of this tuple on page 23).

00000a20  02 00 00 00 63 9c 29 c4  06 00 00 00 00 00 5e 04  |....c.).......^.|
00000a30  02 00 1b 80 0b 29 20 ff  ff ff 03 00 9c 5c 00 00  |.....) ......\..|
00000a40  70 72 6f 6f 6c 6e 5f 6d  00 00 00 00 00 00 00 00  |prooln_m........|
00000a50  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000a80  81 5c 00 00 9e 5c 00 00  0a 00 00 00 00 00 00 00  |.\...\..........|
00000a90  0f c5 02 00 00 00 00 00  d1 97 01 00 8a 6a c6 4a  |.............j.J|
00000aa0  00 00 00 00 00 00 00 00  01 00 72 00 0f 00 00 00  |..........r.....|
00000ab0  00 00 00 00 00 00 00 00  00 01 00 00 c5 52 c5 b9  |.............R..|
00000ac0  5b 01 00 00 00 00 00 00  00 09 04 00 00 02 00 00  |[...............|
00000ad0  00 01 00 00 00 0a 00 00  00 0a 00 00 00 6f 00 00  |.............o..|
00000ae0  00 15 5a 02 00 0a 00 00  00 02 00 00 00 00 00 00  |..Z.............|


--
Andrew (irc:RhodiumToad)

pgsql-bugs by date:

Previous
From: "Robert Treat"
Date:
Subject: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Next
From: Tom Lane
Date:
Subject: Re: Bug 4906 -- Left join of subselect incorrect