Re: ERROR: MultiXactId xxxx has not been created yet -- apparent wraparound - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: ERROR: MultiXactId xxxx has not been created yet -- apparent wraparound
Date
Msg-id 20150526214109.GZ5885@postgresql.org
Whole thread Raw
In response to Re: ERROR: MultiXactId xxxx has not been created yet -- apparent wraparound  (Paul Smith <paul@pscs.co.uk>)
List pgsql-hackers
Paul Smith wrote:

> In the backup, for the row which will become broken, I get:
> 
>     ctid    |   xmin   |   xmax
> ------------+----------+----------
>  (18722,29) | 23862661 | 23862661

Okay.

> Assuming it hadn't changed in the broken version (from application
> knowledge, it's unlikely to change), then on the broken version,
> select heap_page_items(get_raw_page('messages.msgdata', 18722)); gives:
> 
>  (21,3112,1,812,1332175474,894779760,223823413,"(1443526721,25142)",12141,18516,57,,)
>
(23,4448,1,352,728903474,1466642284,1869042033,"(1262568257,16743)",16751,26455,65,111101101000001011010010100000101010001011110110100000101101001010000010100000000000000000000000000000001010000001110110110000001000000001001000101000010110111000000000000000001100000111100010000000000000000010110010101100101101100000000000110010110010110011000000000000000000000000000000000000000000000000000000000000000000000010100001,)
>
(24,4184,1,264,619804,228,8000,"(956969542,26694)",22857,29251,98,110001101100001001100110011001100001110010000010001000100000111011110100110100101001011000101100100101100110001001000010101011100101111001010110111100101100001011000110101000101101011000011100010101100100110010011100110100101001101001100110110001101000101000110110100101100100001010011110000010100011011001000010101011001111001010100110010010100111011000101110010010101111010000011010110101101000001000101100101100101000101000001110000110101110001000001100110001100001001001010010000010100101001000011110010010100110110010000110101100000101000011100110110010100011011010110110100110101111010000100110,)
>
(26,3928,1,252,1212837441,1232350037,1668772674,"(728910135,11093)",31284,16687,70,1111011010100010100010100100110011001100100000101101001001110110000101101110011001100110101101101001110010011110010010101000110011110100000100100010111011010100011010100001001010110110101100100011001000101010111011000000111011110100110011001011000001010000101101101111010010110110100001101000001010100010011011100100101010110110001100100101011001001110001101101100011001100110,1717791858)
>  (29,2296,1,812,1229271346,1934308693,1733443957,"(1848337707,21358)",31337,21592,88,,1178686785)
>  (30,1944,1,352,947155032,1380218998,792031842,"(1211650927,22579)",30585,20532,80,,)
>  (31,1560,1,380,23935627,23935627,1,"(18722,31)",40976,9510,24,,)
> (31 rows)
> 
> 
> The data for index 29 has 1934308693 in the header (which is the multixactid
> reported in the error message when trying to retrieve that row).

Right.  The ctids on items 21, 23, 24, 26,29 and 30 are pretty obviously
corrupted (page numbers are too large, and offset numbers are way above
normal offset numbers), as is probaly everything else about them.  Note
xmin should be a valid xid, but for example on item 21 it's 133 million
which obviously is not a valid xid on your system.  Items 26 and 29 even
have an OID, which the others do not (indicating wrong HEAP_HASOID
flag).

Note the lp_off fields.  It seems that all valid items have offsets
above 4096, and corrupted items have offsets below that.  The theory
that somehow the system wrote a bad filesystem page (4096 bytes) on the
lower half of the Postgres page (8192 bytes) sounds rather more probable
now.

> I'm struggling to find the definitions for the data returned by
> heap_page_items,

Try \df+ heap_page_items; or more easily, use this query instead:
select * from heap_page_items(get_raw_page('messages.msgdata', 18722)); gives:

> If I look at the raw page data, it contains data relating to this
> installation, so it doesn't look like a page from a different Hyper-V guest.
> It also doesn't look like just random data from a memory error. It actually
> looks like legitimate data which should possibly have been written
> elsewhere. (I don't want to post it here because it contains possibly
> confidential data from one of our customers).

Sure.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Redesigning checkpoint_segments
Next
From: Martín Marqués
Date:
Subject: Re: about lob(idea)