Re: 9.3.9 and pg_multixact corruption - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: 9.3.9 and pg_multixact corruption
Date
Msg-id CAEepm=2xW7L-etbyEVEOspDSwDFmSvwR_=EO_duE2USNFvO5xA@mail.gmail.com
In response to Re: 9.3.9 and pg_multixact corruption  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: 9.3.9 and pg_multixact corruption  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: 9.3.9 and pg_multixact corruption  (Andreas Seltenreich <andreas.seltenreich@credativ.de>)
List pgsql-hackers
On Fri, Sep 11, 2015 at 10:45 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Bernd Helmle wrote:
> > A customer had a severe issue with a PostgreSQL 9.3.9/sparc64/Solaris 11
> > instance.
> >
> > The database crashed with the following log messages:
> >
> > 2015-09-08 00:49:16 CEST [2912] PANIC:  could not access status of
> > transaction 1068235595
> > 2015-09-08 00:49:16 CEST [2912] DETAIL:  Could not open file
> > "pg_multixact/members/FFFF5FC4": No such file or directory.
> > 2015-09-08 00:49:16 CEST [2912] STATEMENT:  delete from StockTransfer
> > where oid = $1 and tanum = $2
>
> I wonder if these bogus page and offset numbers are just
> SlruReportIOError being confused because pg_multixact/members is so
> weird (I don't think it should be the case, since this stuff is using
> page numbers only, not anything related to how each page is laid out).

But SlruReportIOError builds the filename with the same macro as SlruReadPhysicalPage and other functions, namely SlruFileName, which uses sprintf with %04X (unsigned integer, uppercase hex) and passes it segno (which is always an int). So I don't think the problem is only in the error reporting.

Assuming default block size, to get FFFF5FC4 from SlruFileName you need segno == -41020.
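That correspondence can be checked with a minimal C sketch. The helper name slru_segment_name is hypothetical: it only mimics the %04X formatting described above and is not the real SlruFileName macro. It also assumes a 32-bit int.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not PostgreSQL code): format a segment number
 * with %04X, as the text says SlruFileName does.  The cast makes the
 * reinterpretation of a negative int as unsigned explicit.
 */
static void
slru_segment_name(char *buf, size_t len, int segno)
{
    snprintf(buf, len, "%04X", (unsigned int) segno);
}
```

With a 32-bit int, calling slru_segment_name with segno = -41020 fills the buffer with "FFFF5FC4", since 0x100000000 - 0xFFFF5FC4 = 0xA03C = 41020.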

We have int segno = pageno / 32 (that's SLRU_PAGES_PER_SEGMENT), and C integer division truncates toward zero, so to get segno == -41020 you need pageno between -1312671 and -1312640 (whose bit patterns reinterpreted as unsigned are 4293654625 and 4293654656).
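The page-to-segment step can be sketched the same way. The function names segno_for_page and page_as_unsigned are hypothetical, introduced only for illustration, and a 32-bit int is assumed.

```c
#include <stdint.h>

#define PAGES_PER_SEGMENT 32    /* SLRU_PAGES_PER_SEGMENT, as quoted above */

/*
 * Hypothetical helper: segment number for a page, using plain int
 * division as in the text.  In C99 and later, '/' truncates toward
 * zero when an operand is negative.
 */
static int
segno_for_page(int pageno)
{
    return pageno / PAGES_PER_SEGMENT;
}

/* Hypothetical helper: the same page number viewed as a 32-bit unsigned value. */
static uint32_t
page_as_unsigned(int pageno)
{
    return (uint32_t) pageno;
}
```

For example, segno_for_page(-1312640) yields -41020, and page_as_unsigned(-1312640) yields 4293654656.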

In various places we have int pageno = offset / (uint32) 1636, expanded from this macro (which calls the offset an xid):

#define MXOffsetToMemberPage(xid) ((xid) / (TransactionId) MULTIXACT_MEMBERS_PER_PAGE)

I don't really see how any uint32 value could produce such a pageno via that macro.  Even if it were called in a context where (xid) is accidentally an int, the int / unsigned expression would convert it to unsigned first (unless (xid) is a wider type like int64_t: by the usual arithmetic conversions you'd get signed division in that case, hmm...).  But it's always called with a MultiXactOffset, AKA uint32, variable.
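To make the conversion argument concrete, here is a small sketch of the two cases. The function names are hypothetical and MEMBERS_PER_PAGE just reuses the 1636 figure quoted above; it assumes int is 32 bits wide.

```c
#include <stdint.h>

#define MEMBERS_PER_PAGE 1636   /* value quoted above for the default block size */

/*
 * int dividend: the usual arithmetic conversions turn it into uint32_t
 * before the division, so the quotient can never be negative.
 */
static int64_t
page_from_int(int offset)
{
    return (int64_t) (offset / (uint32_t) MEMBERS_PER_PAGE);
}

/*
 * int64_t dividend: here the uint32_t divisor is converted to int64_t
 * instead, so the division is signed and can produce a negative page.
 */
static int64_t
page_from_int64(int64_t offset)
{
    return offset / (uint32_t) MEMBERS_PER_PAGE;
}
```

Under those assumptions, page_from_int(-1) gives 2625285 (the offset is treated as 4294967295), while page_from_int64(-1636) gives -1.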

So via that route, every MultiXactOffset value maps to a segment in the range "0000" to "14078".  Famously, it wraps after that.
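The upper bound follows from chaining the two divisions. This is only a sketch: segno_for_offset is a hypothetical name, and the constants are the default-block-size values quoted earlier in this message.

```c
#include <stdint.h>

#define MEMBERS_PER_PAGE  1636  /* MULTIXACT_MEMBERS_PER_PAGE at default block size */
#define PAGES_PER_SEGMENT 32    /* SLRU_PAGES_PER_SEGMENT */

/*
 * Hypothetical helper: map a 32-bit member offset to its segment number
 * using the same two divisions described in the text.
 */
static int
segno_for_offset(uint32_t offset)
{
    int pageno = offset / (uint32_t) MEMBERS_PER_PAGE;

    return pageno / PAGES_PER_SEGMENT;
}
```

For the largest offset, UINT32_MAX, this gives page 2625285 and segment 82040, which is 0x14078 in hex.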

Maybe the negative pageno came from somewhere else.  Where?  Inside SLRU code we can see pageno = shared->page_number[slotno]... maybe the SLRU slots got corrupted somehow?
