Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 - Mailing list pgsql-general

From Robert Haas
Subject Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date
Msg-id CA+TgmoagLrh8cy7adUG-08OjRK6tAtiJpx7TaRqPbCriWAQ0MQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-general
On Thu, May 28, 2015 at 8:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Steve Kehlet wrote:
>>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>>> just dropped new binaries in place) but it wouldn't start up. I found this
>>> in the logs:
>>>
>>> waiting for server to start....2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>>  database system was shut down at 2015-05-27 13:12:55 PDT
>>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
>>> starting up
>>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
>>> transaction 1
>>
>> I am debugging today a problem currently that looks very similar to
>> this.  AFAICT the problem is that WAL replay of an online checkpoint in
>> which multixact files are removed fails because replay tries to read a
>> file that has already been removed.
>
> Wait a minute, wait a minute.  There's a serious problem with this
> theory, at least in Steve's scenario.  This message:
>
> 2015-05-27 13:13:00 PDT [27341]: [1-1] LOG: database system was shut
> down at 2015-05-27
>
> That message implies a *clean shutdown*.  If he had performed an
> immediate shutdown or just pulled the plug, it would have said
> "database system was interrupted" or some such.
>
> There may be bugs in redo, also, but they don't explain what happened to Steve.
>
> Steve, is there any chance we can get your pg_controldata output and a
> list of all the files in pg_clog?

Err, make that pg_multixact/members, which I assume is at issue here.
You didn't show us the DETAIL line from this message, which would
presumably clarify:

FATAL:  could not access status of transaction 1

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-general by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1