Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 - Mailing list pgsql-hackers

Steve Kehlet wrote:
> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
> just dropped new binaries in place) but it wouldn't start up. I found this
> in the logs:
>
> waiting for server to start....2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>  database system was shut down at 2015-05-27 13:12:55 PDT
> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
> starting up
> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
> transaction 1

I am currently debugging a problem that looks very similar to
this.  AFAICT the problem is that WAL replay of an online checkpoint in
which multixact files are removed fails because replay tries to read a
file that has already been removed.

(I was nervous about removing the check that skipped reading pg_multixact
files while in recovery.  Looks like my hunch was right, though the
actual problem is not what I was fearing.)

I think the fix to this is to verify whether the file exists on disk
before reading it; if it doesn't, assume the truncation has already
happened and that it's not necessary to remove it.
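
To illustrate the idea only (this is not the attached patch, and not
PostgreSQL's actual SLRU code), here is a minimal standalone sketch.  The
names replay_truncation and SegResult are hypothetical, and the "read" is
a stand-in for looking up the transaction status.  It treats ENOENT on
open as "already truncated" rather than as a fatal error:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
typedef enum
{
    SEG_ALREADY_GONE,   /* file missing: assume truncation already happened */
    SEG_REMOVED,        /* file was present, read, and then removed */
    SEG_ERROR           /* any other failure: still worth reporting */
} SegResult;

/*
 * Replay the truncation of one segment file.  Only a missing file
 * (ENOENT) is tolerated; every other error is passed back to the caller.
 */
static SegResult
replay_truncation(const char *path)
{
    FILE   *fp;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return (errno == ENOENT) ? SEG_ALREADY_GONE : SEG_ERROR;

    /* Stand-in for reading the status out of the segment. */
    (void) fgetc(fp);
    fclose(fp);

    if (remove(path) != 0)
        return SEG_ERROR;

    return SEG_REMOVED;
}
```

Opening the file and checking errno, instead of testing for existence
first and opening afterwards, avoids a window in which the file could
vanish between the two steps.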

> I found [this report from a couple days ago](
> https://bugs.archlinux.org/task/45071) from someone else that looks like
> the same problem.

Right :-(

I think a patch like this should be able to fix it ... not tested yet.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment
