Steve Kehlet wrote:
> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
> just dropped new binaries in place) but it wouldn't start up. I found this
> in the logs:
>
> waiting for server to start....2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
> database system was shut down at 2015-05-27 13:12:55 PDT
> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL: the database system is
> starting up
> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL: could not access status of
> transaction 1

I am debugging a problem today that looks very similar to this. AFAICT
the problem is that WAL replay of an online checkpoint in which
multixact files are removed fails because replay tries to read a file
that has already been removed.

(I was nervous about removing the check that skipped reading
pg_multixact files while in recovery. Looks like my hunch was right,
though the actual problem is not what I was fearing.)

I think the fix for this is to verify whether the file exists on disk
before reading it; if it doesn't, assume the truncation has already
happened and that it's not necessary to remove it.
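
To illustrate the idea only (this is not the actual patch): a minimal C
sketch, assuming a POSIX system. The names segment_file_exists,
decide_truncation and TruncateDecision are hypothetical and are not
PostgreSQL functions; real code would presumably go through the SLRU
layer rather than call access() directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical outcome type for this sketch. */
typedef enum
{
    TRUNCATE_SKIPPED,   /* file already gone: assume truncation happened */
    TRUNCATE_NEEDED     /* file still there: read it and truncate as usual */
} TruncateDecision;

/*
 * Hypothetical helper: report whether the segment file is present.
 * Only ENOENT counts as "missing"; any other failure is reported as
 * "exists" so that the later read surfaces the real error.
 */
static bool
segment_file_exists(const char *path)
{
    assert(path != NULL);

    if (access(path, F_OK) == 0)
        return true;

    return errno != ENOENT;
}

/*
 * Decide what replay should do with the oldest segment: skip the
 * truncation if the file has already been removed, otherwise proceed.
 */
static TruncateDecision
decide_truncation(const char *oldest_segment_path)
{
    if (!segment_file_exists(oldest_segment_path))
        return TRUNCATE_SKIPPED;

    return TRUNCATE_NEEDED;
}
```

The point of treating only ENOENT as "already truncated" is that a
permissions or I/O problem should still be reported rather than
silently skipped.
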
> I found [this report from a couple days ago](
> https://bugs.archlinux.org/task/45071) from someone else that looks like
> the same problem.
Right :-(
I think a patch like this should be able to fix it ... not tested yet.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services