Hi,
On 2014-06-04 16:59:10 +0200, Alexey Klyukin wrote:
> I've recently discovered a peculiar problem on one of our big databases
> (more than 1TB). The database has been upgraded from 9.2 to 9.3.4 (using
> hardlinks to speedup the process) on April 7th around 22:00 local time.
> When doing vacuum on any table, the system fails with the following error:
>
> ERROR: could not access status of transaction 2035987419
> DETAIL: Could not open file "pg_multixact/offsets/795A": No such file or
> directory.
>
> The erorr doesn't depend on the table being vacuumed, or even database, i.e:
>
> postgres=# create database x;
> CREATE DATABASE
> postgres=# \c x
> You are now connected to database "x" as user "postgres".
> x=# create table test();
> CREATE TABLE
> x=# vacuum test;
> ERROR: could not access status of transaction 2035987419
> DETAIL: Could not open file "pg_multixact/offsets/795A": No such file or
> directory.
>
> The content of pg_multixact/offsets is:
>
> pg_multixact$ ls -lR
> ./members:
> -rw------- 1 postgres postgres 8192 Apr 16 18:20 0000
> ./offsets:
> -rw------- 1 postgres postgres 8192 Apr 7 22:51 0000
> -rw------- 1 postgres postgres 262144 Apr 16 18:20 79A6
>
> the select mutlixact from pg_database gives me:
> and the 2035987419 = 0x795AB3DB belongs to 795A segment.
> The 0000 file just contains a single page of all zeroes. Neither the 9.3.4
> replica of this database, nor the original 9.2 cluster data directory
> contain this file.
Looks like you're hitting the issue described in
http://archives.postgresql.org/message-id/20140530121631.GE25431%40alap3.anarazel.de
> I'm tempted to just remove the 0000 file from master and restart the
> database, since it's effectively impossible to run vacuum now, but I'd like
> to understand what's happening first. Below is the result of pg_filedump
> for the master:
Yes, that's fine in this specific case.
Note that the 0000 segment isn't yused by anything between the oldest
and newest multixact:
> Latest checkpoint's NextMultiXactId: 2040987419
> Latest checkpoint's NextMultiOffset: 3
> Latest checkpoint's oldestXID: 1038291920
> Latest checkpoint's oldestXID's DB: 16415
> Latest checkpoint's oldestActiveXID: 1655189767
> Latest checkpoint's oldestMultiXid: 2040987417
> Latest checkpoint's oldestMulti's DB: 0
Greetings,
Andres Freund