Possible data corruption - Mailing list pgsql-bugs

From Martijn Meijer
Subject Possible data corruption
Date
Msg-id CAOVf_d=hscUrGjB=yy-GvKgqOJqqcuwmrqg7QUzXx4W21puLeQ@mail.gmail.com
Responses Re: Possible data corruption  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-bugs
Hi all,

I'm having weird issues with my Postgres installation, possibly data
corruption. The last backup I have is from a day and a half ago, so
ideally I'd like to recover the data as it is now.

I have made a full file system-level copy of the related data, as
instructed at https://wiki.postgresql.org/wiki/Corruption .

I was told I should provide the following:


A description of what you are trying to achieve and what results you
expect.:

Any query on certain tables fails, including simple ones like: select count(*)
from contracts;

Gives:

ERROR:  could not access status of transaction 552079857
DETAIL:  Could not open file "pg_multixact/members/60D4": No such file or
directory.


PostgreSQL version number you are running:

9.3.4. This was the initial version installed on this machine, but the data was
previously on a different machine. A normal export (i.e., no -Fc or similar
was passed to pg_dump) was imported.


How you installed PostgreSQL:

Added the PostgreSQL apt servers to the apt sources, then installed with apt-get install


Changes made to the settings in the postgresql.conf file:

max_connections = 500
superuser_reserved_connections = 3
shared_buffers = 512MB
temp_buffers = 8MB
work_mem = 5MB
maintenance_work_mem = 16MB
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
seq_page_cost = 1.0
random_page_cost = 1.2
effective_cache_size = 1536MB
log_min_messages = info
log_min_duration_statement = 5000
log_checkpoints = on
autovacuum_naptime = 5min


Operating system and version:

Ubuntu Lucid (10.04) 64 bit


What program you're using to connect to PostgreSQL:

psql on the command line


Is there anything relevant or unusual in the PostgreSQL server logs?:

The messages about "Could not open file" started appearing last night at
19:00. I don't see any other relevant messages.


The EXACT TEXT of the error message you're getting, if there is one:

ERROR:  could not access status of transaction 552079857
DETAIL:  Could not open file "pg_multixact/members/60D4": No such file or
directory.
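
As a rough reading aid for this error, here is a sketch of how the number in the message would map to a segment file under pg_multixact/offsets. It assumes the number is a multixact ID and that the server uses the default 8 kB block size (2048 offsets per page, 32 pages per segment file). The function name offsets_segment is hypothetical, and the constants are assumptions rather than values read from this installation. The file reported missing is under pg_multixact/members, which is indexed differently, so this covers only the offsets side.

```shell
#!/bin/sh
# Hypothetical helper: print the pg_multixact/offsets segment file name
# that would hold the entry for a given multixact ID.
# Assumption: default 8 kB blocks, i.e. 2048 offsets per page and
# 32 pages per segment, so 65536 entries per segment file.
offsets_segment() {
    multi=$1
    # Segment files are named with four uppercase hex digits.
    printf '%04X\n' $(( multi / 65536 ))
}

# Example: the ID from the error message above.
offsets_segment 552079857
```

Under those assumptions, the example prints 20E8.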


Hardware details:

CPU: 2x Intel(R) Xeon(R) CPU E5-2603
RAM: 16 GB
Storage: 2x INTEL SSDSC2BW48 in mdraid 1 (2 other Intel SSDs present)

$ modinfo raid1
filename:       /lib/modules/3.0.0-26-server/kernel/drivers/md/raid1.ko
alias:          md-level-1
alias:          md-raid1
alias:          md-personality-3
description:    RAID1 (mirroring) personality for MD
license:        GPL
srcversion:     2AAEFFAAADEDE0EDEE8D523
depends:
vermagic:       3.0.0-26-server SMP mod_unload modversions

fsync=off was never used.
We did resize a partition 2 weeks ago (following
https://raid.wiki.kernel.org/index.php/Growing ); it is the partition containing
the postgres files.


What I already tried:

- Restarted PostgreSQL
- fsck.ext4 -fr (returned no results)
- vacuum analyze; (returns with the same error)
- From the 9.3.5 release notes:

postgres=# WITH list(file) AS (SELECT * FROM
pg_ls_dir('pg_multixact/offsets'))
postgres-# SELECT EXISTS (SELECT * FROM list WHERE file = '0000') AND
postgres-#        NOT EXISTS (SELECT * FROM list WHERE file = '0001') AND
postgres-#        NOT EXISTS (SELECT * FROM list WHERE file = 'FFFF') AND
postgres-#        EXISTS (SELECT * FROM list WHERE file != '0000')
postgres-#        AS file_0000_removal_required;
 file_0000_removal_required
----------------------------
 f
(1 row)
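
The segment file named in the error can also be checked directly on disk. This is a minimal sketch, not a confirmed diagnostic procedure: check_segment is a hypothetical helper name, and the data directory in the commented example is the usual Debian/Ubuntu default, which is an assumption since the actual path is not stated in this report.

```shell
#!/bin/sh
# Hypothetical helper: report whether a segment file exists under a
# PostgreSQL data directory.
# Arguments: data directory, subdirectory, segment file name.
check_segment() {
    datadir=$1
    subdir=$2
    segment=$3
    if [ -f "$datadir/$subdir/$segment" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Example call (assumed default path; adjust to the real data directory):
# check_segment /var/lib/postgresql/9.3/main pg_multixact/members 60D4
```

Running it against the file system-level copy rather than the live data directory keeps the check read-only with respect to the running server.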


Thanks so much in advance for helping out!

Martijn Meijer
