Re: TOAST table repeatedly corrupted - Mailing list pgsql-bugs

From Tom Lane
Subject Re: TOAST table repeatedly corrupted
Msg-id 15116.1525898831@sss.pgh.pa.us
In response to TOAST table repeatedly corrupted  (Niles Oien <noien@nso.edu>)
Responses Re: TOAST table repeatedly corrupted  (Niles Oien <noien@nso.edu>)
List pgsql-bugs

Niles Oien <noien@nso.edu> writes:
> I am running a reasonably recent version of postgres :
>  PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (Red Hat 4.4.7-17), 64-bit

As David said, that's not terribly recent.  If you are going to upgrade,
I'd suggest waiting till tomorrow and grabbing 9.5.13, because we fixed
a pretty serious TOAST data corruption bug in this week's batch of
releases.  The expected symptoms of it don't match what you're seeing,
unfortunately, but nonetheless you ought to be using the latest, just
in case this is an already-fixed issue.

> 2018-05-09 16:14:03.834 GMT,,,27018,,5af31e4b.698a,1,,2018-05-09 16:14:03
> GMT,12/611211,0,ERROR,XX001,"invalid page in block 1374551 of relation
> base/16384/36298640",,,,,"automatic vacuum of table
> ""data.pg_toast.pg_toast_36298637""",,,,""

Block 1374551 would be well past the first segment of the file, since
in a standard build (1GB segments, 8K blocks) there are only 131072
pages per segment.  This explains why you didn't see any complaints
from pg_filedump, if you only ran it over the first segment.

If you've not clobbered the DB yet, file 36298640.10 would be what
to look at, I believe.
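
The segment arithmetic above can be sketched as follows. This is only an illustration assuming the default 8K block size and 1GB segment size mentioned above; the helper names `locate_block` and `segment_file_name` are made up for this example and are not part of any PostgreSQL tool.

```python
BLOCK_SIZE = 8192        # assumption: default block size (8K)
SEGMENT_SIZE = 1 << 30   # assumption: default segment size (1GB)


def locate_block(block, block_size=BLOCK_SIZE, segment_size=SEGMENT_SIZE):
    """Return (segment number, block within segment, byte offset within segment)."""
    blocks_per_segment = segment_size // block_size  # 131072 with the defaults
    segment, block_in_segment = divmod(block, blocks_per_segment)
    return segment, block_in_segment, block_in_segment * block_size


def segment_file_name(relfilenode, segment):
    """Segment 0 has no suffix; later segments are named <relfilenode>.<n>."""
    return str(relfilenode) if segment == 0 else f"{relfilenode}.{segment}"


if __name__ == "__main__":
    # Block number and relfilenode taken from the error message quoted above.
    seg, blk, off = locate_block(1374551)
    print(segment_file_name(36298640, seg), blk, off)
```

With the defaults this prints `36298640.10 63831 522903552`, that is, the complained-of page should be block 63831 within the eleventh segment file.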

> And sure enough, I now cannot dump that table :
> pg_dump: Error message from server: ERROR:  compressed data is corrupted

That's interesting, because it seems to indicate an independent problem.
The "invalid page" error indicates a bad page header, or possibly a
page checksum failure; either way the page would not have been allowed
into the buffer pool.  But "compressed data is corrupted" implies that
we did read a page but the data in it seems messed up.  So this evidence
says you have at least two different corrupted places in that table.
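
To make the "bad page header" case concrete, here is a rough approximation of the kind of header sanity test that leads to "invalid page". It is a sketch only: it assumes the 24-byte page header layout of a standard little-endian 64-bit build with 8K blocks, the function name `header_looks_sane` is hypothetical, and it does not attempt to verify page checksums.

```python
import struct

BLCKSZ = 8192                # assumption: default block size
MAXIMUM_ALIGNOF = 8          # assumption: typical 64-bit alignment
PD_VALID_FLAG_BITS = 0x0007  # assumption: all flag bits defined in this era

# pd_lsn (two uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (all uint16), pd_prune_xid (uint32)
HEADER_FMT = "<IIHHHHHHI"


def header_looks_sane(page):
    """Approximate the header checks applied when a page is read in."""
    if len(page) != BLCKSZ:
        raise ValueError("expected exactly one block")
    fields = struct.unpack_from(HEADER_FMT, page)
    flags, lower, upper, special = fields[3], fields[4], fields[5], fields[6]
    if upper == 0:
        # A never-initialized page is acceptable only if it is all zeroes.
        return not any(page)
    return (
        (flags & ~PD_VALID_FLAG_BITS) == 0
        and lower <= upper <= special <= BLCKSZ
        and special % MAXIMUM_ALIGNOF == 0
    )
```

A page that fails a test like this is rejected before it enters the buffer pool, whereas a page that passes can still carry damaged tuple data, which is consistent with seeing the second, different error.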

Do you have checksums enabled in this installation?  If you're going
to have to rebuild it, you should probably turn those on (use
initdb --data-checksums), in hopes of narrowing down what's happening.  

> I think this is probably a bug? Every time it happens
> it affects the same table, hmi.rdvtrack_fd05.

That's mighty suggestive all right, but unfortunately it doesn't
do much to narrow down the problem :-(

            regards, tom lane

