Thread: TOAST table repeatedly corrupted

TOAST table repeatedly corrupted

From: Niles Oien

I have what I think is a bug in postgres. I'd like to help track it down, if that
is indeed what's going on.

I am running a reasonably recent version of postgres :

data=# select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
(1 row)

About every six months, the logs show that vacuuming of
one of our tables (the same table every time) starts to fail :

 grep ERROR postgresql-09.csv

2018-05-09 16:14:03.834 GMT,,,27018,,5af31e4b.698a,1,,2018-05-09 16:14:03 GMT,12/611211,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:15:03.865 GMT,,,27696,,5af31e87.6c30,1,,2018-05-09 16:15:03 GMT,12/611219,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:16:03.897 GMT,,,31666,,5af31ec3.7bb2,1,,2018-05-09 16:16:03 GMT,10/1330957,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
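
For reference, the file node in those errors can be mapped back to a relation with pg_filenode_relation() (available from 9.4 on; a tablespace of 0 means the database's default tablespace, and this assumes the relation hasn't been rewritten since the log entry) :

data=# select pg_filenode_relation(0, 36298640);
    pg_filenode_relation
----------------------------
 pg_toast.pg_toast_36298637
(1 row)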


So it looks like a TOAST table has been corrupted. I can see which
table the TOAST table is supporting :

data=# select relname from pg_class where reltoastrelid = 'pg_toast.pg_toast_36298637'::regclass;
    relname    
---------------
 rdvtrack_fd05
(1 row)


And sure enough, I now cannot dump that table :

linux> pg_dump -p5432 -Upostgres -t hmi.rdvtrack_fd05 data > hmi_rdvtrack_fd05.sql
pg_dump: Dumping the contents of table "rdvtrack_fd05" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  compressed data is corrupted
pg_dump: The command was: COPY hmi.rdvtrack_fd05 (recnum, sunum, slotnum, sessionid, sessionns, cparms_sg000, v_bzero, v_bscale, carrrot, cmlon, lonhg, lathg, loncm, midtime, crval1, crval2, crval3, t_start, t_stop, lonspan, coverage, zonalvel, mai, module, source, input, created, bld_vers, backgrnd, rejectlist, cmlon_index, loncm_index, lathg_index, lonhg_index, sg_000_file, sg_001_file, history, datavals, missvals, datamin, datamax, datamean, datarms, dataskew, datakurt, meanmu, meanlat) TO stdout;
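
If it helps narrow things down, the damaged rows can presumably be pinned down by forcing each row through detoasting one at a time and noting the ones that error out. A rough sketch (slow on a table this size; it leans on recnum being the primary key) :

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT recnum FROM hmi.rdvtrack_fd05 LOOP
        BEGIN
            -- casting the whole row to text forces every TOASTed column to be detoasted
            PERFORM length(t::text) FROM hmi.rdvtrack_fd05 t WHERE t.recnum = r.recnum;
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'detoast failed at recnum %: %', r.recnum, SQLERRM;
        END;
    END LOOP;
END $$;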


I think I can see the disk files where the tables are stored :

data=# select pg_relation_filepath('hmi.rdvtrack_fd05');
 pg_relation_filepath 
----------------------
 base/16384/36298637
(1 row)

data=# select pg_relation_filepath('pg_toast.pg_toast_36298637');
 pg_relation_filepath 
----------------------
 base/16384/36298640
(1 row)


I can run pg_filedump on those two files; in both cases all
the flags seem unremarkable (although I may not know
what to look for) and it runs to the end of the file, giving
me a message like :

*** End of File Encountered. Last Block Read: 131071 ***
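
For anyone wanting to reproduce, a typical invocation looks roughly like this (the path assumes the stock PGDG data directory, so adjust as needed; -f adds a formatted dump of each block) :

linux> pg_filedump -f /var/lib/pgsql/9.5/data/base/16384/36298640 | tail -n 3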

I think this is probably a bug? Every time it happens
it affects the same table, hmi.rdvtrack_fd05. The hardware
we're running on is fairly new, disks and memory are
not throwing errors, and we've never had problems with any
other tables. We're running on CentOS 6.8 - again, not brand
new but hardly decrepit, either. I've rebuilt the table
several times. This time I'd like to give the PG community
a chance to chase it down before I do that.

I've put the two data files, the postgres config and
essentially what I've said here online at :



If it helps, the TOAST table looks like this :

data=# \d pg_toast.pg_toast_36298637
TOAST table "pg_toast.pg_toast_36298637"
   Column   |  Type   
------------+---------
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea


And the table the TOAST table is supporting looks like this :

data=# \d hmi.rdvtrack_fd05
          Table "hmi.rdvtrack_fd05"
    Column    |       Type       | Modifiers 
--------------+------------------+-----------
 recnum       | bigint           | not null
 sunum        | bigint           | 
 slotnum      | integer          | 
 sessionid    | bigint           | 
 sessionns    | text             | 
 cparms_sg000 | text             | 
 v_bzero      | double precision | 
 v_bscale     | double precision | 
 carrrot      | integer          | 
 cmlon        | real             | 
 lonhg        | real             | 
 lathg        | real             | 
 loncm        | real             | 
 midtime      | double precision | 
 crval1       | real             | 
 crval2       | real             | 
 crval3       | double precision | 
 t_start      | double precision | 
 t_stop       | double precision | 
 lonspan      | real             | 
 coverage     | real             | 
 zonalvel     | real             | 
 mai          | real             | 
 module       | text             | 
 source       | text             | 
 input        | text             | 
 created      | double precision | 
 bld_vers     | text             | 
 backgrnd     | text             | 
 rejectlist   | text             | 
 cmlon_index  | bigint           | 
 loncm_index  | bigint           | 
 lathg_index  | bigint           | 
 lonhg_index  | bigint           | 
 sg_000_file  | text             | 
 sg_001_file  | text             | 
 history      | text             | 
 datavals     | integer          | 
 missvals     | integer          | 
 datamin      | real             | 
 datamax      | real             | 
 datamean     | double precision | 
 datarms      | double precision | 
 dataskew     | double precision | 
 datakurt     | double precision | 
 meanmu       | real             | 
 meanlat      | real             | 
Indexes:
    "rdvtrack_fd05_pkey" PRIMARY KEY, btree (recnum)
    "rdvtrack_fd05_carrrot" btree (carrrot)
    "rdvtrack_fd05_cmlon_index" btree (cmlon_index)
    "rdvtrack_fd05_lathg_index" btree (lathg_index)
    "rdvtrack_fd05_loncm_index" btree (loncm_index)
    "rdvtrack_fd05_lonhg_index" btree (lonhg_index)

Niles Oien, noien@nso.edu




--
Niles Oien, National Solar Observatory, Boulder Colorado USA

Re: TOAST table repeatedly corrupted

From: "David G. Johnston"

On Wednesday, May 9, 2018, Niles Oien <noien@nso.edu> wrote:

I have what I think is a bug in postgres. I'd like to help track it down, if that
is indeed what's going on.

I am running a reasonably recent version of postgres :

data=# select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
(1 row)


9.5.12 is reasonably recent.  9.5.5 is not.  Even if you found a live bug you'd have to upgrade to fix it, so you should just do that now and see if that works.  You can scan the release notes for 9.5.6 and newer to see whether any vacuum-related fixes seem to apply.
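
On CentOS 6 with the PGDG repo that's normally just a package update plus a restart; something like the following (package and service names assume the standard PGDG 9.5 packaging):

linux> sudo yum update postgresql95 postgresql95-server postgresql95-libs postgresql95-contrib
linux> sudo service postgresql-9.5 restart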

David J.

Re: TOAST table repeatedly corrupted

From: Tom Lane

Niles Oien <noien@nso.edu> writes:
> I am running a reasonably recent version of postgres :
>  PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (Red Hat 4.4.7-17), 64-bit

As David said, that's not terribly recent.  If you are going to upgrade,
I'd suggest waiting till tomorrow and grabbing 9.5.13, because we fixed
a pretty serious TOAST data corruption bug in this week's batch of
releases.  The expected symptoms of it don't match what you're seeing,
unfortunately, but nonetheless you ought to be using the latest, just
in case this is an already-fixed issue.

> 2018-05-09 16:14:03.834 GMT,,,27018,,5af31e4b.698a,1,,2018-05-09 16:14:03
> GMT,12/611211,0,ERROR,XX001,"invalid page in block 1374551 of relation
> base/16384/36298640",,,,,"automatic vacuum of table
> ""data.pg_toast.pg_toast_36298637""",,,,""

Block 1374551 would be well past the first segment of the file, since
in a standard build (1GB segments, 8K blocks) there are only 131072
pages per segment.  This explains why you didn't see any complaints
from pg_filedump, if you only ran it over the first segment.
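
A quick way to double-check which segment and in-segment block that works out to, assuming the default 8K blocks and 1GB segments :

data=# select 1374551 / 131072 as segment, 1374551 % 131072 as block_in_segment;
 segment | block_in_segment
---------+------------------
      10 |            63831
(1 row)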

If you've not clobbered the DB yet, file 36298640.10 would be what
to look at, I believe.
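
If pg_filedump is handy, that block can be pulled out of the segment file directly with the -R range option (the path again assumes the stock data directory) :

linux> pg_filedump -f -R 63831 63831 /var/lib/pgsql/9.5/data/base/16384/36298640.10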

> And sure enough, I now cannot dump that table :
> pg_dump: Error message from server: ERROR:  compressed data is corrupted

That's interesting, because it seems to indicate an independent problem.
The "invalid page" error indicates a bad page header, or possibly a
page checksum failure; either way the page would not have been allowed
into the buffer pool.  But "compressed data is corrupted" implies that
we did read a page but the data in it seems messed up.  So this evidence
says you have at least two different corrupted places in that table.

Do you have checksums enabled in this installation?  If you're going
to have to rebuild it, you should probably turn those on (use
initdb --data-checksums), in hopes of narrowing down what's happening.  
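
(For reference, whether checksums are already on is visible as the read-only data_checksums setting, which in this installation would presumably report off :)

data=# show data_checksums;
 data_checksums
----------------
 off
(1 row)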

> I think this is probably a bug? Every time it happens
> it affects the same table, hmi.rdvtrack_fd05.

That's mighty suggestive all right, but unfortunately it doesn't
do much to narrow down the problem :-(

            regards, tom lane


Re: TOAST table repeatedly corrupted

From: Niles Oien

Thanks. I don't have checksums on. I'll look into it on the next build.

The file 36298640.10 didn't show anything unusual under pg_filedump.

I'm betting we are suffering from some already-fixed TOAST issue, if not the one you mentioned from this week's releases. That's probably all the chasing that's worth doing here, given the dated nature of our production box. On our development box, where we have some room to move, we're running something a bit newer :

 PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit

Does that need an upgrade to get this week's TOAST fixes, too? I'm not sure if CentOS's 'yum upgrade' will pick it up - I have the repo pgdg10/7/x86_64 enabled; will the update show up that way?
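
(What I'd be running is roughly the following, assuming the usual PGDG 10 package and service names on CentOS 7 :)

linux> sudo yum update postgresql10 postgresql10-server postgresql10-contrib
linux> sudo systemctl restart postgresql-10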

Thanks,

Niles.

--
Niles Oien, National Solar Observatory, Boulder Colorado USA

Re: TOAST table repeatedly corrupted

From: Peter Geoghegan

On Wed, May 9, 2018 at 11:54 AM, Niles Oien <noien@nso.edu> wrote:
> I think this is probably a bug? Every time it happens
> it affects the same table, hmi.rdvtrack_fd05. The hardware
> we're running on is fairly new, disks and memory are
> not throwing errors, and we've never had problems with any
> other tables. We're running on CentOS 6.8 - again, not brand
> new but hardly decrepit, either. I've rebuilt the table
> several times. This time I'd like to give the PG community
> a chance to chase it down before I do that.

amcheck could potentially reveal more information about the problem.
The version on Github works with 9.5, and there are Redhat/CentOS pgdg
packages you can use. See:

https://github.com/petergeoghegan/amcheck

The "heapallindexed" option may be of particular interest to you. It
verifies that the table has matching rows for a target index, in
addition to testing the structure of the target B-Tree index itself.
It doesn't do anything special with TOAST tables just yet, but it's
still the best general corruption test we have.
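
A rough illustration of a run against the primary key (the extension name amcheck_next and the heapallindexed argument are from memory of the Github README, so double-check the exact call there) :

data=# create extension amcheck_next;
data=# select bt_index_check('hmi.rdvtrack_fd05_pkey', heapallindexed => true);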

-- 
Peter Geoghegan


Re: TOAST table repeatedly corrupted

From: Tom Lane

Niles Oien <noien@nso.edu> writes:
>  PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
> 20150623 (Red Hat 4.8.5-16), 64-bit
> 
> Does that need an upgrade to get this week's TOAST fixes, too?

Yes.

> I'm not sure
> if CentOS's 'yum upgrade' will pick it up - I have the repo pgdg10/7/x86_64
> enabled, will the update show up that way?

I believe so.

            regards, tom lane