Adam Witney <awitney@sgul.ac.uk> writes:
> Here you go....
> pg_filedump-3.0/pg_filedump -i -f -R 34318 34320 134401986.1

Thanks. What it looks like to me is that block 34320 (really 165392)
is data from some other file altogether. It's evidently still Postgres
heap data, but instead of having 3 non-null columns as any toast row
ought to have, these rows have 77 columns many of which are nulls.
They've got OIDs, too. Possibly you can work out which table these
rows really belong to. It looks like this ought to be block 415664
of whatever table it belongs to (which would make it block 22448 of
the xxx.3 file of that table, if I did the math right).
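
For anyone who wants to double-check the block arithmetic above, here is a minimal Python sketch. It assumes the default build settings of 8 kB blocks and 1 GB segment files (131072 blocks per segment); the thread does not state either value, although they are consistent with block 34320 of the .1 file being block 165392 of the relation. The function names are purely illustrative.

```python
# Assumption: default build settings (8 kB blocks, 1 GB segment files).
BLOCK_SIZE = 8192
SEGMENT_SIZE = 1024 ** 3
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072


def to_relation_block(segment_no, block_in_segment):
    """Map (segment file number, block within that file) to a relation-wide block number."""
    return segment_no * BLOCKS_PER_SEGMENT + block_in_segment


def to_segment_block(relation_block):
    """Map a relation-wide block number to (segment file number, block within that file)."""
    return divmod(relation_block, BLOCKS_PER_SEGMENT)


# Block 34320 of the ".1" file is relation block 165392.
print(to_relation_block(1, 34320))  # 165392

# Relation block 415664 lands in the ".3" file at block 22448.
print(to_segment_block(415664))     # (3, 22448)
```

If the installation was built with a non-default block or segment size, the two constants would need adjusting and the numbers would come out differently.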
So the diagnosis is that somebody wrote a data block to the wrong offset
in the wrong file. Whether this is the fault of Postgres, the kernel,
or the disk drive is difficult to say. We've seen a number of cases in
which table pages got overwritten with data that was obviously of
non-Postgres origin, and in those cases we could blame the kernel or
disk drive with a clear conscience. In this case, since the bogus data
is Postgres data, it could be that it's a bug lurking within Postgres
itself --- or it could be that it's like those past cases.

It might be worth your while to run some memory and disk drive tests.
There's no particular reason to suspect a hardware fault more than a
software one, but this is at least something simple to do. Check for
availability of kernel updates, too.

regards, tom lane