Re: locate DB corruption - Mailing list pgsql-general

From Adrian Klaver
Subject Re: locate DB corruption
Msg-id e0334b31-f0ca-b0c2-6224-b07fe10037a8@aklaver.com
In response to Re: locate DB corruption  (Dave Peticolas <dave@krondo.com>)
Responses Re: locate DB corruption  (Dave Peticolas <dave@krondo.com>)
List pgsql-general
On 08/31/2018 08:51 AM, Dave Peticolas wrote:
> On Fri, Aug 31, 2018 at 8:14 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> 
>     On 08/31/2018 08:02 AM, Dave Peticolas wrote:
>      > Hello, I'm running into the following error running a large query on a
>      > database restored from WAL replay:
>      >
>      > could not access status of transaction 330569126
>      > DETAIL: Could not open file "pg_clog/0C68": No such file or directory
> 
> 
>     Postgres version?
> 
> 
> Right! Sorry, that original email didn't have a lot of info. This is 
> 9.6.9 restoring a backup from 9.6.8.
> 
>     Where is the replay coming from?
> 
> 
>  From a snapshot and WAL files stored in Amazon S3.

Seems the process is not creating a consistent backup.

How are they being generated?


>     Are you sure you are not working across versions?
> 
> 
> I am sure, they are all 9.6.
> 
>     If not do pg_clog/ and 0C68 actually exist?
> 
> 
> pg_clog definitely exists, but 0C68 does not. I think I have 
> subsequently found the precise row in the specific table that seems to 
> be the problem. Specifically I can select * from TABLE where id = BADID 
> - 1 or id = BADID + 1 and the query returns. I get the error if I select 
> the row with the bad ID.
> 
> Now what I'm not sure of is how to fix.

One thing I can think of is to rebuild from a later version of your S3 
data and see if it has all the necessary files.
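
When checking a rebuilt data directory for "all the necessary files", it may help to know which pg_clog segment a given transaction ID belongs to. The following is only a rough sketch, assuming a default build (8 kB blocks, 2 status bits per transaction, 32 pages per segment file); the function names and the idea of passing in a list of transaction IDs are illustrative, not part of any PostgreSQL tool.

```python
import os

# Assumed defaults for a stock PostgreSQL 9.6 build; a server compiled
# with a non-default block size would need different values.
BLCKSZ = 8192                 # bytes per page
CLOG_XACTS_PER_BYTE = 4       # 2 status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32   # pages per pg_clog segment file
XACTS_PER_SEGMENT = BLCKSZ * CLOG_XACTS_PER_BYTE * SLRU_PAGES_PER_SEGMENT


def clog_segment(xid):
    """Return the pg_clog file name (four hex digits) that holds xid."""
    return "%04X" % (xid // XACTS_PER_SEGMENT)


def segment_xid_range(name):
    """Return (first_xid, last_xid) covered by a pg_clog segment file."""
    first = int(name, 16) * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


def missing_segments(clog_dir, xids):
    """List segment files needed by xids that are absent from clog_dir."""
    needed = {clog_segment(x) for x in xids}
    present = set(os.listdir(clog_dir))
    return sorted(needed - present)


if __name__ == "__main__":
    # Transaction ID range covered by the file named in the error message.
    print(segment_xid_range("0C68"))
```

Under those assumptions each segment file covers 1,048,576 transaction IDs, so a restore that is missing 0C68 has lost the commit status for that whole range, not just one row.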

There is also pg_resetxlog:

https://www.postgresql.org/docs/9.6/static/app-pgresetxlog.html

I have not used it, so I cannot offer much in the way of tips. Just 
from reading the docs, I would suggest stopping the server and then 
creating a backup of $PGDATA (if possible) before using pg_resetxlog.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

