Re: Does PostgreSQL check database integrity at startup? - Mailing list pgsql-general

From Edson Carlos Ericksson Richter
Subject Re: Does PostgreSQL check database integrity at startup?
Date
Msg-id 42cda197-fead-a7e5-a7e2-750d98bc4eb2@simkorp.com.br
Whole thread Raw
In response to Re: Does PostgreSQL check database integrity at startup?  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Does PostgreSQL check database integrity at startup?
Re: Does PostgreSQL check database integrity at startup?
List pgsql-general
Em 28/12/2017 10:16, Stephen Frost escreveu:
> Alvaro,
>
> * Alvaro Herrera (alvherre@alvh.no-ip.org) wrote:
>> For context: this was first reported in the Barman forum here:
>> https://groups.google.com/forum/#!msg/pgbarman/3aXWpaKWRFI/weUIZxspDAAJ
>> They are using Barman for the backups.
> Ahhhh, I see.  I wasn't aware of that history.
>
>> Stephen Frost wrote:
>>
>>>> But at some point in time, slave became corrupt (one of the base
>>>> files are zero size where it should be 16Mb in size), and IMHO a
>>>> "red alert" should arise - Slave server shall not even startup at
>>>> all.
>>> How do you know it should be 16Mb in size...?  That sounds like you're
>>> describing a WAL file, but you should be archiving your WAL files during
>>> a backup, not just using whatever is in pg_xlog/pg_wal..
>> It's not a WAL file -- it's a file backing a table.
> Interesting.
>
>>>> Since backups are taken from slave server, all backups are also corrupt.
>>> If you aren't following the appropriate process to perform a backup
>>> then, yes, you're going to end up with corrupt and useless/bad backups.
>> A few guys went over the backup-taking protocol upthread already.
>>
>> But anyway the backup tool is a moot point.  The problem doesn't
>> originate in the backup -- it originates in the standby, from where the
>> backup is taken.  The file can be seen as size 0 in the standby.
>> Edson's question is: why wasn't the problem detected in the standby?
>> It seems a valid question to me, to which we currently we don't have any
>> good answer.
> The last message on that thread seems pretty clear to me- the comment is
> "I think this is a failure in standby build."  It's not clear what that
> failure was but I agree it doesn't appear related to the backup tool
> (the comment there is "I'm using rsync"), or, really, PostgreSQL at all
> (a failure during the build of the replica isn't something we're
> necessairly going to pick up on..).
>
> As discussed on this thread, zero-byte files are entirely valid to
> appear in the PostgreSQL data directory.
>
> To try and dig into what happened, I'd probably look at what forks there
> are of that relation, the entry in pg_class, and try to figure out how
> it is that replication isn't complaining when the file on the primary
> appeared to be modified well after the last modify timestamp on the
> replica.  If it's possible to replica this into a test environment,
> maybe even do a no-op update of a row of that table and see what happens
> with replication.  One thing I wonder is if this table used to be
> unlogged and it was later turned into a logged table but something
> didn't quite happen correctly with that.  I'd also suggest looking for
> other file size mismatches between the primary and the replica.
>
> Thanks!
>
> Stephen
The table was never unlogged. From very beginning, it was always logged. 
I've dozens of databases with exactly same setup - and right now, I'm 
rebuilding the slave server. Instead of investigating something probably 
I will not find the cause, I would like to have a alert for the future.

Would be possible to include in future versions:
1) After start standby, standby run all WAL files until it is 
synchronized with master (current behavior)
3) Before getting into "accept read only queries", check if all base 
files have same size as master server (new behavior). In case something 
is different, throw an error and stop database startup?
4) Then start "accept read only queries" (current behavior)
???

Thanks,

Edson


pgsql-general by date:

Previous
From: Roger Tannous
Date:
Subject: Re: documentation help
Next
From: Brent Wood
Date:
Subject: Re: Does PostgreSQL check database integrity at startup?