Re: Does PostgreSQL check database integrity at startup? - Mailing list pgsql-general

From Edson Carlos Ericksson Richter
Subject Re: Does PostgreSQL check database integrity at startup?
Date
Msg-id 1bf27d37-d0a0-70e4-b107-d698fedb23a9@simkorp.com.br
In response to Re: Does PostgreSQL check database integrity at startup?  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Does PostgreSQL check database integrity at startup?
List pgsql-general
On 27/12/2017 15:02, Stephen Frost wrote:
> Alvaro,
>
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> Stephen Frost wrote:
>>
>>> * Edson Carlos Ericksson Richter (richter@simkorp.com.br) wrote:
>>>> Anyway, instead digging into rsync functionality (or bugs - I doubt,
>>>> but who knows?), I do prefer to have a script I can run to check if
>>>> there is obvious failures in standby servers.
>>> As mentioned, zero-byte files can be perfectly valid.  PostgreSQL does
>>> have page-level CRCs, if you initialized your database with them (which
>>> I would strongly recommend).
>> Page-level checksums would not detect the problem being complained in
>> this thread, however.
> It's entirely unclear to me what the problem being complained about in
> this thread actually is.  The complaint so far was about zero-byte
> files, but those are entirely valid, so that isn't a problem that anyone
> can solve..
>
> Given the thread subject, if someone actually wanted to do a database
> integrity check before startup, they could use pgBackRest to perform a
> backup with a CRC-enabled database and at least verify that all of the
> checksums are valid.
>
> We could possibly look into adding some set of additional checks for
> files which can't actually be zero-byte, perhaps..  I know we have some
> other one-off checks already.
>
> Thanks!
>
> Stephen

Actually, the problem is:

Master => Slave => Backup

On the master server everything is fine.
But at some point in time, the slave became corrupt (one of the base
files is zero-size where it should be 16 MB), and IMHO a "red alert"
should be raised: the slave server should not start up at all.

Since backups are taken from the slave server, all backups are also corrupt.

I detected the problem only because I restored a backup (an excellent
practice, by the way: nobody should take backups without testing them
with the restore procedure).

On the slave server there is no indication that the database is corrupt
(nothing in the logs; it starts normally and shows that it is applying
streamed changes regularly).

So that is the point: how to detect that a database is corrupt, so that
the cluster doesn't even start...
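
To make the kind of check I have in mind concrete, here is a minimal sketch, not a real integrity check. It assumes copies of both data directories are reachable from one machine, and the function name and both paths are hypothetical. It only flags files that are non-empty on the master but zero-length on the slave. As noted earlier in the thread, zero-byte files can be perfectly valid, and sizes can legitimately differ while the slave is catching up, so any hit is a hint to investigate, not proof of corruption.

```python
import os


def find_suspect_files(primary_root, standby_root):
    """Return relative paths that are non-empty on the primary
    but zero-length on the standby (a heuristic, not a proof)."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            primary_path = os.path.join(dirpath, name)
            rel = os.path.relpath(primary_path, primary_root)
            standby_path = os.path.join(standby_root, rel)
            try:
                primary_size = os.path.getsize(primary_path)
                standby_size = os.path.getsize(standby_path)
            except OSError:
                # Missing or vanished files are skipped in this sketch;
                # a fuller script would probably report them separately.
                continue
            if primary_size > 0 and standby_size == 0:
                suspects.append(rel)
    return sorted(suspects)
```

Calling `find_suspect_files("/path/to/master/data", "/path/to/slave/data")` (placeholder paths) returns a sorted list of relative file names to inspect by hand. It says nothing about files that have the right size but wrong contents.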

Regards,

Edson

