Re: Safe switchover - Mailing list pgsql-general

From Stephen Frost
Subject Re: Safe switchover
Msg-id 20200713154722.GU12375@tamriel.snowman.net
In response to Re: Safe switchover  (Paul Förster <paul.foerster@gmail.com>)
Responses Re: Safe switchover
List pgsql-general
Greetings,

* Paul Förster (paul.foerster@gmail.com) wrote:
> > On 10. Jul, 2020, at 17:45, Stephen Frost <sfrost@snowman.net> wrote:
> > Sure, if you know exactly why the former primary failed and have
> > confidence that nothing actually bad happened then pg_rewind can work
> > (though it's still not what I'd generally recommend).
> >
> > If you don't actually know what happened to the former primary to cause
> > it to fail then I definitely wouldn't use pg_rewind on it since it
> > doesn't have any checks to make sure that the data is actually
> > generally consistent.  These days you could get a bit of a better
> > feeling by running pg_checksums against the data dir, but that's not
> > going to be as good as doing a pgbackrest delta restore when it comes to
> > making sure that everything is valid.
>
> we use Netapp plus continuous archiving. To protect against block
> corruption, all our database clusters have been created with initdb -k.
> So they should report block corruptions in the log.

I certainly encourage using initdb -k.
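
A minimal sketch (not from the thread) of creating a cluster with checksums and confirming afterwards that they are enabled. The helper names and any data directory path passed in are hypothetical. The parser assumes the English "Data page checksum version" line that pg_controldata prints, where 0 means checksums are off.

```python
import subprocess


def initdb_command(datadir):
    """Build an initdb invocation with data checksums enabled (-k)."""
    return ["initdb", "-k", "-D", datadir]


def checksums_enabled(controldata_text):
    """Return True if pg_controldata output reports a non-zero checksum version.

    Assumes untranslated (English) pg_controldata output.
    """
    for line in controldata_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Data page checksum version":
            return value.strip() != "0"
    raise ValueError("checksum version line not found in pg_controldata output")


def cluster_has_checksums(datadir):
    """Run pg_controldata against datadir and inspect its output."""
    out = subprocess.run(
        ["pg_controldata", datadir],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return checksums_enabled(out)
```

Only cluster_has_checksums() needs the PostgreSQL binaries on PATH; the other two helpers are pure functions.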

> The usual reason why a database cluster goes down is because the server
> is shut down which initiates a switchover and is not problematic. If the
> server goes down by a power outage, system crash or similar, then an
> automatic failover is initiated, which, according to our experience, is
> also not problematic. Patroni seems to handle both situations well.

Sure, Patroni will handle the failover fine, but that's not what I was
referring to.  If the server crashes and you have no idea why or what
happened, I would strongly recommend against using pg_rewind to rebuild
it as a replica, since no validation happens.  You might fail over to it
much later and, if you're lucky, quickly discover that some blocks had
gotten corrupted; if you're unlucky, you won't discover until much later
that something was corrupted when the crash happened.  Using initdb -k
is good, but PG only checks a block's checksum when it goes to read that
block, which might not be until much later, especially on a system that's
been rebuilt as a replica.
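
Since checksums are otherwise only verified when a block is read, one way to force a full pass is an offline pg_checksums run, as mentioned earlier in the thread. The following is a rough sketch under some assumptions: PostgreSQL 12 or later (where the tool is named pg_checksums), a cluster that has been shut down cleanly, and hypothetical helper names. It relies on pg_checksums exiting non-zero when it finds a checksum failure.

```python
import subprocess


def pg_checksums_command(datadir):
    """Build an offline checksum verification command for the given data dir."""
    return ["pg_checksums", "--check", "-D", datadir]


def verify_offline(datadir, run=subprocess.run):
    """Return True if pg_checksums exits with status 0 (no failures found).

    The cluster must be cleanly shut down first. The `run` parameter exists
    only so the function can be exercised without PostgreSQL installed.
    """
    result = run(pg_checksums_command(datadir), capture_output=True, text=True)
    return result.returncode == 0
```

This only validates page checksums; it is not a substitute for the more thorough validation of a pgbackrest delta restore.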

> The worst case is that both servers crash, which is pretty unlikely. So,
> the worst case is that we have to perform a volume restore with Netapp
> and replay the WAL files since that last snapshot. Should the replica
> database cluster be damaged too, then we may need to reinit it with
> Patroni. This is acceptable even for large database clusters because
> replication runs fast. But the possibility is very, very small.

This seems like an independent question and I'm not really sure what is
meant here by 'reinit it with Patroni'.

> So, in my opinion, -k should be default and if one wanted to create a
> non-checksummed database cluster, it would have to be stated on the
> command line explicitly. This IMHO is a reasonable way to make people
> migrate to checksums over time as database clusters are migrated.

I agree that it'd be good to have -k on by default.

Thanks,

Stephen
