Re: How to simulate sync/async standbys being closer/farther (network distance) to primary in core postgres? - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: How to simulate sync/async standbys being closer/farther (network distance) to primary in core postgres?
Date
Msg-id YlGFM1bljYzs/31Z@jrouhaud
In response to Re: How to simulate sync/async standbys being closer/farther (network distance) to primary in core postgres?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: How to simulate sync/async standbys being closer/farther (network distance) to primary in core postgres?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sat, Apr 09, 2022 at 02:38:50PM +0530, Bharath Rupireddy wrote:
> On Fri, Apr 8, 2022 at 10:22 PM SATYANARAYANA NARLAPURAM
> <satyanarlapuram@gmail.com> wrote:
> >
> >> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > I'm thinking if there's a way in core postgres to achieve $subject. In
> >> > > reality, the sync/async standbys can either be closer/farther (which
> >> > > means sync/async standbys can receive WAL at different times) to
> >> > > primary, especially in cloud HA environments with primary in one
> >> > > Availability Zone(AZ)/Region and standbys in different AZs/Regions.
> >> > > $subject may not be possible on dev systems (say, for testing some HA
> >> > > features) unless we can inject a delay in WAL senders before sending
> >> > > WAL.
> >
> > Simulation will be helpful even for end customers to simulate faults in the
> > production environments during availability zone/disaster recovery drills.
>
> Right.

I'm not sure that's actually helpful.  If you want to do some realistic testing,
you need to fully simulate various network incidents, and delaying only postgres
replication is never going to come close to that.  You should instead rely on
a tool like tc, which can do much more than what $subject could ever do, and do
that for your whole HA stack.  At the very least, you don't want to validate that
your setup is working as expected by simulating just a faulty postgres
replication connection while all your clients and HA agents have no network
issues at all.
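
As a rough illustration of the tc approach (a sketch, not something taken from
this thread), the helper below only builds the command lines for a netem qdisc
that delays and drops all traffic on one interface, so clients and HA agents
on that interface are affected along with replication. The function name, the
interface "eth0", and the delay/jitter/loss values are assumptions chosen for
the example; actually running the printed commands requires root on a Linux
host with iproute2 installed.

```python
import shlex


def netem_commands(dev, delay_ms, jitter_ms=0, loss_pct=0):
    """Return (apply, undo) argv lists for a tc netem qdisc on `dev`.

    Hypothetical helper: it only builds the commands and never runs them.
    """
    apply = ["tc", "qdisc", "add", "dev", dev, "root", "netem",
             "delay", f"{delay_ms}ms"]
    if jitter_ms:
        # An optional second value after the delay is the jitter.
        apply.append(f"{jitter_ms}ms")
    if loss_pct:
        # Random packet loss, given as a percentage.
        apply += ["loss", f"{loss_pct}%"]
    # Deleting the root qdisc restores the interface's default behaviour.
    undo = ["tc", "qdisc", "del", "dev", dev, "root"]
    return apply, undo


# Assumed values: 100ms delay, 10ms jitter, 1% loss on eth0.
apply, undo = netem_commands("eth0", delay_ms=100, jitter_ms=10, loss_pct=1)
print(shlex.join(apply))
print(shlex.join(undo))
```

Applying the same qdisc on each node's interface, with different delays per
node, would approximate standbys at different network distances without
touching postgres itself.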
