On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote:
>
> FYI that thread is closed, it committed the change (f61e1dd [1]) that
> pg_receivewal can read from its replication slot restart lsn.
>
> I know that providing the start pos as an option came up there [2],
> but I wanted to start the discussion fresh as that thread got closed.

Ah, sorry, I misunderstood your email.

I'm not sure it's a good idea. If you have missing WALs in your target
directory but have an alternative backup location, you will have to restore the
WAL from that alternative location anyway, so I'm not sure how accepting a
different start position is going to help in that scenario. On the other hand
allowing a position at the command line can also lead to accepting a bogus
position, which could possibly make things worse.
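
To illustrate the bogus-position concern, here is a minimal sketch of what
command-line validation could look like. The function name parse_lsn is
hypothetical and this is not code from pg_receivewal; it only assumes the usual
"hi/lo" hexadecimal LSN text format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse an LSN given as "XXXXXXXX/XXXXXXXX" (hex).
 * Returns true and stores the 64-bit position on success, false if the
 * string is not of that form or has trailing characters.
 */
bool
parse_lsn(const char *str, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    char         extra;

    assert(lsn != NULL);

    if (str == NULL)
        return false;

    /* Exactly two conversions must succeed; a third means trailing junk. */
    if (sscanf(str, "%X/%X%c", &hi, &lo, &extra) != 2)
        return false;

    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}
```

A check like this only rejects malformed input. A well-formed position that
points at the wrong place in the WAL stream passes it just the same, which is
the case I'm worried about.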

> 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
> more reconnect time as I know that the primary can go down at any time
> for whatever reasons in production environments which can take some
> time till I bring up primary and I don't want to waste compute cycles
> in the node on which pg_receivewal is running

I don't think that attempting a connection is really costly. Increasing this
retry time also increases the amount of time you're not streaming WAL, and thus
the amount of data you can lose, so I'm not sure that's actually a good idea.
But you might also want to make it more aggressive, so no objection to making
it configurable.