Thread: pg_receivewal - couple of improvements

pg_receivewal - couple of improvements

From
Bharath Rupireddy
Date:
Hi,

Here are some improvements we can make to pg_receivewal that were
emanated after working with it in production environments:

1) As a user, I, sometimes, want my pg_receivewal to start streaming
from the LSN that I provide as an input i.e. startpos instead of it
calculating the stream start position 1) from its target directory or
2) from its replication slot's restart_lsn or 3) after sending
IDENTIFY_SYSTEM on to the primary.
This will particularly be useful when the primary is down for some
time (for whatever reasons) and the WAL files that are required by the
pg_receivewal may have been removed by it (I know this situation is a
bit messy, but it is quite possible in production environments). Then,
the pg_receivewal will
calculate the start position from its target directory and request the
primary with it, which the primary may not have. I have to intervene
and manually delete/move the WAL files in the pg_receivewal target
directory and restart the pg_receivewal so that it can continue.
Instead, if pg_receivewal can accept a startpos as an option, it can
just go ahead and stream from the primary.

2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
more reconnect time as I know that the primary can go down at any time
for whatever reasons in production environments which can take some
time till I bring up primary and I don't want to waste compute cycles
in the node on which pg_receivewal is running and I should be able to
just set it to a higher value, say 5 min or so, after which
pg_receivewal can try to perform StreamLog(); and attempt connection
to primary.

Thoughts?

Regards,
Bharath Rupireddy.



Re: pg_receivewal - couple of improvements

From
Julien Rouhaud
Date:
Hi,

On Wed, Feb 02, 2022 at 08:53:13PM +0530, Bharath Rupireddy wrote:
> 
> Here are some improvements we can make to pg_receivewal that were
> emanated after working with it in production environments:
> 
> 1) As a user, I, sometimes, want my pg_receivewal to start streaming
> from the LSN that I provide as an input i.e. startpos instead of it
> calculating the stream start position 1) from its target directory or
> 2) from its replication slot's restart_lsn or 3) after sending
> IDENTIFY_SYSTEM on to the primary.

This is already being discussed (at least part of) as of
https://www.postgresql.org/message-id/flat/18708360.4lzOvYHigE%40aivenronan.



Re: pg_receivewal - couple of improvements

From
Bharath Rupireddy
Date:
On Wed, Feb 2, 2022 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Wed, Feb 02, 2022 at 08:53:13PM +0530, Bharath Rupireddy wrote:
> >
> > Here are some improvements we can make to pg_receivewal that were
> > emanated after working with it in production environments:
> >
> > 1) As a user, I, sometimes, want my pg_receivewal to start streaming
> > from the LSN that I provide as an input i.e. startpos instead of it
> > calculating the stream start position 1) from its target directory or
> > 2) from its replication slot's restart_lsn or 3) after sending
> > IDENTIFY_SYSTEM on to the primary.
>
> This is already being discussed (at least part of) as of
> https://www.postgresql.org/message-id/flat/18708360.4lzOvYHigE%40aivenronan.

FYI that thread is closed, it committed the change (f61e1dd [1]) that
pg_receivewal can read from its replication slot restart lsn.

I know that providing the start pos as an option came up there [2],
but I wanted to start the discussion fresh as that thread got closed.

[1]
commit f61e1dd2cee6b1a1da75c2bb0ca3bc72f18748c1
Author: Michael Paquier <michael@paquier.xyz>
Date:   Tue Oct 26 09:30:37 2021 +0900

    Allow pg_receivewal to stream from a slot's restart LSN

    Author: Ronan Dunklau
    Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy
    Discussion: https://postgr.es/m/18708360.4lzOvYHigE@aivenronan

[2] https://www.postgresql.org/message-id/20210902.144554.1303720268994714850.horikyota.ntt%40gmail.com

Regards,
Bharath Rupireddy.



Re: pg_receivewal - couple of improvements

From
Julien Rouhaud
Date:
On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote:
> 
> FYI that thread is closed, it committed the change (f61e1dd [1]) that
> pg_receivewal can read from its replication slot restart lsn.
> 
> I know that providing the start pos as an option came up there [2],
> but I wanted to start the discussion fresh as that thread got closed.

Ah sorry I misunderstood your email.

I'm not sure it's a good idea.  If you have missing WALs in your target
directory but have an alternative backup location, you will have to restore the
WAL from that alternative location anyway, so I'm not sure how accepting a
different start position is going to help in that scenario.  On the other hand
allowing a position at the command line can also lead to accepting a bogus
position, which could possibly make things worse.

> 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
> more reconnect time as I know that the primary can go down at any time
> for whatever reasons in production environments which can take some
> time till I bring up primary and I don't want to waste compute cycles
> in the node on which pg_receivewal is running

I don't think that attempting a connection is really costly.  Also, increasing
this retry time also increases the amount of time you're not streaming WALs,
and thus the amount of data you can lose so I'm not sure that's actually a good
idea.  But you might also want to make it more aggressive, so no objection to
make it configurable.



Re: pg_receivewal - couple of improvements

From
Bharath Rupireddy
Date:
On Wed, Feb 2, 2022 at 9:28 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote:
> >
> > FYI that thread is closed, it committed the change (f61e1dd [1]) that
> > pg_receivewal can read from its replication slot restart lsn.
> >
> > I know that providing the start pos as an option came up there [2],
> > but I wanted to start the discussion fresh as that thread got closed.
>
> Ah sorry I misunderstood your email.
>
> I'm not sure it's a good idea.  If you have missing WALs in your target
> directory but have an alternative backup location, you will have to restore the
> WAL from that alternative location anyway, so I'm not sure how accepting a
> different start position is going to help in that scenario.  On the other hand
> allowing a position at the command line can also lead to accepting a bogus
> position, which could possibly make things worse.

Isn't complex for anyone to go to the archive location which involves
extra steps - getting authentication tokens, searching there for the
required WAL file, downloading it, unzipping it, copying back to
pg_receivewal node etc. in production environments? You know, this
will just be problematic and adds more time for bringing up the
pg_receivewal. Instead if I know that the latest checkpoint LSN and
archived WAL file from the primary, I can just provide the startpos
(probably the last checkpoint LSN) to pg_receivewal so that it can
continue getting the WAL records from primary, avoiding the whole
bunch of the manual work that I had to do.

> > 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
> > more reconnect time as I know that the primary can go down at any time
> > for whatever reasons in production environments which can take some
> > time till I bring up primary and I don't want to waste compute cycles
> > in the node on which pg_receivewal is running
>
> I don't think that attempting a connection is really costly.  Also, increasing
> this retry time also increases the amount of time you're not streaming WALs,
> and thus the amount of data you can lose so I'm not sure that's actually a good
> idea.  But you might also want to make it more aggressive, so no objection to
> make it configurable.

Yeah, making it configurable helps tune the reconnect time as per the
requirements.

Regards,
Bharath Rupireddy.



Re: pg_receivewal - couple of improvements

From
Julien Rouhaud
Date:
On Thu, Feb 03, 2022 at 06:40:55PM +0530, Bharath Rupireddy wrote:
> 
> Isn't complex for anyone to go to the archive location which involves
> extra steps - getting authentication tokens, searching there for the
> required WAL file, downloading it, unzipping it, copying back to
> pg_receivewal node etc. in production environments? You know, this
> will just be problematic and adds more time for bringing up the
> pg_receivewal. Instead if I know that the latest checkpoint LSN and
> archived WAL file from the primary, I can just provide the startpos
> (probably the last checkpoint LSN) to pg_receivewal so that it can
> continue getting the WAL records from primary, avoiding the whole
> bunch of the manual work that I had to do.

I don't get it.  If you're missing WAL it means that will you have to do that
tedious manual work to retrieve them no matter what.  So on top of that tedious
work, you also have to make sure that you don't provide a bogus start position.

Also, doesn't this scenario implies that you have both archive_command and
pg_receivewal for storing your WALs elsewhere?  In my experience this isn't
really common.

If you want to make sure you won't have to do that tedious work of retrieving
the WALs from a different location, you should probably rely on the facilities
to make sure it won't happen, like using a replication slots and monitoring the
pg_wal usage.

Maybe that's a good idea but I'm still having a hard time imagining a scenario
where it would actually be a good idea.



Re: pg_receivewal - couple of improvements

From
Michael Paquier
Date:
On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote:
> I don't get it.  If you're missing WAL it means that will you have to do that
> tedious manual work to retrieve them no matter what.  So on top of that tedious
> work, you also have to make sure that you don't provide a bogus start position.

I may be wrong in saying that, but the primary use case I have seen
for pg_receivewal is a service integration for archiving.

> Maybe that's a good idea but I'm still having a hard time imagining a scenario
> where it would actually be a good idea.

With the defaults that we have now in place (current LSN location,
current slot's location or the archive location), I am not really
convinced that we need more control in this area with the proposed
option.
--
Michael

Attachment

Re: pg_receivewal - couple of improvements

From
Bharath Rupireddy
Date:
On Sun, Feb 6, 2022 at 12:16 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote:
> > I don't get it.  If you're missing WAL it means that will you have to do that
> > tedious manual work to retrieve them no matter what.  So on top of that tedious
> > work, you also have to make sure that you don't provide a bogus start position.
>
> I may be wrong in saying that, but the primary use case I have seen
> for pg_receivewal is a service integration for archiving.

Not only for archiving, but it can also be used as a
synchronous/asynchronous standby.

> > Maybe that's a good idea but I'm still having a hard time imagining a scenario
> > where it would actually be a good idea.
>
> With the defaults that we have now in place (current LSN location,
> current slot's location or the archive location), I am not really
> convinced that we need more control in this area with the proposed
> option.

Having the start position as an option for pg_receivewal can be useful
in scenarios where the LSN/WAL file calculated from the pg_receivwal's
target directory is removed by the primary (for whatever reasons). In
such scenarios, we have to manually remove the WAL files(a risky thing
IMO) in the pg_receivwal's target directory so that the pg_receivewal
will calculate the next start position from the slot info and then
from the server position via IDENTIFY_SYSTEM command.

With the start position as an option, users can just provide the LSN
from which they want to stream the WAL, in the above case, it can be
from primary's latest checkpoint LSN.

I still feel that the start position as an option would be a good
addition here, so that users can choose the start position.

If others don't agree with the start position as an option, at least
having the current start position calculation (first, from
pg_receivewal's target directory, if not, from slot's restart_lsn, if
not, from server's identifiy_system command) as a user specified
option will be of great help. Users can say, 'calculate start position
from target directory or slot's restart_lsn or server's
identify_system command).

Regards,
Bharath Rupireddy.



Re: pg_receivewal - couple of improvements

From
Michael Paquier
Date:
On Sun, Feb 06, 2022 at 01:01:41PM +0530, Bharath Rupireddy wrote:
> With the start position as an option, users can just provide the LSN
> from which they want to stream the WAL, in the above case, it can be
> from primary's latest checkpoint LSN.

This still strikes me as a dangerous thing to have, prone to errors
with a bunch of base backups wasted if one does a mistake as it would
be very easy to cause holes in the WAL stored, until one has to
redeploy from a base backup in urgency.  This kind of control is
provided by replication slots for new locations, and the current
archive location if anything is stored, so I would leave it at that.

On top of that, this kind of control is prone to more race conditions
with the backend, as a concurrent checkpoint could make the LSN you
are looking for irrelevant.
--
Michael

Attachment

Re: pg_receivewal - couple of improvements

From
Bharath Rupireddy
Date:
On Mon, Feb 7, 2022 at 8:23 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Feb 06, 2022 at 01:01:41PM +0530, Bharath Rupireddy wrote:
> > With the start position as an option, users can just provide the LSN
> > from which they want to stream the WAL, in the above case, it can be
> > from primary's latest checkpoint LSN.
>
> This still strikes me as a dangerous thing to have, prone to errors
> with a bunch of base backups wasted if one does a mistake as it would
> be very easy to cause holes in the WAL stored, until one has to
> redeploy from a base backup in urgency.  This kind of control is
> provided by replication slots for new locations, and the current
> archive location if anything is stored, so I would leave it at that.

What if someone doesn't use pg_receivewal as an archive location? The
pg_receivewal can also be used for synchronous replication quorum
right? In this situation, I don't mind if some of the WAL files are
missing in pg_receivewal's target directory, but I don't want to do
the manual work of getting the WAL files to the pg_receivewal's target
directory from my archive location to make the pg_receivewal up and
connect with primary again? The primary can still remove the WAL files
needed by pg_receivewal (after the max_slot_wal_keep_size limit). If I
can tell pg_receivewal where to get the start position, then that will
be a good idea.

> On top of that, this kind of control is prone to more race conditions
> with the backend, as a concurrent checkpoint could make the LSN you
> are looking for irrelevant.

I understand that having a start position as an option is more
error-prone and creates holes in the WAL file. Why can't we allow
users to choose the current start position calculation of the
pg_receivewal? Users can choose to tell pg_receivewal either to get
start position from its target directory or from its slot's
restart_lsn or from the server's identify_system command. Thoughts?

Regards,
Bharath Rupireddy.



Re: pg_receivewal - couple of improvements

From
Julien Rouhaud
Date:
Hi,

On Mon, Feb 07, 2022 at 12:03:03PM +0530, Bharath Rupireddy wrote:
> 
> What if someone doesn't use pg_receivewal as an archive location? The
> pg_receivewal can also be used for synchronous replication quorum
> right? In this situation, I don't mind if some of the WAL files are
> missing in pg_receivewal's target directory

Those two seem entirely incompatible, why would you have a synchronous
pg_receivewal that would ask for records removed by the primary, even if part
of synchronous quorum, apart from inadequate (and dangerous) configuration?

Also, in which scenario exactly would you be willing to pay a huge overhead to
make sure that all the WAL records are safely transferred to one or multiple
alternative location but at the same time don't mind if you're missing some WAL
segments?