Thread: pg_receivewal - couple of improvements
Hi, Here are some improvements we can make to pg_receivewal that were emanated after working with it in production environments: 1) As a user, I, sometimes, want my pg_receivewal to start streaming from the LSN that I provide as an input i.e. startpos instead of it calculating the stream start position 1) from its target directory or 2) from its replication slot's restart_lsn or 3) after sending IDENTIFY_SYSTEM on to the primary. This will particularly be useful when the primary is down for some time (for whatever reasons) and the WAL files that are required by the pg_receivewal may have been removed by it (I know this situation is a bit messy, but it is quite possible in production environments). Then, the pg_receivewal will calculate the start position from its target directory and request the primary with it, which the primary may not have. I have to intervene and manually delete/move the WAL files in the pg_receivewal target directory and restart the pg_receivewal so that it can continue. Instead, if pg_receivewal can accept a startpos as an option, it can just go ahead and stream from the primary. 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have more reconnect time as I know that the primary can go down at any time for whatever reasons in production environments which can take some time till I bring up primary and I don't want to waste compute cycles in the node on which pg_receivewal is running and I should be able to just set it to a higher value, say 5 min or so, after which pg_receivewal can try to perform StreamLog(); and attempt connection to primary. Thoughts? Regards, Bharath Rupireddy.
Hi, On Wed, Feb 02, 2022 at 08:53:13PM +0530, Bharath Rupireddy wrote: > > Here are some improvements we can make to pg_receivewal that were > emanated after working with it in production environments: > > 1) As a user, I, sometimes, want my pg_receivewal to start streaming > from the LSN that I provide as an input i.e. startpos instead of it > calculating the stream start position 1) from its target directory or > 2) from its replication slot's restart_lsn or 3) after sending > IDENTIFY_SYSTEM on to the primary. This is already being discussed (at least part of) as of https://www.postgresql.org/message-id/flat/18708360.4lzOvYHigE%40aivenronan.
On Wed, Feb 2, 2022 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > Hi, > > On Wed, Feb 02, 2022 at 08:53:13PM +0530, Bharath Rupireddy wrote: > > > > Here are some improvements we can make to pg_receivewal that were > > emanated after working with it in production environments: > > > > 1) As a user, I, sometimes, want my pg_receivewal to start streaming > > from the LSN that I provide as an input i.e. startpos instead of it > > calculating the stream start position 1) from its target directory or > > 2) from its replication slot's restart_lsn or 3) after sending > > IDENTIFY_SYSTEM on to the primary. > > This is already being discussed (at least part of) as of > https://www.postgresql.org/message-id/flat/18708360.4lzOvYHigE%40aivenronan. FYI that thread is closed, it committed the change (f61e1dd [1]) that pg_receivewal can read from its replication slot restart lsn. I know that providing the start pos as an option came up there [2], but I wanted to start the discussion fresh as that thread got closed. [1] commit f61e1dd2cee6b1a1da75c2bb0ca3bc72f18748c1 Author: Michael Paquier <michael@paquier.xyz> Date: Tue Oct 26 09:30:37 2021 +0900 Allow pg_receivewal to stream from a slot's restart LSN Author: Ronan Dunklau Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy Discussion: https://postgr.es/m/18708360.4lzOvYHigE@aivenronan [2] https://www.postgresql.org/message-id/20210902.144554.1303720268994714850.horikyota.ntt%40gmail.com Regards, Bharath Rupireddy.
On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote: > > FYI that thread is closed, it committed the change (f61e1dd [1]) that > pg_receivewal can read from its replication slot restart lsn. > > I know that providing the start pos as an option came up there [2], > but I wanted to start the discussion fresh as that thread got closed. Ah sorry I misunderstood your email. I'm not sure it's a good idea. If you have missing WALs in your target directory but have an alternative backup location, you will have to restore the WAL from that alternative location anyway, so I'm not sure how accepting a different start position is going to help in that scenario. On the other hand allowing a position at the command line can also lead to accepting a bogus position, which could possibly make things worse. > 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have > more reconnect time as I know that the primary can go down at any time > for whatever reasons in production environments which can take some > time till I bring up primary and I don't want to waste compute cycles > in the node on which pg_receivewal is running I don't think that attempting a connection is really costly. Also, increasing this retry time also increases the amount of time you're not streaming WALs, and thus the amount of data you can lose so I'm not sure that's actually a good idea. But you might also want to make it more aggressive, so no objection to make it configurable.
On Wed, Feb 2, 2022 at 9:28 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote: > > > > FYI that thread is closed, it committed the change (f61e1dd [1]) that > > pg_receivewal can read from its replication slot restart lsn. > > > > I know that providing the start pos as an option came up there [2], > > but I wanted to start the discussion fresh as that thread got closed. > > Ah sorry I misunderstood your email. > > I'm not sure it's a good idea. If you have missing WALs in your target > directory but have an alternative backup location, you will have to restore the > WAL from that alternative location anyway, so I'm not sure how accepting a > different start position is going to help in that scenario. On the other hand > allowing a position at the command line can also lead to accepting a bogus > position, which could possibly make things worse. Isn't complex for anyone to go to the archive location which involves extra steps - getting authentication tokens, searching there for the required WAL file, downloading it, unzipping it, copying back to pg_receivewal node etc. in production environments? You know, this will just be problematic and adds more time for bringing up the pg_receivewal. Instead if I know that the latest checkpoint LSN and archived WAL file from the primary, I can just provide the startpos (probably the last checkpoint LSN) to pg_receivewal so that it can continue getting the WAL records from primary, avoiding the whole bunch of the manual work that I had to do. > > 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have > > more reconnect time as I know that the primary can go down at any time > > for whatever reasons in production environments which can take some > > time till I bring up primary and I don't want to waste compute cycles > > in the node on which pg_receivewal is running > > I don't think that attempting a connection is really costly. Also, increasing > this retry time also increases the amount of time you're not streaming WALs, > and thus the amount of data you can lose so I'm not sure that's actually a good > idea. But you might also want to make it more aggressive, so no objection to > make it configurable. Yeah, making it configurable helps tune the reconnect time as per the requirements. Regards, Bharath Rupireddy.
On Thu, Feb 03, 2022 at 06:40:55PM +0530, Bharath Rupireddy wrote: > > Isn't complex for anyone to go to the archive location which involves > extra steps - getting authentication tokens, searching there for the > required WAL file, downloading it, unzipping it, copying back to > pg_receivewal node etc. in production environments? You know, this > will just be problematic and adds more time for bringing up the > pg_receivewal. Instead if I know that the latest checkpoint LSN and > archived WAL file from the primary, I can just provide the startpos > (probably the last checkpoint LSN) to pg_receivewal so that it can > continue getting the WAL records from primary, avoiding the whole > bunch of the manual work that I had to do. I don't get it. If you're missing WAL it means that will you have to do that tedious manual work to retrieve them no matter what. So on top of that tedious work, you also have to make sure that you don't provide a bogus start position. Also, doesn't this scenario implies that you have both archive_command and pg_receivewal for storing your WALs elsewhere? In my experience this isn't really common. If you want to make sure you won't have to do that tedious work of retrieving the WALs from a different location, you should probably rely on the facilities to make sure it won't happen, like using a replication slots and monitoring the pg_wal usage. Maybe that's a good idea but I'm still having a hard time imagining a scenario where it would actually be a good idea.
On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote: > I don't get it. If you're missing WAL it means that will you have to do that > tedious manual work to retrieve them no matter what. So on top of that tedious > work, you also have to make sure that you don't provide a bogus start position. I may be wrong in saying that, but the primary use case I have seen for pg_receivewal is a service integration for archiving. > Maybe that's a good idea but I'm still having a hard time imagining a scenario > where it would actually be a good idea. With the defaults that we have now in place (current LSN location, current slot's location or the archive location), I am not really convinced that we need more control in this area with the proposed option. -- Michael
Attachment
On Sun, Feb 6, 2022 at 12:16 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote: > > I don't get it. If you're missing WAL it means that will you have to do that > > tedious manual work to retrieve them no matter what. So on top of that tedious > > work, you also have to make sure that you don't provide a bogus start position. > > I may be wrong in saying that, but the primary use case I have seen > for pg_receivewal is a service integration for archiving. Not only for archiving, but it can also be used as a synchronous/asynchronous standby. > > Maybe that's a good idea but I'm still having a hard time imagining a scenario > > where it would actually be a good idea. > > With the defaults that we have now in place (current LSN location, > current slot's location or the archive location), I am not really > convinced that we need more control in this area with the proposed > option. Having the start position as an option for pg_receivewal can be useful in scenarios where the LSN/WAL file calculated from the pg_receivwal's target directory is removed by the primary (for whatever reasons). In such scenarios, we have to manually remove the WAL files(a risky thing IMO) in the pg_receivwal's target directory so that the pg_receivewal will calculate the next start position from the slot info and then from the server position via IDENTIFY_SYSTEM command. With the start position as an option, users can just provide the LSN from which they want to stream the WAL, in the above case, it can be from primary's latest checkpoint LSN. I still feel that the start position as an option would be a good addition here, so that users can choose the start position. If others don't agree with the start position as an option, at least having the current start position calculation (first, from pg_receivewal's target directory, if not, from slot's restart_lsn, if not, from server's identifiy_system command) as a user specified option will be of great help. Users can say, 'calculate start position from target directory or slot's restart_lsn or server's identify_system command). Regards, Bharath Rupireddy.
On Sun, Feb 06, 2022 at 01:01:41PM +0530, Bharath Rupireddy wrote: > With the start position as an option, users can just provide the LSN > from which they want to stream the WAL, in the above case, it can be > from primary's latest checkpoint LSN. This still strikes me as a dangerous thing to have, prone to errors with a bunch of base backups wasted if one does a mistake as it would be very easy to cause holes in the WAL stored, until one has to redeploy from a base backup in urgency. This kind of control is provided by replication slots for new locations, and the current archive location if anything is stored, so I would leave it at that. On top of that, this kind of control is prone to more race conditions with the backend, as a concurrent checkpoint could make the LSN you are looking for irrelevant. -- Michael
Attachment
On Mon, Feb 7, 2022 at 8:23 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Feb 06, 2022 at 01:01:41PM +0530, Bharath Rupireddy wrote: > > With the start position as an option, users can just provide the LSN > > from which they want to stream the WAL, in the above case, it can be > > from primary's latest checkpoint LSN. > > This still strikes me as a dangerous thing to have, prone to errors > with a bunch of base backups wasted if one does a mistake as it would > be very easy to cause holes in the WAL stored, until one has to > redeploy from a base backup in urgency. This kind of control is > provided by replication slots for new locations, and the current > archive location if anything is stored, so I would leave it at that. What if someone doesn't use pg_receivewal as an archive location? The pg_receivewal can also be used for synchronous replication quorum right? In this situation, I don't mind if some of the WAL files are missing in pg_receivewal's target directory, but I don't want to do the manual work of getting the WAL files to the pg_receivewal's target directory from my archive location to make the pg_receivewal up and connect with primary again? The primary can still remove the WAL files needed by pg_receivewal (after the max_slot_wal_keep_size limit). If I can tell pg_receivewal where to get the start position, then that will be a good idea. > On top of that, this kind of control is prone to more race conditions > with the backend, as a concurrent checkpoint could make the LSN you > are looking for irrelevant. I understand that having a start position as an option is more error-prone and creates holes in the WAL file. Why can't we allow users to choose the current start position calculation of the pg_receivewal? Users can choose to tell pg_receivewal either to get start position from its target directory or from its slot's restart_lsn or from the server's identify_system command. Thoughts? Regards, Bharath Rupireddy.
Hi, On Mon, Feb 07, 2022 at 12:03:03PM +0530, Bharath Rupireddy wrote: > > What if someone doesn't use pg_receivewal as an archive location? The > pg_receivewal can also be used for synchronous replication quorum > right? In this situation, I don't mind if some of the WAL files are > missing in pg_receivewal's target directory Those two seem entirely incompatible, why would you have a synchronous pg_receivewal that would ask for records removed by the primary, even if part of synchronous quorum, apart from inadequate (and dangerous) configuration? Also, in which scenario exactly would you be willing to pay a huge overhead to make sure that all the WAL records are safely transferred to one or multiple alternative location but at the same time don't mind if you're missing some WAL segments?