Thread: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
[PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Vyacheslav Makarov
Date:
Hello, hackers.

I would like to propose a patch, which allows passing one extra parameter to pg_create_physical_replication_slot(): restart_lsn. It could be very helpful if we already have some backup with STOP_LSN from a couple of hours in the past and we want to quickly verify whether it is possible to create a replica from this backup or not.

If the WAL segment for the specified restart_lsn (STOP_LSN of the backup) exists, then the function will create a physical replication slot and will keep all the WAL segments required by the replica to catch up with the primary. Otherwise, it returns error, which means that the required WAL segments have been already utilised, so we do need to take a new backup. Without passing this newly added parameter pg_create_physical_replication_slot() works as before.

What do you think about this?

--
Vyacheslav Makarov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Michael Paquier
Date:
On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
> If the WAL segment for the specified restart_lsn (STOP_LSN of the backup)
> exists, then the function will create a physical replication slot and will
> keep all the WAL segments required by the replica to catch up with the
> primary. Otherwise, it returns error, which means that the required WAL
> segments have been already utilised, so we do need to take a new backup.
> Without passing this newly added parameter
> pg_create_physical_replication_slot() works as before.
>
> What do you think about this?

I think that this was discussed in the past (perhaps one of the threads related to WAL advancing actually?), and this stuff is full of holes when it comes to think about error handling with checkpoints running in parallel, potentially doing recycling of segments you would expect to be around based on your input value for restart_lsn *while* pg_create_physical_replication_slot() is still running and manipulating the on-disk slot information.

I suspect that this also breaks a couple of assumptions behind concurrent calls of the minimum LSN calculated across slots when a caller sees fit to recompute the thresholds (WAL senders mainly here, depending on the replication activity).

--
Michael
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Alexey Kondratov
Date:
On 2020-06-19 03:59, Michael Paquier wrote:
> On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
>> If the WAL segment for the specified restart_lsn (STOP_LSN of the
>> backup) exists, then the function will create a physical replication
>> slot and will keep all the WAL segments required by the replica to
>> catch up with the primary. Otherwise, it returns error, which means
>> that the required WAL segments have been already utilised, so we do
>> need to take a new backup.
>> Without passing this newly added parameter
>> pg_create_physical_replication_slot() works as before.
>>
>> What do you think about this?
>
> I think that this was discussed in the past (perhaps one of the
> threads related to WAL advancing actually?),

I have searched through the archives a bit and found one thread related to slots advancing [1]. It was dedicated to a problem of advancing slots which do not reserve WAL yet, if I get it correctly. Although it is somehow related to the topic, it was a slightly different issue, IMO.

> and this stuff is full of
> holes when it comes to think about error handling with checkpoints
> running in parallel, potentially doing recycling of segments you would
> expect to be around based on your input value for restart_lsn *while*
> pg_create_physical_replication_slot() is still running and
> manipulating the on-disk slot information. I suspect that this also
> breaks a couple of assumptions behind concurrent calls of the minimum
> LSN calculated across slots when a caller sees fit to recompute the
> thresholds (WAL senders mainly here, depending on the replication
> activity).

These are the right concerns, but all of them should be applicable to the pg_create_physical_replication_slot() + immediately_reserve == true in the same way, doesn't it? I think so, since in that case we are doing a pretty similar thing: trying to reserve some WAL segment that may be concurrently deleted.

And this is exactly the reason why ReplicationSlotReserveWal() does it in several steps in a loop:

1. Creates a slot with some restart_lsn.
2. Does ReplicationSlotsComputeRequiredLSN() to prevent removal of the WAL segment with this restart_lsn.
3. Checks that required WAL segment is still there.
4. Repeat if this attempt to prevent WAL removal has failed.

I guess that the only difference in the case of proposed scenario is that we do not have a chance for step 4, since we do need some specific restart_lsn, not any recent restart_lsn, i.e. in this case we have to:

1. Create a slot with restart_lsn specified by user.
2. Do ReplicationSlotsComputeRequiredLSN() to prevent WAL removal.
3. Check that required WAL segment is still there and report ERROR to the user if it is not.

I have eyeballed the attached patch and it looks like doing exactly the same, so issues with concurrent deletion are not obvious to me. Or, there should be the same issues for pg_create_physical_replication_slot() + immediately_reserve == true with current master implementation.

[1] https://www.postgresql.org/message-id/flat/20180626071305.GH31353%40paquier.xyz

Regards

--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
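
The two step lists in the message above can be compared in a small toy model. This is only a sketch under stated assumptions: `ToyServer`, `reserve_recent`, `reserve_exact` and the `concurrent` hook are hypothetical names invented for illustration, a single slot stands in for the minimum computed across all slots, and nothing here is taken from the patch or from the PostgreSQL sources.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size (16 MB)


@dataclass
class ToyServer:
    """Hypothetical stand-in for server state; none of this is PostgreSQL code."""
    redo_lsn: int                                    # plays the role of GetRedoRecPtr()
    segments: Set[int] = field(default_factory=set)  # segment numbers still on disk
    required_lsn: Optional[int] = None               # oldest LSN the (single) slot asks to keep

    def segment_exists(self, lsn: int) -> bool:
        return lsn // SEGMENT_SIZE in self.segments


Hook = Optional[Callable[[ToyServer], None]]


def reserve_recent(server: ToyServer, concurrent: Hook = None,
                   max_attempts: int = 10) -> int:
    """Steps 1-4 of the first list: any recent LSN is acceptable, so retry."""
    for _ in range(max_attempts):
        restart_lsn = server.redo_lsn           # step 1: pick a recent LSN
        server.required_lsn = restart_lsn       # step 2: publish it to block removal
        if concurrent is not None:
            concurrent(server)                  # simulated concurrent checkpoint
        if server.segment_exists(restart_lsn):  # step 3: is the segment still there?
            return restart_lsn
        # step 4: the segment vanished, try again with a newer LSN
    raise RuntimeError("could not reserve WAL")


def reserve_exact(server: ToyServer, restart_lsn: int, concurrent: Hook = None) -> int:
    """Steps 1-3 of the second list: the LSN is fixed, so there is no retry."""
    server.required_lsn = restart_lsn           # steps 1-2
    if concurrent is not None:
        concurrent(server)
    if not server.segment_exists(restart_lsn):  # step 3: report an error instead
        server.required_lsn = None
        raise LookupError("WAL segment for the requested restart_lsn is gone")
    return restart_lsn
```

The sketch only shows the difference in control flow: the first routine can recover from a lost segment by moving to a newer LSN, while the second can only fail. It does not model the checkpointer's own timing.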
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Fujii Masao
Date:
On 2020/06/19 23:20, Alexey Kondratov wrote:
> On 2020-06-19 03:59, Michael Paquier wrote:
>> On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
>>> If the WAL segment for the specified restart_lsn (STOP_LSN of the backup)
>>> exists, then the function will create a physical replication slot and will
>>> keep all the WAL segments required by the replica to catch up with the
>>> primary. Otherwise, it returns error, which means that the required WAL
>>> segments have been already utilised, so we do need to take a new backup.
>>> Without passing this newly added parameter
>>> pg_create_physical_replication_slot() works as before.
>>>
>>> What do you think about this?

Currently pg_create_physical_replication_slot() and CREATE_REPLICATION_SLOT replication command seem to be "identical". So if we add new option into one, we should add it also into another?

What happens if future LSN is specified in restart_lsn? With the patch, in this case, if the segment at that LSN exists (e.g., because it's recycled one), the slot seems to be successfully created. However if the LSN is far future and the segment doesn't exist, the creation of slot seems to fail. This behavior looks fragile and confusing. We should accept future LSN whether its segment currently exists or not?

    + if (!RecoveryInProgress() && !SlotIsLogical(MyReplicationSlot))

With the patch, the given restart_lsn seems to be ignored during recovery. Why?

>> I think that this was discussed in the past (perhaps one of the
>> threads related to WAL advancing actually?),
>
> I have searched through the archives a bit and found one thread related
> to slots advancing [1]. It was dedicated to a problem of advancing slots
> which do not reserve WAL yet, if I get it correctly. Although it is
> somehow related to the topic, it was a slightly different issue, IMO.
>
>> and this stuff is full of
>> holes when it comes to think about error handling with checkpoints
>> running in parallel, potentially doing recycling of segments you would
>> expect to be around based on your input value for restart_lsn *while*
>> pg_create_physical_replication_slot() is still running and
>> manipulating the on-disk slot information. I suspect that this also
>> breaks a couple of assumptions behind concurrent calls of the minimum
>> LSN calculated across slots when a caller sees fit to recompute the
>> thresholds (WAL senders mainly here, depending on the replication
>> activity).
>
> These are the right concerns, but all of them should be applicable to
> the pg_create_physical_replication_slot() + immediately_reserve == true
> in the same way, doesn't it? I think so, since in that case we are doing
> a pretty similar thing: trying to reserve some WAL segment that may be
> concurrently deleted.
>
> And this is exactly the reason why ReplicationSlotReserveWal() does it
> in several steps in a loop:
>
> 1. Creates a slot with some restart_lsn.
> 2. Does ReplicationSlotsComputeRequiredLSN() to prevent removal of the
>    WAL segment with this restart_lsn.
> 3. Checks that required WAL segment is still there.
> 4. Repeat if this attempt to prevent WAL removal has failed.

What happens if concurrent checkpoint decides to remove the segment at restart_lsn before #2 and then actually removes it after #3? The replication slot is successfully created with the given restart_lsn, but the reserved segment has already been removed?

> I guess that the only difference in the case of proposed scenario is
> that we do not have a chance for step 4, since we do need some specific
> restart_lsn, not any recent restart_lsn, i.e. in this case we have to:
>
> 1. Create a slot with restart_lsn specified by user.
> 2. Do ReplicationSlotsComputeRequiredLSN() to prevent WAL removal.
> 3. Check that required WAL segment is still there and report ERROR to
>    the user if it is not.

The similar situation as the above may happen.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Alexey Kondratov
Date:
On 2020-06-19 21:57, Fujii Masao wrote:
> On 2020/06/19 23:20, Alexey Kondratov wrote:
>> On 2020-06-19 03:59, Michael Paquier wrote:
>>> On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
>>>> If the WAL segment for the specified restart_lsn (STOP_LSN of the
>>>> backup) exists, then the function will create a physical replication
>>>> slot and will keep all the WAL segments required by the replica to
>>>> catch up with the primary. Otherwise, it returns error, which means
>>>> that the required WAL segments have been already utilised, so we do
>>>> need to take a new backup.
>>>> Without passing this newly added parameter
>>>> pg_create_physical_replication_slot() works as before.
>>>>
>>>> What do you think about this?
>
> Currently pg_create_physical_replication_slot() and
> CREATE_REPLICATION_SLOT replication command seem to be "identical".
> So if we add new option into one, we should add it also into another?

I wonder how it could be used via the replication protocol, but probably this option should be added there as well for consistency.

> What happens if future LSN is specified in restart_lsn? With the patch,
> in this case, if the segment at that LSN exists (e.g., because it's
> recycled one), the slot seems to be successfully created. However if
> the LSN is far future and the segment doesn't exist, the creation of
> slot seems to fail.
> This behavior looks fragile and confusing. We should accept future LSN
> whether its segment currently exists or not?

But what about a possible timeline switch? If we allow specifying it as far in the future as one wants, then the appropriate segment with the specified LSN may be created in a different timeline if it is switched, so it may be misleading. I am not even sure about allowing future LSN for existing segments, since PITR / timeline switch may occur just after the slot creation, so the pointer may never be valid. Would it be better to completely disallow future LSN?

And here I noticed another issue in the patch. TimeLineID of the last restart/checkpoint is used to detect whether WAL segment file exists or not. It means that if we try to create a slot just after a timeline switch, then we could not specify the oldest LSN actually available on the disk, since it may be from the previous timeline. One can use LSN only within the current timeline. It seems to be fine, but should be covered in the docs.

> + if (!RecoveryInProgress() && !SlotIsLogical(MyReplicationSlot))
>
> With the patch, the given restart_lsn seems to be ignored during
> recovery.
> Why?

I have the same question, not sure that this is needed here. It looks more like a forgotten copy-paste from ReplicationSlotReserveWal().

>>> I think that this was discussed in the past (perhaps one of the
>>> threads related to WAL advancing actually?),
>>
>> I have searched through the archives a bit and found one thread
>> related to slots advancing [1]. It was dedicated to a problem of
>> advancing slots which do not reserve WAL yet, if I get it correctly.
>> Although it is somehow related to the topic, it was a slightly
>> different issue, IMO.
>>
>>> and this stuff is full of
>>> holes when it comes to think about error handling with checkpoints
>>> running in parallel, potentially doing recycling of segments you
>>> would expect to be around based on your input value for restart_lsn
>>> *while* pg_create_physical_replication_slot() is still running and
>>> manipulating the on-disk slot information.
>>> ...
>>
>> These are the right concerns, but all of them should be applicable to
>> the pg_create_physical_replication_slot() + immediately_reserve ==
>> true in the same way, doesn't it? I think so, since in that case we
>> are doing a pretty similar thing: trying to reserve some WAL segment
>> that may be concurrently deleted.
>>
>> And this is exactly the reason why ReplicationSlotReserveWal() does it
>> in several steps in a loop:
>>
>> 1. Creates a slot with some restart_lsn.
>> 2. Does ReplicationSlotsComputeRequiredLSN() to prevent removal of the
>>    WAL segment with this restart_lsn.
>> 3. Checks that required WAL segment is still there.
>> 4. Repeat if this attempt to prevent WAL removal has failed.
>
> What happens if concurrent checkpoint decides to remove the segment
> at restart_lsn before #2 and then actually removes it after #3?
> The replication slot is successfully created with the given
> restart_lsn, but the reserved segment has already been removed?

I thought about it a bit more and it seems that yes, there is a race even for a current pg_create_physical_replication_slot() + immediately_reserve == true, i.e. ReplicationSlotReserveWal(). However, the chance is very small since we take a current GetRedoRecPtr() there. Probably one could reproduce it with wal_keep_segments = 1 by holding / releasing backend doing the slot creation and checkpointer with gdb, but not sure that it is an issue anywhere in the real world.

Maybe I am wrong, but it is not clear to me why current ReplicationSlotReserveWal() routine does not have that race. I will try to reproduce it though.

Things get worse when we allow specifying an older LSN, since it has a higher chance to be at the horizon of deletion by checkpointer. Anyway, if I get it correctly, with a current patch slot will be created successfully, but will be obsolete and should be invalidated by the next checkpoint.

Regards

--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
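
The timeline remark in the message above is easier to see with the WAL file naming in hand: a segment file name starts with the timeline ID, so the same LSN maps to different file names on different timelines. The sketch below assumes the default 16 MB segment size; `parse_lsn` and `wal_file_name` are illustrative helper names, not functions from the patch or from the server.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                     # assumed default segment size
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE   # 256 segments per 4 GB of WAL


def parse_lsn(text: str) -> int:
    """Convert the textual 'hi/lo' hexadecimal LSN form into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline: int, lsn: int) -> str:
    """Name of the segment file that holds `lsn` on the given timeline."""
    segno = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


lsn = parse_lsn("0/3000060")
print(wal_file_name(1, lsn))  # 000000010000000000000003
print(wal_file_name(2, lsn))  # 000000020000000000000003
```

If the existence check builds the file name from the current timeline only, a segment that is still on disk under the previous timeline's name will not be found, which matches the limitation described above.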
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Michael Paquier
Date:
On Mon, Jun 22, 2020 at 08:18:58PM +0300, Alexey Kondratov wrote:
> I wonder how it could be used via the replication protocol, but probably
> this option should be added there as well for consistency.

Mostly the same code path is taken by the SQL function and the replication command, so adding a new option to both makes sense to me for consistency. The SQL functions are actually easier to use when it comes to tests, as there is no need to worry about COPY_BOTH not supported in psql.

> Things get worse when we allow specifying an older LSN, since it has a
> higher chance to be at the horizon of deletion by checkpointer. Anyway, if
> I get it correctly, with a current patch slot will be created successfully,
> but will be obsolete and should be invalidated by the next checkpoint.

Is that a behavior acceptable for the end user? For example, a physical slot that is created to immediately reserve WAL may get invalidated, causing it to actually not keep WAL around contrary to what the user has wanted the command to do.

--
Michael
Re: [PATCH] Allow to specify restart_lsn in pg_create_physical_replication_slot()
From
Alexey Kondratov
Date:
On 2020-06-23 04:18, Michael Paquier wrote:
> On Mon, Jun 22, 2020 at 08:18:58PM +0300, Alexey Kondratov wrote:
>> Things get worse when we allow specifying an older LSN, since it has a
>> higher chance to be at the horizon of deletion by checkpointer.
>> Anyway, if I get it correctly, with a current patch slot will be
>> created successfully, but will be obsolete and should be invalidated
>> by the next checkpoint.
>
> Is that a behavior acceptable for the end user? For example, a
> physical slot that is created to immediately reserve WAL may get
> invalidated, causing it to actually not keep WAL around contrary to
> what the user has wanted the command to do.

I can imagine that it could be acceptable in the initially proposed scenario for someone, since creation of a slot with historical restart_lsn is already unpredictable: the required segment may or may not exist. However, adding here an undefined behaviour even after a slot creation does not look good to me anyway.

I have looked closely at the checkpointer code and another problem is that it decides once which WAL segments to delete based on the replicationSlotMinLSN, and does not check anything before the actual file deletion. That way the gap for a possible race is even wider. I do not know how to completely get rid of this race without introducing some locking mechanism, which may be costly.

Thanks for feedback

--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
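
The "decides once, deletes later" behaviour described in the message above can be replayed as a fixed interleaving. This is a toy model only: `plan_removal` and `run_interleaving` are hypothetical names, the segment numbers are made up, and the real checkpointer logic is considerably more involved.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size


def plan_removal(segments, min_required_lsn):
    """Checkpointer, phase 1: decide once which segments are below the cutoff."""
    cutoff = min_required_lsn // SEGMENT_SIZE
    return sorted(s for s in segments if s < cutoff)


def run_interleaving():
    """Replay one unlucky ordering of checkpointer and slot-creating backend."""
    segments = {1, 2, 3, 4}             # segment numbers currently on disk
    slot_min_lsn = 4 * SEGMENT_SIZE     # existing slots only need segment 4

    # Checkpointer reads the minimum and fixes its removal list: [1, 2, 3].
    doomed = plan_removal(segments, slot_min_lsn)

    # Backend creates a slot at a user-supplied LSN inside segment 2.
    requested = 2 * SEGMENT_SIZE + 100
    slot_min_lsn = min(slot_min_lsn, requested)               # published too late
    check_passed = requested // SEGMENT_SIZE in segments      # existence check

    # Checkpointer, phase 2: delete from the stale list without re-checking.
    for segno in doomed:
        segments.discard(segno)

    still_there = requested // SEGMENT_SIZE in segments
    return check_passed, still_there


print(run_interleaving())  # (True, False)
```

In this ordering the existence check succeeds and the slot is created, yet the segment it points at is removed immediately afterwards, which is the outcome discussed in the thread. An older user-supplied LSN makes the requested segment more likely to already be on the removal list.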