Re: WAL segments removed from primary despite the fact that logical replication slot needs it. - Mailing list pgsql-bugs
From | Amit Kapila |
---|---|
Subject | Re: WAL segments removed from primary despite the fact that logical replication slot needs it. |
Date | |
Msg-id | CAA4eK1KcnTvwrVqmpRTEMpyarBeTxwr8KA+kaveQOiqJ0zYsXA@mail.gmail.com Whole thread Raw |
In response to | Re: WAL segments removed from primary despite the fact that logical replication slot needs it. (Andres Freund <andres@anarazel.de>) |
Responses |
Re: WAL segments removed from primary despite the fact that logical replication slot needs it.
|
List | pgsql-bugs |
On Tue, Nov 22, 2022 at 1:38 AM Andres Freund <andres@anarazel.de> wrote: > > On 2022-11-21 19:56:20 +0530, Amit Kapila wrote: > > I think this problem could arise when walsender exits due to some > > error like "terminating walsender process due to replication timeout". > > Here is the theory I came up with: > > > > 1. Initially the restart_lsn is updated to 1039D/83825958. This will > > allow all files till 000000000001039D00000082 to be removed. > > 2. Next the slot->candidate_restart_lsn is updated to a 1039D/8B5773D8. > > 3. walsender restarts due to replication timeout. > > 4. After restart, it starts reading WAL from 1039D/83825958 as that > > was restart_lsn. > > 5. walsender gets a message to update write, flush, apply, etc. As > > part of that, it invokes > > ProcessStandbyReplyMessage->LogicalConfirmReceivedLocation. > > 6. Due to step 5, the restart_lsn is updated to 1039D/8B5773D8 and > > replicationSlotMinLSN will also be computed to the same value allowing > > to remove of all files older than 000000000001039D0000008A. This will > > allow removing 000000000001039D00000083, 000000010001039D00000084, > > etc. > > This would require that the client acknowledged an LSN that we haven't > sent out, no? Shouldn't the > MyReplicationSlot->candidate_restart_valid <= lsn > from LogicalConfirmReceivedLocation() prevented this from happening > unless the client acknowledges up to candidate_restart_valid? > We would have sent the higher LSN prior to the restart. Let me share a snippet of LOG which I have extracted from a large LOG depesz has shared me offlist after my request: 2022-11-10 21:03:23.478+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,4,idle,2022-11-10 21:03:23+00,7/0,0,DEBUG,00000,received replication command: IDENTIFY_SYSTEM,,,,,,,,,focal14 2022-11-10 21:03:23.479+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,5,idle,2022-11-10 21:03:23+00,7/0,0,DEBUG,00000,"received replication command: START_REPLICATION SLOT ""focal14"" LOGICAL 1039D/91F41B50 (proto_version '1', publication_names '""focal14""')",,,,,,,,,focal14 2022-11-10 21:03:23.481+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,6,idle,2022-11-10 21:03:23+00,7/0,0,LOG,00000,"starting logical decoding for slot ""focal14""","Streaming transactions committing after 1039D/911A8DB0, reading WAL from 1039D/83825958.",,,,,"START_REPLICATION SLOT ""focal14"" LOGICAL 1039D/91F41B50 (proto_version '1', publication_names '""focal14""')",,,focal14 2022-11-10 21:03:23.481+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,7,idle,2022-11-10 21:03:23+00,7/0,0,DEBUG,00000,sending replication keepalive,,,,,,,,,focal14 2022-11-10 21:03:23.484+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,8,idle,2022-11-10 21:03:23+00,7/0,0,LOG,00000,logical decoding found consistent point at 1039D/83825958,Logical decoding will begin using saved snapshot.,,,,,"START_REPLICATION SLOT ""focal14"" LOGICAL 1039D/91F41B50 (proto_version '1', publication_names '""focal14""')",,,focal14 2022-11-10 21:03:23.484+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,9,idle,2022-11-10 21:03:23+00,7/0,0,DEBUG,00000,write 1039D/91F41B50 flush 1039D/91F41B50 apply 1039D/91F41B50 reply_time 2022-11-10 21:03:23.482052+00,,,,,,,,,focal14 2022-11-10 21:03:23.495+00,upgrayedd,canvas,21748,10.1.238.101:35640,636d671b.54f4,10,idle,2022-11-10 21:03:23+00,7/0,0,DEBUG,00000,updated xmin: 1 restart: 1,,,,,,,,,focal14 Here, we can see that even though we have started reading WAL from 1039D/83825958, the confiremed_flush location is 1039D/911A8DB0. Then we get a client reply (write 1039D/91F41B50 flush 1039D/91F41B50) probably for a keepalive message. The flush LSN here is higher than candidate_restart_valid. The value of candidate_restart_valid before walsender restart was 1039D/9170B010 and candidate_restart_lsn was 1039D/8B5773D8. > > > 7. Now, we got new slot->candidate_restart_lsn as 1039D/83825958. > > Remember from step 1, we are still reading WAL from that location. > > I don't think LogicalIncreaseRestartDecodingForSlot() would do anything > in that case, because of the > /* don't overwrite if have a newer restart lsn */ > check. > But later in LogicalIncreaseRestartDecodingForSlot(), we have the below check which can allow to override it. /* * Only increase if the previous values have been applied, otherwise we * might never end up updating if the receiver acks too slowly. A missed * value here will just cause some extra effort after reconnecting. */ if (slot->candidate_restart_valid == InvalidXLogRecPtr) Do we need else if here instead of if? I think that would save us from retreating the restart_lsn value. We seem to have "else if" in the corresponding case in the function LogicalIncreaseXminForSlot(): /* * Only increase if the previous values have been applied, otherwise we * might never end up updating if the receiver acks too slowly. */ else if (slot->candidate_xmin_lsn == InvalidXLogRecPtr) -- With Regards, Amit Kapila.
pgsql-bugs by date: