Thread: BUG #17903: There is a bug in the KeepLogSeg()
The following bug has been logged on the website: Bug reference: 17903 Logged by: xu xingwang Email address: xu.xw2008@163.com PostgreSQL version: 13.3 Operating system: openEuler Description: Hi, I found that KeepLogSeg() has a piece of code that is not correctly. segno may be larger than currSegNo, since the slot_keep_segs variable is of type "uint64", in this case the code "if (currSegNo - segno > slot_keep_segs)" is incorrect. "if (currSegNo - segno < keep_segs)" is also the same. Checkpoint calls the KeepLogSeg function, and there are many operations between recptr and XLogGetReplicationSlotMinimumLSN, including updating the pg_control file, so segno may be larger than currSegNo. KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo) { XLogSegNo currSegNo; XLogSegNo segno; XLogRecPtr keep; XLByteToSeg(recptr, currSegNo, wal_segment_size); segno = currSegNo; /* * Calculate how many segments are kept by slots first, adjusting for * max_slot_wal_keep_size. */ keep = XLogGetReplicationSlotMinimumLSN(); if (keep != InvalidXLogRecPtr) { XLByteToSeg(keep, segno, wal_segment_size); /* Cap by max_slot_wal_keep_size ... */ if (max_slot_wal_keep_size_mb >= 0) { uint64 slot_keep_segs; slot_keep_segs = ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size); if (currSegNo - segno > slot_keep_segs) segno = currSegNo - slot_keep_segs; } } /* but, keep at least wal_keep_size if that's set */ if (wal_keep_size_mb > 0) { uint64 keep_segs; keep_segs = ConvertToXSegs(wal_keep_size_mb, wal_segment_size); if (currSegNo - segno < keep_segs) { /* avoid underflow, don't go below 1 */ if (currSegNo <= keep_segs) segno = 1; else segno = currSegNo - keep_segs; } } regards. -- xu xingwang
At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in > I found that KeepLogSeg() has a piece of code that is not correctly. > > segno may be larger than currSegNo, since the slot_keep_segs variable is of > type "uint64", in this case the code "if (currSegNo - segno > > slot_keep_segs)" is incorrect. > > "if (currSegNo - segno < keep_segs)" is also the same. > > Checkpoint calls the KeepLogSeg function, and there are many operations > between recptr and XLogGetReplicationSlotMinimumLSN, including updating the > pg_control file, so segno may be larger than currSegNo. Correct. Thanks for the report. If checkpointer somehow takes a long time between inserting a checkpoint record and removing WAL files, while replication advances a certain distnace, it can actually happen. Although that behavior doesn't directly affect max_slot_wal_keep_size, it does disrupt the effect of wal_keep_size. The thinko was that we incorrectly assumed the slot minimum LSN can't be larger than the checkpoint record LSN. We don't need to consider max_slot_wal_keep_size if the slot minimum LSN is already larger than currSegNo. The attached fix works. However, I can't come up with a reasonable testing script. This dates back to 13, where max_slot_wal_keep_size was introduced. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
If `segno` can be larger than `currSegNo`, your patch seems to miss the following branch, are you missing this for some particular reason? ``` /* but, keep at least wal_keep_size if that's set */ if (wal_keep_size_mb > 0) { uint64 keep_segs; keep_segs = ConvertToXSegs(wal_keep_size_mb, wal_segment_size); if (currSegNo - segno < keep_segs) <<<< see here { /* avoid underflow, don't go below 1 */ if (currSegNo <= keep_segs) segno = 1; else segno = currSegNo - keep_segs; } } ``` On Thu, Apr 20, 2023 at 11:04 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in > > I found that KeepLogSeg() has a piece of code that is not correctly. > > > > segno may be larger than currSegNo, since the slot_keep_segs variable is of > > type "uint64", in this case the code "if (currSegNo - segno > > > slot_keep_segs)" is incorrect. > > > > "if (currSegNo - segno < keep_segs)" is also the same. > > > > Checkpoint calls the KeepLogSeg function, and there are many operations > > between recptr and XLogGetReplicationSlotMinimumLSN, including updating the > > pg_control file, so segno may be larger than currSegNo. > > Correct. Thanks for the report. > > If checkpointer somehow takes a long time between inserting a > checkpoint record and removing WAL files, while replication advances a > certain distnace, it can actually happen. Although that behavior > doesn't directly affect max_slot_wal_keep_size, it does disrupt the > effect of wal_keep_size. > > The thinko was that we incorrectly assumed the slot minimum LSN can't > be larger than the checkpoint record LSN. We don't need to consider > max_slot_wal_keep_size if the slot minimum LSN is already larger than > currSegNo. > > The attached fix works. However, I can't come up with a reasonable > testing script. > > This dates back to 13, where max_slot_wal_keep_size was introduced. > > regards. > > > -- > Kyotaro Horiguchi > NTT Open Source Software Center -- Regards Junwang Zhao
At Thu, 20 Apr 2023 15:40:14 +0800, Junwang Zhao <zhjwpku@gmail.com> wrote in > If `segno` can be larger than `currSegNo`, your patch seems to miss > the following branch, > are you missing this for some particular reason? Oops! Sorry for the mistake and thanks for pointing it out. I should have kept segno within the reasonable range. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
This patch looks reasonable. +1 On Thu, Apr 20, 2023 at 4:58 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Thu, 20 Apr 2023 15:40:14 +0800, Junwang Zhao <zhjwpku@gmail.com> wrote in > > If `segno` can be larger than `currSegNo`, your patch seems to miss > > the following branch, > > are you missing this for some particular reason? > > Oops! Sorry for the mistake and thanks for pointing it out. > > I should have kept segno within the reasonable range. > > regards. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center -- Regards Junwang Zhao
On Thu, Apr 20, 2023 at 05:58:14PM +0900, Kyotaro Horiguchi wrote: > + /* > + * Slot minimum LSN could be greater than recptr. In such cases, no > + * segments are protected by slots but we still need to keep segno in a > + * reasonable range for subsequent calculations to avoid underflow. > + */ > + if (segno > currSegNo) > + segno = currSegNo; > + I wonder if it'd be better to instead change if (currSegNo - segno > slot_keep_segs) to if (currSegNo > segno + slot_keep_segs) and if (currSegNo - segno < keep_segs) to if (currSegNo < segNo + keep_segs) If segno > currSegNo, the first conditional would be false as expected, and the second would be true as expected. We could also use pg_sub_u64_overflow() to detect underflow, but that might be excessive in this case. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
At Thu, 20 Apr 2023 15:02:42 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in > I wonder if it'd be better to instead change > > if (currSegNo - segno > slot_keep_segs) > to > if (currSegNo > segno + slot_keep_segs) > > and > > if (currSegNo - segno < keep_segs) > to > if (currSegNo < segNo + keep_segs) > > If segno > currSegNo, the first conditional would be false as expected, and > the second would be true as expected. We could also use > pg_sub_u64_overflow() to detect underflow, but that might be excessive in > this case. From what I understand, the XLogSegNo calculations are designed without considering the actual value range. Basically, it assumes that (XLogSegNo + <any positive int>) can overflow. If we take the actual value range into account, we can make that change. The choice lies on whether we assume the actual value range or not. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Apr 21, 2023 at 10:42:31AM +0900, Kyotaro Horiguchi wrote: > From what I understand, the XLogSegNo calculations are designed > without considering the actual value range. Basically, it assumes that > (XLogSegNo + <any positive int>) can overflow. If we take the actual > value range into account, we can make that change. > > The choice lies on whether we assume the actual value range or not. Yeah, after staring at this some more, I think your proposed fix is the way to go. Alternatively, we could adjust the conditional for the max_slot_wal_keep_size block to if (keep != InvalidXLogRecPtr && keep < recptr) but AFAICT that yields the exact same behavior. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Here is a new version of the patch. It is fundamentally the same as v2, but I've adjusted the comment and commit message a bit. Barring objections, I am planning to commit this (and back-patch to v13) in the near future. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in > Here is a new version of the patch. It is fundamentally the same as v2, > but I've adjusted the comment and commit message a bit. Barring > objections, I am planning to commit this (and back-patch to v13) in the > near future. > When determining the oldest segment that must be kept for > replication slots, KeepLogSeg() might calculate a segment number > ahead of the one for the LSN given to the function. This causes Maybe the KeepLogSeg() is a mistake of XLogGetReplicationSlotMinimumLSN()? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Apr 25, 2023 at 01:14:52PM +0900, Kyotaro Horiguchi wrote: > At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in >> When determining the oldest segment that must be kept for >> replication slots, KeepLogSeg() might calculate a segment number >> ahead of the one for the LSN given to the function. This causes > > Maybe the KeepLogSeg() is a mistake of > XLogGetReplicationSlotMinimumLSN()? I adjusted the commit message to call out that function explicitly. Also, I decided to go with the "keep < recptr" approach since there's no reason to do anything in that block otherwise. Thanks! -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Apr 27, 2023 at 02:47:46PM -0700, Nathan Bossart wrote: > I adjusted the commit message to call out that function explicitly. Also, > I decided to go with the "keep < recptr" approach since there's no reason > to do anything in that block otherwise. b726236 exists in the tree, but it seems like the message of pgsql-committers is held on moderation? -- Michael
Attachment
At Thu, 27 Apr 2023 14:47:46 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in > On Tue, Apr 25, 2023 at 01:14:52PM +0900, Kyotaro Horiguchi wrote: > > At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in > >> When determining the oldest segment that must be kept for > >> replication slots, KeepLogSeg() might calculate a segment number > >> ahead of the one for the LSN given to the function. This causes > > > > Maybe the KeepLogSeg() is a mistake of > > XLogGetReplicationSlotMinimumLSN()? > > I adjusted the commit message to call out that function explicitly. Also, > I decided to go with the "keep < recptr" approach since there's no reason > to do anything in that block otherwise. That works, too. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center