Re: BUG #17903: There is a bug in the KeepLogSeg() - Mailing list pgsql-bugs

From Kyotaro Horiguchi
Subject Re: BUG #17903: There is a bug in the KeepLogSeg()
Msg-id 20230420.120417.1609083651022565895.horikyota.ntt@gmail.com
In response to BUG #17903: There is a bug in the KeepLogSeg()  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #17903: There is a bug in the KeepLogSeg()
List pgsql-bugs
At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in 
> I found that KeepLogSeg() has a piece of code that is not correctly.
> 
> segno may be larger than currSegNo, since the slot_keep_segs variable is of
> type "uint64", in this case the code "if (currSegNo - segno >
> slot_keep_segs)" is incorrect. 
> 
> "if (currSegNo - segno < keep_segs)" is also the same.
> 
> Checkpoint calls the KeepLogSeg function, and there are many operations
> between recptr and XLogGetReplicationSlotMinimumLSN, including updating the
> pg_control file, so segno may be larger than currSegNo.

Correct. Thanks for the report.

If the checkpointer somehow takes a long time between inserting a
checkpoint record and removing WAL files, and replication advances a
certain distance in the meantime, this can actually happen. Although
that behavior doesn't directly affect max_slot_wal_keep_size, it does
disrupt the effect of wal_keep_size.

The thinko was that we incorrectly assumed the slot minimum LSN can't
be larger than the checkpoint record LSN. We don't need to consider
max_slot_wal_keep_size if the slot minimum LSN is already larger than
currSegNo.
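
To illustrate the wraparound, here is a minimal standalone C sketch. It
is not the attached patch and not the actual KeepLogSeg() code: the
helper name segs_behind() and the two predicate functions are
hypothetical, and XLogSegNo is redefined locally as a 64-bit unsigned
type so that the example compiles on its own. It only shows why the two
unguarded comparisons misbehave when segno is ahead of currSegNo, and
one possible way to guard them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's unsigned 64-bit segment number type. */
typedef uint64_t XLogSegNo;

/*
 * Hypothetical helper: number of segments by which segno trails
 * currSegNo, or 0 if segno is not behind it.  Avoids the unsigned
 * wraparound of a bare "currSegNo - segno".
 */
static inline uint64_t
segs_behind(XLogSegNo currSegNo, XLogSegNo segno)
{
    return (segno < currSegNo) ? currSegNo - segno : 0;
}

/* Unguarded form from the report: wraps around when segno > currSegNo. */
static inline bool
slot_limit_exceeded_unguarded(XLogSegNo currSegNo, XLogSegNo segno,
                              uint64_t slot_keep_segs)
{
    return currSegNo - segno > slot_keep_segs;
}

/* Guarded form: a slot that is ahead of currSegNo never exceeds the limit. */
static inline bool
slot_limit_exceeded_guarded(XLogSegNo currSegNo, XLogSegNo segno,
                            uint64_t slot_keep_segs)
{
    return segs_behind(currSegNo, segno) > slot_keep_segs;
}

/* Unguarded wal_keep_size-style check: false after wraparound. */
static inline bool
below_keep_segs_unguarded(XLogSegNo currSegNo, XLogSegNo segno,
                          uint64_t keep_segs)
{
    return currSegNo - segno < keep_segs;
}

/* Guarded wal_keep_size-style check: distance saturates at 0. */
static inline bool
below_keep_segs_guarded(XLogSegNo currSegNo, XLogSegNo segno,
                        uint64_t keep_segs)
{
    return segs_behind(currSegNo, segno) < keep_segs;
}
```

With currSegNo = 10 and segno = 12, the bare subtraction yields a huge
unsigned value, so the unguarded slot check fires spuriously and the
unguarded keep_segs check is skipped. The guarded variants treat the
distance as 0 in that case.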

The attached fix works. However, I can't come up with a reasonable
testing script.

This dates back to PostgreSQL 13, where max_slot_wal_keep_size was
introduced.

regards.


-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment
