Hello,
Thank you for reviewing this.
At Mon, 28 Aug 2017 20:14:54 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqT03+uaHXun3ft4LJWNDviKTgWSZDsXiqyNdtcCfeqcgg@mail.gmail.com>
> On Mon, Aug 28, 2017 at 8:02 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > The first patch (0001-) fixes this problem, preventing the
> > problematic state of WAL segments by retarding restart LSN of a
> > physical replication slot in a certain condition.
>
> FWIW, I have this patch marked on my list of things to look at, so you
> can count me as a reviewer. There are also some approaches that I
> would like to test because I rely on replication slots for some
> infrastructure. Still...
This test patch modifies the code for easiness. The window for
this bug to occur is from receiving the first record of a segment
to in most cases receiving the second record or after receiving
several records. Intentionally emitting a record spanning two or
more segments would work?
> + if (oldFlushPtr != InvalidXLogRecPtr &&
> + (restartLSN == InvalidXLogRecPtr ?
> + oldFlushPtr / XLOG_SEG_SIZE != flushPtr / XLOG_SEG_SIZE :
> + restartLSN / XLOG_BLCKSZ != flushPtr / XLOG_BLCKSZ))
> I find such code patterns not readable.
Yeah, I agree. I rewrote there and the result in the attached
patch is far cleaner than the blob.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center