Thread: Be strict when request to flush past end of WAL in WaitXLogInsertionsToFinish

Hi,

While working on [1], it was pointed out that
WaitXLogInsertionsToFinish emits a LOG message and lowers the upto
pointer so that it can proceed when a caller requests a flush past the
end of generated WAL. A comment there explains that no caller should
ever do that intentionally; it can happen only with bogus LSNs. For a
similar situation, XLogWrite emits a PANIC: "xlog write request %X/%X
is past end of log %X/%X". There's nothing wrong with
WaitXLogInsertionsToFinish emitting LOG, but why can't it be stricter
and emit a PANIC, something like the attached, to detect this corner
case?

Thoughts?

[1] https://www.postgresql.org/message-id/b43615437ac7d7fdef86a36e5d5bf3fc049bc11b.camel%40j-davis.com

On Thu, Feb 22, 2024 at 1:54 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> WaitXLogInsertionsToFinish() uses a LOG level message
> for the same situation. They should probably be the same log level, and
> I would think it would be either PANIC or WARNING. I have no idea why
> LOG was chosen.

[2]
    /*
     * No-one should request to flush a piece of WAL that hasn't even been
     * reserved yet. However, it can happen if there is a block with a bogus
     * LSN on disk, for example. XLogFlush checks for that situation and
     * complains, but only after the flush. Here we just assume that to mean
     * that all WAL that has been reserved needs to be finished. In this
     * corner-case, the return value can be smaller than 'upto' argument.
     */
    if (upto > reservedUpto)
    {
        ereport(LOG,
                (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X",
                        LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto))));
        upto = reservedUpto;
    }
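
To make the difference between the two behaviors concrete, here is a
minimal standalone sketch. It is not the attached patch and does not
use any PostgreSQL code: WalPos, FlushPolicy and check_flush_request
are hypothetical names, fprintf stands in for ereport, and abort()
stands in for a PANIC.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 64-bit WAL position (not PostgreSQL's type). */
typedef uint64_t WalPos;

/* Hypothetical policy switch, used only for this illustration. */
typedef enum
{
    POLICY_CLAMP_AND_LOG,   /* lenient: report, then clamp the request */
    POLICY_STRICT           /* strict: report, then stop the process */
} FlushPolicy;

/*
 * Return the position the caller should actually wait for.
 *
 * If the request lies beyond what has been reserved, the lenient policy
 * reports it and clamps to reserved_upto, so the result can be smaller
 * than 'upto'. The strict policy reports it and aborts instead.
 */
WalPos
check_flush_request(WalPos upto, WalPos reserved_upto, FlushPolicy policy)
{
    if (upto > reserved_upto)
    {
        fprintf(stderr,
                "request to flush past end of generated WAL; "
                "request %X/%X, current position %X/%X\n",
                (unsigned int) (upto >> 32), (unsigned int) upto,
                (unsigned int) (reserved_upto >> 32),
                (unsigned int) reserved_upto);

        if (policy == POLICY_STRICT)
            abort();            /* stand-in for PANIC */

        upto = reserved_upto;   /* lenient path: finish all reserved WAL */
    }
    return upto;
}
```

With the lenient policy, check_flush_request(300, 200,
POLICY_CLAMP_AND_LOG) returns 200 after printing the message; with
POLICY_STRICT the same call never returns.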

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

On Fri, 2024-03-15 at 13:12 +0530, Bharath Rupireddy wrote:
> Hi,
>
> While working on [1], it was identified that
> WaitXLogInsertionsToFinish emits a LOG message, and adjusts the upto
> ptr to proceed further when caller requests to flush past the end of
> generated WAL. There's a comment explaining no caller should ever do
> that intentionally except in cases with bogus LSNs. For a similar
> situation, XLogWrite emits a PANIC "xlog write request %X/%X is past
> end of log %X/%X". Although there's no problem if
> WaitXLogInsertionsToFinish emits LOG, but why can't it be a bit more
> harsh and emit PANIC something like the attached to detect the corner
> case?
>
> Thoughts?

I'm not clear on why the callers of WaitXLogInsertionsToFinish() are
handling errors the way they are. XLogWrite PANICs, XLogFlush ERRORs
(which is likely to be escalated to a PANIC anyway), and the other
callers ignore the return value and leave it up to XLogWrite() to
PANIC.

As far as I can tell, once WaitXLogInsertionsToFinish() detects this
bogus LSN, a PANIC is a likely outcome, so your proposed change makes
sense. But then why are the callers also checking?
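
To illustrate the layering I'm asking about, here is a toy model, not
the real code: wait_insertions_lenient and flush_request_satisfied are
made-up names, and the positions are plain integers. It only shows
that when the low-level function clamps silently, a caller has to
compare the result against its request to notice the bogus LSN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position. */
typedef uint64_t WalPos;

/*
 * Toy version of the lenient behavior: never wait past what has been
 * reserved, so the result may be smaller than the request.
 */
WalPos
wait_insertions_lenient(WalPos upto, WalPos reserved_upto)
{
    return (upto > reserved_upto) ? reserved_upto : upto;
}

/*
 * Toy caller-side check: the request counts as satisfied only if the
 * finished position reached it. A caller that ignores the return value
 * skips this comparison and relies on a later layer to complain.
 */
bool
flush_request_satisfied(WalPos requested, WalPos reserved_upto)
{
    WalPos finished = wait_insertions_lenient(requested, reserved_upto);

    return finished >= requested;
}
```

If the low-level function stopped the process itself on a request past
reserved_upto, the false branch of the caller-side check could no
longer be reached through this path in the toy model.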

I haven't looked at this in much detail.

Regards,
    Jeff Davis