Re: more descriptive message for process termination due to max_slot_wal_keep_size - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: more descriptive message for process termination due to max_slot_wal_keep_size
Date
Msg-id CALj2ACXmPefRz71OUpFtLF-kWd+bZHYAZ+cPPVSyJCBM1fcwtA@mail.gmail.com
Whole thread Raw
In response to more descriptive message for process termination due to max_slot_wal_keep_size  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Tue, Dec 14, 2021 at 9:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Hello.
>
> As complained in pgsql-bugs [1], when a process is terminated due to
> max_slot_wal_keep_size, the related messages don't mention the root
> cause for *the termination*.  Note that the third message does not
> show for temporary replication slots.
>
> [pid=a] LOG:  "terminating process %d to release replication slot \"%s\""
> [pid=x] LOG:  FATAL:  terminating connection due to administrator command
> [pid=a] LOG:  invalidting slot "s" because its restart_lsn X/X exceeds max_slot_wal_keep_size
>
> The attached patch attaches a DETAIL line to the first message.
>
> > [17605] LOG:  terminating process 17614 to release replication slot "s1"
> + [17605] DETAIL:  The slot's restart_lsn 0/2C0000A0 exceeds max_slot_wal_keep_size.
> > [17614] FATAL:  terminating connection due to administrator command
> > [17605] LOG:  invalidating slot "s1" because its restart_lsn 0/2C0000A0 exceeds max_slot_wal_keep_size
>
> Somewhat the second and fourth lines look inconsistent each other but
> that wouldn't be such a problem.  I don't think we want to concatenate
> the two lines together as the result is a bit too long.
>
> > LOG:  terminating process 17614 to release replication slot "s1" because it's restart_lsn 0/2C0000A0 exceeds
max_slot_wal_keep_size.
>
> What do you think about this?
>
> [1] https://www.postgresql.org/message-id/20211214.101137.379073733372253470.horikyota.ntt%40gmail.com

+1 to give more context to the "terminating process %d to release
replication slot \"%s\"" message.

How about having below, instead of adding errdetail:
"terminating process %d to release replication slot \"%s\" whose
restart_lsn %X/%X exceeds max_slot_wal_keep_size"?

I think we can keep the "invalidating slot \"%s\" because its
restart_lsn %X/%X exceeds max_slot_wal_keep_size" message as-is. We
may not see "terminating process ..." and "invalidation slot ..."
messages together for the same slot, so having slightly different
wording is fine IMO.

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Defining (and possibly skipping) useless VACUUM operations
Next
From: John Naylor
Date:
Subject: Re: cutting down the TODO list thread