Thread: pg_receivewal and SIGTERM

pg_receivewal and SIGTERM

From
Christoph Berg
Date:
There's a smallish backup tool called pg_backupcluster in Debian's
postgresql-common which also ships a systemd service that runs
pg_receivewal for wal archiving, and supplies a pg_getwal script for
reading the files back on restore, including support for .partial
files.

So far the machinery was using plain files and relied on compressing
the WALs from time to time, but now I wanted to compress the files
directly from pg_receivewal --compress=5. Unfortunately this broke the
regression tests that include a test for the .partial files where
pg_receivewal.service is shut down before the segment is full.

The problem was that systemd's default KillSignal is SIGTERM, while
pg_receivewal flushes the output compression buffers on SIGINT only.
The attached patch makes it do the same for SIGTERM as well. (Most
places in PG that install a SIGINT handler also install a SIGTERM
handler already.)

Christoph

Attachment

Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 15 Aug 2022, at 14:45, Christoph Berg <myon@debian.org> wrote:

> The problem was that systemd's default KillSignal is SIGTERM, while
> pg_receivewal flushes the output compression buffers on SIGINT only.

Supporting SIGTERM here makes sense, especially given how systemd works.

> The attached patch makes it do the same for SIGTERM as well. (Most
> places in PG that install a SIGINT handler also install a SIGTERM
> handler already.)

Not really when it comes to utilities though; initdb, pg_dump and pg_test_fsync
seems to be the ones doing so.  (That's probably mostly due to them not running
in a daemon-like way as what's discussed here.)

Do you think pg_recvlogical should support SIGTERM as well?  (The signals which
it does trap should be added to the documentation which just now says "until
terminated by a signal" but that's a separate thing.)

     pqsignal(SIGINT, sigint_handler);
+    pqsignal(SIGTERM, sigint_handler);
Tiny nitpick, I think we should rename sigint_handler to just sig_handler as it
does handle more than sigint.

In relation to this.  Reading over this and looking around I realized that the
documentation for pg_waldump lacks a closing parenthesis on Ctrl+C so I will be
pushing the below to fix it:

--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -263,7 +263,7 @@ PostgreSQL documentation
        <para>
         If <application>pg_waldump</application> is terminated by signal
         <systemitem>SIGINT</systemitem>
-        (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>,
+        (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>),
         the summary of the statistics computed is displayed up to the
         termination point. This operation is not supported on
         <productname>Windows</productname>.

--
Daniel Gustafsson        https://vmware.com/




Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Daniel Gustafsson
> Do you think pg_recvlogical should support SIGTERM as well?  (The signals which
> it does trap should be added to the documentation which just now says "until
> terminated by a signal" but that's a separate thing.)

Ack, that makes sense, added in the attached updated patch.

>      pqsignal(SIGINT, sigint_handler);
> +    pqsignal(SIGTERM, sigint_handler);
> Tiny nitpick, I think we should rename sigint_handler to just sig_handler as it
> does handle more than sigint.

I went with sigexit_handler since pg_recvlogical has also a
sighup_handler and "sig_handler" would be confusing there.

Christoph

Attachment

Re: pg_receivewal and SIGTERM

From
Bharath Rupireddy
Date:
On Tue, Aug 16, 2022 at 5:06 PM Christoph Berg <myon@debian.org> wrote:
>
> Re: Daniel Gustafsson
> > Do you think pg_recvlogical should support SIGTERM as well?  (The signals which
> > it does trap should be added to the documentation which just now says "until
> > terminated by a signal" but that's a separate thing.)
>
> Ack, that makes sense, added in the attached updated patch.
>
> >       pqsignal(SIGINT, sigint_handler);
> > +     pqsignal(SIGTERM, sigint_handler);
> > Tiny nitpick, I think we should rename sigint_handler to just sig_handler as it
> > does handle more than sigint.
>
> I went with sigexit_handler since pg_recvlogical has also a
> sighup_handler and "sig_handler" would be confusing there.

Can we move these signal handlers to streamutil.h/.c so that both
pg_receivewal and pg_recvlogical can make use of it avoiding duplicate
code?

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/



Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 16 Aug 2022, at 13:40, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Aug 16, 2022 at 5:06 PM Christoph Berg <myon@debian.org> wrote:

>> I went with sigexit_handler since pg_recvlogical has also a
>> sighup_handler and "sig_handler" would be confusing there.
>
> Can we move these signal handlers to streamutil.h/.c so that both
> pg_receivewal and pg_recvlogical can make use of it avoiding duplicate
> code?

In general that's a good idea, but they are so trivial that I don't really see
much point in doing that in this particular case.

--
Daniel Gustafsson        https://vmware.com/




Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Daniel Gustafsson
> In general that's a good idea, but they are so trivial that I don't really see
> much point in doing that in this particular case.

Plus the variable they set is called differently...

Christoph



Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 16 Aug 2022, at 13:36, Christoph Berg <myon@debian.org> wrote:

>>     pqsignal(SIGINT, sigint_handler);
>> +    pqsignal(SIGTERM, sigint_handler);
>> Tiny nitpick, I think we should rename sigint_handler to just sig_handler as it
>> does handle more than sigint.
>
> I went with sigexit_handler since pg_recvlogical has also a
> sighup_handler and "sig_handler" would be confusing there.

Good point, sigexit_handler is a better name here.

--
Daniel Gustafsson        https://vmware.com/




Re: pg_receivewal and SIGTERM

From
Bharath Rupireddy
Date:
On Tue, Aug 16, 2022 at 5:15 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 16 Aug 2022, at 13:36, Christoph Berg <myon@debian.org> wrote:
>
> >>      pqsignal(SIGINT, sigint_handler);
> >> +    pqsignal(SIGTERM, sigint_handler);
> >> Tiny nitpick, I think we should rename sigint_handler to just sig_handler as it
> >> does handle more than sigint.
> >
> > I went with sigexit_handler since pg_recvlogical has also a
> > sighup_handler and "sig_handler" would be confusing there.
>
> Good point, sigexit_handler is a better name here.

+1.

Don't we need a similar explanation [1] for pg_recvlogical docs?

[1]
   <para>
    In the absence of fatal errors, <application>pg_receivewal</application>
-   will run until terminated by the <systemitem>SIGINT</systemitem> signal
-   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>).
+   will run until terminated by the <systemitem>SIGINT</systemitem>
+   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>)
+   or <systemitem>SIGTERM</systemitem> signal.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/



Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Bharath Rupireddy
> Don't we need a similar explanation [1] for pg_recvlogical docs?
> 
> [1]
>    <para>
>     In the absence of fatal errors, <application>pg_receivewal</application>
> -   will run until terminated by the <systemitem>SIGINT</systemitem> signal
> -   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>).
> +   will run until terminated by the <systemitem>SIGINT</systemitem>
> +   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>)
> +   or <systemitem>SIGTERM</systemitem> signal.

Coped that from pg_receivewal(1) now.

Christoph

Attachment

Re: pg_receivewal and SIGTERM

From
Bharath Rupireddy
Date:
On Tue, Aug 16, 2022 at 7:26 PM Christoph Berg <myon@debian.org> wrote:
>
> Re: Bharath Rupireddy
> > Don't we need a similar explanation [1] for pg_recvlogical docs?
> >
> > [1]
> >    <para>
> >     In the absence of fatal errors, <application>pg_receivewal</application>
> > -   will run until terminated by the <systemitem>SIGINT</systemitem> signal
> > -   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>).
> > +   will run until terminated by the <systemitem>SIGINT</systemitem>
> > +   (<keycombo action="simul"><keycap>Control</keycap><keycap>C</keycap></keycombo>)
> > +   or <systemitem>SIGTERM</systemitem> signal.
>
> Coped that from pg_receivewal(1) now.

Thanks.

    <application>pg_receivewal</application> will exit with status 0 when
-   terminated by the <systemitem>SIGINT</systemitem> signal.  (That is the
+   terminated by the <systemitem>SIGINT</systemitem> or
+   <systemitem>SIGTERM</systemitem> signal.  (That is the
    normal way to end it.  Hence it is not an error.)  For fatal errors or
    other signals, the exit status will be nonzero.

Can we specify the reason in the docs why a SIGTERM causes (which
typically would cause a program to end with non-zero exit code)
pg_receivewal and pg_recvlogical exit with zero exit code? Having this
in the commit message would help developers but the documentation will
help users out there.

Thoughts?

[1]
pg_receivewal, pg_recvlogical: Exit cleanly on SIGTERM

In pg_receivewal, compressed output is only flushed on clean exits. The
reason to support SIGTERM here as well is that pg_receivewal might well
be running as a daemon, and systemd's default KillSignal is SIGTERM.

Since pg_recvlogical is also supposed to run as a daemon, teach it about
SIGTERM as well.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/



Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Bharath Rupireddy
>     <application>pg_receivewal</application> will exit with status 0 when
> -   terminated by the <systemitem>SIGINT</systemitem> signal.  (That is the
> +   terminated by the <systemitem>SIGINT</systemitem> or
> +   <systemitem>SIGTERM</systemitem> signal.  (That is the
>     normal way to end it.  Hence it is not an error.)  For fatal errors or
>     other signals, the exit status will be nonzero.
> 
> Can we specify the reason in the docs why a SIGTERM causes (which
> typically would cause a program to end with non-zero exit code)
> pg_receivewal and pg_recvlogical exit with zero exit code? Having this
> in the commit message would help developers but the documentation will
> help users out there.

We could add "because you want that if it's running as a daemon", but
TBH, I'd rather remove the parentheses part. It sounds too much like
"it works that way because that way is the sane way".

Christoph



Re: pg_receivewal and SIGTERM

From
Bharath Rupireddy
Date:
On Fri, Aug 19, 2022 at 4:24 PM Christoph Berg <myon@debian.org> wrote:
>
> Re: Bharath Rupireddy
> >     <application>pg_receivewal</application> will exit with status 0 when
> > -   terminated by the <systemitem>SIGINT</systemitem> signal.  (That is the
> > +   terminated by the <systemitem>SIGINT</systemitem> or
> > +   <systemitem>SIGTERM</systemitem> signal.  (That is the
> >     normal way to end it.  Hence it is not an error.)  For fatal errors or
> >     other signals, the exit status will be nonzero.
> >
> > Can we specify the reason in the docs why a SIGTERM causes (which
> > typically would cause a program to end with non-zero exit code)
> > pg_receivewal and pg_recvlogical exit with zero exit code? Having this
> > in the commit message would help developers but the documentation will
> > help users out there.
>
> We could add "because you want that if it's running as a daemon", but

+1 to add "some" info in the docs (I'm not sure about the better
wording though), we can try to be more specific of the use case if
required.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/



Re: pg_receivewal and SIGTERM

From
Michael Paquier
Date:
On Fri, Aug 19, 2022 at 05:34:56PM +0530, Bharath Rupireddy wrote:
> +1 to add "some" info in the docs (I'm not sure about the better
> wording though), we can try to be more specific of the use case if
> required.

Yes, the amount of extra docs provided by the patch proposed by
Christoph looks fine by me.

FWIW, grouping the signal handlers into a common area like
streamutil.c seems rather confusing to me, as they set different
variable names that rely on their own assumptions in their local file,
so I would leave that out, like the patch.

While looking at the last patch proposed, it strikes me that
time_to_stop should be sig_atomic_t in pg_receivewal.c, as the safe
type of variable to set in a signal handler.  We could change that,
while on it..

Backpatching this stuff is not an issue here.
--
Michael

Attachment

Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Michael Paquier
> While looking at the last patch proposed, it strikes me that
> time_to_stop should be sig_atomic_t in pg_receivewal.c, as the safe
> type of variable to set in a signal handler.  We could change that,
> while on it..

Done in the attached patch.

> Backpatching this stuff is not an issue here.

Do you mean it can, or can not be backpatched? (I'd argue for
backpatching since the behaviour is slightly broken at the moment.)

Christoph

Attachment

Re: pg_receivewal and SIGTERM

From
Michael Paquier
Date:
On Mon, Aug 22, 2022 at 04:05:16PM +0200, Christoph Berg wrote:
> Do you mean it can, or can not be backpatched? (I'd argue for
> backpatching since the behaviour is slightly broken at the moment.)

I mean that it is fine to backpatch that, in my opinion.
--
Michael

Attachment

Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 23 Aug 2022, at 02:15, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Aug 22, 2022 at 04:05:16PM +0200, Christoph Berg wrote:
>> Do you mean it can, or can not be backpatched? (I'd argue for
>> backpatching since the behaviour is slightly broken at the moment.)
>
> I mean that it is fine to backpatch that, in my opinion.

I think this can be argued both for and against backpatching.  Catching SIGTERM
makes a lot of sense, especially given systemd's behavior.  On the other hand,
This adds functionality to something arguably working as intended, regardless
of what one thinks about the intent.

The attached adds the Exit Status section to pg_recvlogical docs which is
present in pg_receivewal to make them more aligned, and tweaks comments to
pgindent standards. This is the version I think is ready to commit.

--
Daniel Gustafsson        https://vmware.com/


Attachment

Re: pg_receivewal and SIGTERM

From
Michael Paquier
Date:
On Thu, Aug 25, 2022 at 11:19:05AM +0200, Daniel Gustafsson wrote:
> I think this can be argued both for and against backpatching.  Catching SIGTERM
> makes a lot of sense, especially given systemd's behavior.  On the other hand,
> This adds functionality to something arguably working as intended, regardless
> of what one thinks about the intent.

Sure.  My view on this matter is that the behavior of the patch is
more useful to users as, on HEAD, a SIGTERM is equivalent to a drop of
the connection followed by a retry when not using -n.  Or do you think
that there could be cases where the behavior of HEAD (force a
connection drop with the backend and handle the retry infinitely in
pg_receivewal/recvlogical) is more useful?  systemd can also do
retries a certain given of times, so that's moving the ball one layer
to the other, at the end.  We could also say to just set KillSignal to
SIGINT in the docs, but my guess is that few users would actually
notice that until they see how pg_receiwal/recvlogical work with
systemd's default.

FWIW, I've worked on an archiver integration a few years ago and got
annoyed that we use SIGINT while SIGTERM was the default (systemd was
not directly used there but the signal problem was the same, so we had
to go through some loops to make the stop signal configurable, like
systemd).
--
Michael

Attachment

Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Michael Paquier
> FWIW, I've worked on an archiver integration a few years ago and got
> annoyed that we use SIGINT while SIGTERM was the default (systemd was
> not directly used there but the signal problem was the same, so we had
> to go through some loops to make the stop signal configurable, like
> systemd).

SIGTERM is really the default for any init system or run-a-daemon system.

Christoph



Re: pg_receivewal and SIGTERM

From
Magnus Hagander
Date:
On Thu, Aug 25, 2022 at 5:13 PM Christoph Berg <myon@debian.org> wrote:
>
> Re: Michael Paquier
> > FWIW, I've worked on an archiver integration a few years ago and got
> > annoyed that we use SIGINT while SIGTERM was the default (systemd was
> > not directly used there but the signal problem was the same, so we had
> > to go through some loops to make the stop signal configurable, like
> > systemd).
>
> SIGTERM is really the default for any init system or run-a-daemon system.

It is, but there is also precedent for not using it for graceful
shutdown. Apache, for example, will do what we do today on SIGTERM and
you use SIGWINCH to make it shut down gracefully (which would be the
equivalent of us flushing the compression buffers, I'd say).

I'm not saying we shouldn't change -- I fully approve of making the
change. But the world is full of fairly prominent examples of the
other way as well.

I'm leaning towards considering it a feature-change and thus not
something to backpatch (I'd be OK sneaking it into 15 though, as that
one is not released yet and it feels like a perfectly *safe* change).
Not enough to insist on it, but it seems "slightly more correct".

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: pg_receivewal and SIGTERM

From
Michael Paquier
Date:
On Thu, Aug 25, 2022 at 08:45:05PM +0200, Magnus Hagander wrote:
> I'm leaning towards considering it a feature-change and thus not
> something to backpatch (I'd be OK sneaking it into 15 though, as that
> one is not released yet and it feels like a perfectly *safe* change).
> Not enough to insist on it, but it seems "slightly more correct".

Fine by me if both you and Daniel want to be more careful with this
change.  We could always argue about a backpatch later if there is
more ask for it, as well.
--
Michael

Attachment

Re: pg_receivewal and SIGTERM

From
Christoph Berg
Date:
Re: Daniel Gustafsson
> The attached adds the Exit Status section to pg_recvlogical docs which is
> present in pg_receivewal to make them more aligned, and tweaks comments to
> pgindent standards. This is the version I think is ready to commit.

Looks good to me.

Thanks,
Christoph



Re: pg_receivewal and SIGTERM

From
Michael Paquier
Date:
On Fri, Aug 26, 2022 at 09:51:26AM +0900, Michael Paquier wrote:
> Fine by me if both you and Daniel want to be more careful with this
> change.  We could always argue about a backpatch later if there is
> more ask for it, as well.

Daniel, are you planning to apply this one on HEAD?
--
Michael

Attachment

Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 2 Sep 2022, at 10:00, Michael Paquier <michael@paquier.xyz> wrote:
> 
> On Fri, Aug 26, 2022 at 09:51:26AM +0900, Michael Paquier wrote:
>> Fine by me if both you and Daniel want to be more careful with this
>> change.  We could always argue about a backpatch later if there is
>> more ask for it, as well.
> 
> Daniel, are you planning to apply this one on HEAD?

Yes, it's on my TODO for this CF.

--
Daniel Gustafsson        https://vmware.com/




Re: pg_receivewal and SIGTERM

From
Daniel Gustafsson
Date:
> On 2 Sep 2022, at 10:00, Michael Paquier <michael@paquier.xyz> wrote:
> 
> On Fri, Aug 26, 2022 at 09:51:26AM +0900, Michael Paquier wrote:
>> Fine by me if both you and Daniel want to be more careful with this
>> change.  We could always argue about a backpatch later if there is
>> more ask for it, as well.
> 
> Daniel, are you planning to apply this one on HEAD?

I had another look over this and pushed it.

--
Daniel Gustafsson        https://vmware.com/