FATAL: terminating connection due to administrator command - Mailing list pgsql-general

From Alban Hertroys
Subject FATAL: terminating connection due to administrator command
Date
Msg-id OFACE0CA5D.70F6B677-ONC12585F4.002E818B-C12585F4.0032E67C@apollotyres.com
Whole thread Raw
Responses Re: FATAL: terminating connection due to administrator command  (Srinivasa T N <seenutn@gmail.com>)
List pgsql-general
Hi all,

We're seeing the FATAL error message from the subject pop up in our logs at regular intervals, but I haven't been able to pinpoint what is causing it. I'm hoping for some insights here.

We run a PostgreSQL 11.9 server on CentOS 7, within a vmware environment:
 PostgreSQL 11.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

The package was installed from the PGDG repository.

I'm not even sure I should be worried, there doesn't appear to be any impact on the servers' functioning, but it does say 'FATAL'.
What we're seeing are lines like these two instances:

2020-09-30 21:45:09.999 CEST [2375] 172.30.2.25 asdf STATEMENT:  select count(*) from "util_asdf"."v_something_something2"
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf ERROR:  canceling statement due to user request
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf STATEMENT:  select count(*) from "dm_asdf"."asdf_func"
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf ERROR:  canceling statement due to user request
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf STATEMENT:  select count(*) from "dm_asdf"."asdf_func"
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales ERROR:  canceling statement due to user request
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30649]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:51.422 CEST [30649]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30648]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:51.422 CEST [30648]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30647]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:51.422 CEST [30647]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30646]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:51.422 CEST [30646]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30645]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:51.422 CEST [30645]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.435 CEST [6482]   LOG:  background worker "parallel worker" (PID 30646) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482]   LOG:  background worker "parallel worker" (PID 30645) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482]   LOG:  background worker "parallel worker" (PID 30647) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482]   LOG:  background worker "parallel worker" (PID 30648) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482]   LOG:  background worker "parallel worker" (PID 30649) exited with exit code 1
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales ERROR:  canceling statement due to user request
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30659]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:56.446 CEST [30659]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30658]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:56.446 CEST [30658]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30657]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:56.446 CEST [30657]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30656]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:56.446 CEST [30656]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30655]   FATAL:  terminating connection due to administrator command
2020-09-30 22:27:56.446 CEST [30655]   STATEMENT:  select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.459 CEST [6482]   LOG:  background worker "parallel worker" (PID 30655) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482]   LOG:  background worker "parallel worker" (PID 30656) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482]   LOG:  background worker "parallel worker" (PID 30657) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482]   LOG:  background worker "parallel worker" (PID 30658) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482]   LOG:  background worker "parallel worker" (PID 30659) exited with exit code 1
2020-09-30 22:43:08.459 CEST [8055] 172.30.2.25 selfservice_prd ERROR:  schema "somethingelse" does not exist at character 71

Apparently, something is sending SIGTERM to our pg processes. I know that I'm not doing that, certainly not at those hours, and I'm the one who set up this system and am the only DBA of it.

Advice I found on the Internet is to use systemtap with some tap-script, but the scripts that I found just displayed the PID's of processes without telling me their names, which I didn't find all that useful in figuring out who was responsible, so I made an attempt (I have no experience with stap) at modifying it to print process names of signal sender and target:

/*
 * killsnoop-nd.stp     Trace process signals.
 *                      For Linux, uses SystemTap (non-debuginfo).
 *
 * Copyright (C) 2015 Brendan Gregg.
(etc)
 */
global target;
global signal;

probe begin
{
        printf("%-6s %-12s %-5s %-6s %6s\n", "FROM", "COMMAND", "SIG", "TO", "COMMAND");
}

probe nd_syscall.kill
{
        target[tid()] = uint_arg(1);
        signal[tid()] = uint_arg(2);
}

probe nd_syscall.kill.return
{
        if (signal[tid()] == 15 && target[tid()] != 0) {
                printf("%-6d %-12s %-5d %-6d %12s\n"
                                , pid(), execname()
                                , signal[tid()]
                                , target[tid()], pid2execname(target[tid()]));
        }
        delete target[tid()];
        delete signal[tid()];
}


The output of last night was:

FROM   COMMAND      SIG   TO     COMMAND
30068  systemd-udevd 15    14151  systemd-udevd
30068  systemd-udevd 15    14836  systemd-udevd
(...)
6482   postmaster   15    30649    postmaster
6482   postmaster   15    30648    postmaster
6482   postmaster   15    30647    postmaster
6482   postmaster   15    30646    postmaster
6482   postmaster   15    30645    postmaster
6482   postmaster   15    30659    postmaster
6482   postmaster   15    30658    postmaster
6482   postmaster   15    30657    postmaster
6482   postmaster   15    30656    postmaster
6482   postmaster   15    30655    postmaster
6482   postmaster   15    2065     postmaster
6482   postmaster   15    2064     postmaster
6482   postmaster   15    2063     postmaster

Several of these TO-pid's match those in the PG log.

$ ps aux | grep 6482
postgres  6482  0.0  0.5 51755776 1043872 ?    Ss   Sep28   3:25 /usr/pgsql-11/bin/postmaster -D /data/11/data

Am I correct in concluding that postmaster is killing its own processes? If so, what is going on here? And more importantly, what do I do about it?

Regards,
Alban Hertroys.

P.S. I'm mailing from my work account, as this is work related. You may remember me from my private account.

Alban Hertroys    

D: +31 (0)53 4 888 888 | T: +31 (0)53 4888 888 | E: alban.hertroys@apollovredestein.com
Apollo Vredestein B.V.| Ir. E.L.C. Schiffstraat 370, 7547 RD Enschede, The Netherlands
Chamber of Commerce number: 34223268


 
 

The information contained in this e-mail is intended solely for the use of the individual or entity to whom it is addressed. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or action in relation to the contents of this information is strictly prohibited and may be unlawful and request you to delete this message and any attachments and advise the sender by return e-mail. The confidentiality of this message is not warranted. Apollo Vredestein and its subsidiaries rule out any and every liability resulting from this or any other electronic transmission
   Please consider the environment before printing this e-mail

Attachment

pgsql-general by date:

Previous
From: Matthias Apitz
Date:
Subject: Re: Problem close curser after rollback
Next
From: Srinivasa T N
Date:
Subject: Re: FATAL: terminating connection due to administrator command