BUG #18985: fast shutdown does not close connections from qlik data gateway data movement aka. replicate - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18985: fast shutdown does not close connections from qlik data gateway data movement aka. replicate
Date
Msg-id 18985-64431d78bcabae95@postgresql.org
Responses Re: BUG #18985: fast shutdown does not close connections from qlik data gateway data movement aka. replicate
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18985
Logged by:          Erik Dobak
Email address:      erik.dobak@gmail.com
PostgreSQL version: 14.8
Operating system:   Windows server 2022
Description:

We use PostgreSQL 14.18 on Windows Server 2022 as the repository database
for the Qlik Sense product. In addition, a remote server running the Qlik
Data Gateway Data Movement service connects to this database and replicates
data and changes to various cloud DBs by using replication slots.

When we try to shut down the PostgreSQL service (via Windows Services or
pg_ctl stop -m fast), almost all connections to the PostgreSQL server are
closed, but the Qlik Data Gateway Data Movement connections still show as
ESTABLISHED in the output of netstat -ano.

From the PostgreSQL logs, this is what happens when I try to shut down
PostgreSQL:

2025-07-15 08:20:36.676 UTC [14740] LOG:  could not receive data from
client: An existing connection was forcibly closed by the remote host.
2025-07-15 08:20:43.980 UTC [14516] LOG:  received fast shutdown request
2025-07-15 08:20:43.986 UTC [14516] LOG:  aborting any active transactions
2025-07-15 08:20:43.986 UTC [14604] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [8020] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [13800] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [15232] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [13600] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [10176] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [7648] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [2272] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [11072] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [13304] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.987 UTC [12656] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [15496] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [2276] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [13256] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [10828] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [6016] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:43.988 UTC [13232] FATAL:  terminating connection due to
administrator command
2025-07-15 08:20:44.047 UTC [14516] LOG:  background worker "logical
replication launcher" (PID 6400) exited with exit code 1
2025-07-15 08:20:44.053 UTC [5448] LOG:  shutting down
2025-07-15 08:20:56.512 UTC [4364] LOG:  connection received: host=::1
port=52514
2025-07-15 08:20:56.512 UTC [4364] FATAL:  the database system is shutting
down
2025-07-15 08:20:56.547 UTC [13968] LOG:  connection received:
host=127.0.0.1 port=52515
2025-07-15 08:20:56.547 UTC [13968] FATAL:  the database system is shutting
down
2025-07-15 08:21:56.486 UTC [10456] LOG:  connection received: host=::1
port=52543
2025-07-15 08:21:56.487 UTC [10456] FATAL:  the database system is shutting
down
2025-07-15 08:21:56.522 UTC [4812] LOG:  connection received: host=127.0.0.1
port=52544
2025-07-15 08:21:56.522 UTC [4812] FATAL:  the database system is shutting
down
2025-07-15 08:21:56.561 UTC [15444] LOG:  connection received: host=::1
port=52545
2025-07-15 08:21:56.562 UTC [15444] FATAL:  the database system is shutting
down
2025-07-15 08:21:56.595 UTC [12844] LOG:  connection received:
host=127.0.0.1 port=52546
2025-07-15 08:21:56.595 UTC [12844] FATAL:  the database system is shutting
down
2025-07-15 08:22:56.505 UTC [15004] LOG:  connection received: host=::1
port=52574
...


This went on for 30 minutes or more. Then I checked with netstat (we use
port 4432 instead of 5432 for PostgreSQL):

PS C:\Users\redacted> netstat -ano|findstr 4432
  TCP    0.0.0.0:4432                 0.0.0.0:0                LISTENING       14516
  TCP    redacted_postgres_ip:4432    redacted_dgdm_ip:35818   ESTABLISHED     14516
  TCP    redacted_postgres_ip:4432    redacted_dgdm_ip:35886   ESTABLISHED     14516
  TCP    [::]:4432                    [::]:0                   LISTENING       14516

Here you can see 2 active network connections, even though we initiated a
fast shutdown 30 minutes or more earlier.
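
For anyone who wants to repeat this check, here is a minimal sketch, assuming
Python 3 is available, of how the lingering connections could be picked out of
saved netstat -ano output. The function name established_on_port and the
SAMPLE text are illustrative only; the addresses are the redacted placeholders
from the output above, and the rows are deliberately left wrapped to show how
the sketch rejoins them.

```python
def established_on_port(netstat_text, port):
    """Return (local, remote, pid) for ESTABLISHED TCP rows on a local port."""
    # Rejoin wrapped rows: a line that does not start with a protocol name
    # is treated as the continuation of the previous row.
    rows = []
    for line in netstat_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(("TCP", "UDP")) or not rows:
            rows.append(stripped)
        else:
            rows[-1] += " " + stripped

    result = []
    for row in rows:
        fields = row.split()
        # A complete TCP row has: proto, local, remote, state, pid.
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        _proto, local, remote, state, pid = fields
        local_port = local.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            result.append((local, remote, int(pid)))
    return result


# Placeholder data modelled on the netstat output in this report.
SAMPLE = """\
  TCP    0.0.0.0:4432           0.0.0.0:0              LISTENING       14516
  TCP    redacted_postgres_ip:4432         redacted_dgdm_ip:35818
ESTABLISHED     14516
  TCP    redacted_postgres_ip:4432         redacted_dgdm_ip:35886
ESTABLISHED     14516
  TCP    [::]:4432              [::]:0                 LISTENING       14516
"""

for local, remote, pid in established_on_port(SAMPLE, 4432):
    print(f"{remote} -> {local} still ESTABLISHED (owning PID {pid})")
```

The owning PID reported for both rows is 14516, which in the logs above is the
process that logged "received fast shutdown request".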

I decided to shut down our Qlik Data Gateway Data Movement service on the
other server, and the PostgreSQL server immediately stopped correctly:

...
2025-07-15 08:50:51.799 UTC [5448] LOG:  checkpoint starting: shutdown
immediate
2025-07-15 08:50:51.859 UTC [5448] LOG:  checkpoint complete: wrote 9
buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s,
sync=0.019 s, total=0.067 s; sync files=7, longest=0.004 s, average=0.003 s;
distance=38 kB, estimate=22505 kB
2025-07-15 08:50:51.910 UTC [14516] LOG:  database system is shut down.
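
The "30 minutes or more" can be read directly from the log timestamps. The
following is a minimal sketch, assuming Python 3 and the log line prefix
format shown in this report (timestamp, "UTC", PID in brackets, severity);
the function name shutdown_duration and the SAMPLE_LOG text are illustrative
only.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2025-07-15 08:20:43.980 UTC [14516] LOG:  received fast shutdown request
STAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC \[(\d+)\] (\w+):\s+(.*)$"
)


def shutdown_duration(log_text):
    """Time between the fast shutdown request and the final shutdown message.

    Returns a datetime.timedelta, or None if either message is missing.
    """
    requested = finished = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue  # wrapped continuation line or unrelated text
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        message = match.group(4)
        if requested is None and message.startswith("received fast shutdown request"):
            requested = stamp
        elif message.startswith("database system is shut down"):
            finished = stamp
    if requested is None or finished is None:
        return None
    return finished - requested


# The two relevant lines, copied from the logs in this report.
SAMPLE_LOG = """\
2025-07-15 08:20:43.980 UTC [14516] LOG:  received fast shutdown request
2025-07-15 08:50:51.910 UTC [14516] LOG:  database system is shut down.
"""

print(shutdown_duration(SAMPLE_LOG))
```

For the two lines above this gives a gap of about 30 minutes and 8 seconds
between the shutdown request and the server actually stopping.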

-------

According to https://www.postgresql.org/docs/14/app-pg-ctl.html :
...
“Fast” mode (the default) does not wait for clients to disconnect and will
terminate an online backup in progress. All active transactions are rolled
back and clients are forcibly disconnected, then the server is shut down.
...

In our case at least 2 client connections are not forcibly disconnected,
which is why I am opening this bug.

