The same again with 16.9 : was Re: PostgreSQL 16.6 , query stuck with STAT Ssl, wait_event_type : IPC , wait_event : ParallelFinish - Mailing list pgsql-admin
From | Achilleas Mantzios |
---|---|
Subject | The same again with 16.9 : was Re: PostgreSQL 16.6 , query stuck with STAT Ssl, wait_event_type : IPC , wait_event : ParallelFinish |
Date | |
Msg-id | 1332203e-fe17-4800-aadc-4de4a93fc85d@cloud.gatewaynet.com |
In response to | PostgreSQL 16.6 , query stuck with STAT Ssl, wait_event_type : IPC , wait_event : ParallelFinish (Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com>) |
Responses |
Re: The same again with 16.9 : was Re: PostgreSQL 16.6 , query stuck with STAT Ssl, wait_event_type : IPC , wait_event : ParallelFinish
|
List | pgsql-admin |
Hi,
we had the same problem again today.
postgres@[local]/dynacom=# select * from pg_stat_activity where application_name~*'dbmirr';
-[ RECORD 1 ]----+--------------------------------------------------------------
datid | 207491653
datname | dynacom
pid | 1821681
leader_pid |
usesysid | 10
usename | postgres
application_name | DBMIRROR
client_addr | 10.9.0.10
client_hostname |
client_port | 45051
backend_start | 2025-08-22 03:58:32.321683+03
xact_start | 2025-08-22 04:06:08.897252+03
query_start | 2025-08-22 04:06:09.254048+03
state_change | 2025-08-22 04:06:09.254049+03
wait_event_type | IPC
wait_event | ParallelFinish
state | active
backend_xid |
backend_xmin | 222705697
query_id | -3929522546936394707
query | SELECT pd.XID,MAX(SeqId) FROM dbmirror_Pending pd LEFT JOIN dbmirror_MirroredTransaction mt INNER JOIN dbmirror_MirrorHost mh ON mt.MirrorHostId = mh.MirrorHostId AND mh.HostName= '192.168.145.1' ON pd.XID = mt.XID WHERE mt.XID is null and (pd.slaveid is null or pd.slaveid = '4826') GROUP BY pd.XID ORDER BY MAX(pd.SeqId)
backend_type | client backend
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 1821681 0.5 4.8 37111844 3177260 ? Ssl 03:58 2:25 postgres: postgres dynacom 10.9.0.10(45051) SELECT
postgres@smadb:~$
Again the process is stuck with this Ssl state.
strace: Process 1821681 attached
epoll_wait(12,
postgres 1821681 postgres DEL REG 0,1 33 /SYSV00041c60
postgres 1821681 postgres 3u a_inode 0,15 0 1059 [signalfd]
postgres 1821681 postgres 4r FIFO 0,14 0t0 1292545666 pipe
postgres 1821681 postgres 12u a_inode 0,15 0 1059 [eventpoll:3,4]
10.9.0.10(45051) [1821681] 68a7c0b8.1bcbf1 2025-08-22 11:09:32.517 EEST DBMIRROR postgres@dynacom line:27 STATEMENT: SELECT pd.XID,MAX(SeqId) FROM dbmirror_Pending pd LEFT JOIN dbmirror_MirroredTransaction mt INNER JOIN dbmirror_MirrorHost mh ON mt.MirrorHostId = mh.MirrorHostId AND mh.HostName= '192.168.145.1' ON pd.XID = mt.XID WHERE mt.XID is null and (pd.slaveid is null or pd.slaveid = '4826') GROUP BY pd.XID ORDER BY MAX(pd.SeqId)
10.9.0.10(45051) [1821681] 68a7c0b8.1bcbf1 2025-08-22 11:09:32.519 EEST DBMIRROR postgres@dynacom line:28 LOG: disconnection: session time: 7:11:00.197 user=postgres database=dynacom host=10.9.0.10 port=45051
On 1/6/25 07:19, Tom Lane wrote:
> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
>> a query is stuck with the above, it seems it waits for parallel worker to finish, however , there are no parallel works running :
>
> You didn't explain the subject about "STAT Ssl", but if you mean that that was what ps was showing for the backend process, there's something very wrong there. According to "man ps", the "l" means
>
>     l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
>
> which is something that a Postgres backend should never be (in existing releases anyway).
Yes, sorry, I didn't include this info. You are spot on: this is the output of ps aux.
> So I'm speculating that the process somehow became multi-threaded and then some wakeup signal went to the wrong thread. We've had issues with perl or python introducing multi-threading because of plperl or plpython functions doing things they probably shouldn't. Do you have any of those in your system?
Yes, we have only two Perl functions, which I'd be happy to get rid of:
postgres@[local]/dynacom=# select p.proname, l.lanname from pg_language l, pg_proc p where p.prolang=l.oid and l.lanname ~* '.*perl.*';
proname | lanname
----------+---------
basename | plperlu
filetype | plperlu
(2 rows)
Neither is used in the app; they are just two utility functions that help us batch-insert some attachments, guess the MIME type, etc. However, the calling client is Perl, based on libpg-perl (not DBI); basically it is a descendant of DBMirror.pl (we are still using it).
The strange thing is that we have been running pgsql 16.* since November and our version of DBMirror since 2005 (and PostgreSQL since 2001), and we never had this problem before (at least as far as I know).
> regards, tom lane
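The "l" flag in the ps STAT column, as discussed in the quoted reply, means the process has more than one thread. A minimal sketch of how that could be double-checked for a given backend PID follows. It assumes a Linux host with /proc mounted, and the function name thread_count is made up for illustration; it is not something used in this thread.

```shell
#!/bin/sh
# Sketch only: count the kernel threads (LWPs) of a process on Linux.
# Assumption: /proc is mounted; "thread_count" is a hypothetical helper name.
thread_count() {
    pid="$1"
    if [ ! -d "/proc/$pid/task" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # /proc/<pid>/task holds one subdirectory per thread of the process.
    # A single-threaded process therefore shows exactly one entry.
    ls "/proc/$pid/task" | wc -l | tr -d ' '
}
```

Calling `thread_count 1821681` against the stuck backend would then print its thread count; anything above 1 would be consistent with the "l" in Ssl. With procps, `ps -o nlwp= -p 1821681` should report the same number.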