Thread: BUG #12833: Cannot cancel query or terminate backend if it client is SIGSTOPed
BUG #12833: Cannot cancel query or terminate backend if it client is SIGSTOPed
From
eshkinkot@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      12833
Logged by:          Sergey Burladyan
Email address:      eshkinkot@gmail.com
PostgreSQL version: 9.4.1
Operating system:   Slackware 14.1
Description:

I run this command in bash:

$ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000)) to stdout' & ( sleep 2; kill -STOP $!; )

$ ps f --ppid $$
  PID TTY      STAT   TIME COMMAND
24773 pts/23   R+     0:00 ps f --ppid 5021
24685 pts/23   T      0:00 ../bin/psql -X -At -c copy (select * from generate_series(1, 100000000)) to stdout

Now psql is stopped and I try to cancel it backend with pg_cancel_backend and pg_terminate_backend, but it not canceled or stopped.

Select from pg_stat_activity still show it as active:

-[ RECORD 1 ]----+-------------------------------------------------------------
datid            | 16384
datname          | sergey
pid              | 24688
usesysid         | 10
usename          | sergey
application_name | psql
client_addr      | <NULL>
client_hostname  | <NULL>
client_port      | -1
backend_start    | 2015-03-05 19:17:03.028235+03
xact_start       | 2015-03-05 19:17:03.030116+03
query_start      | 2015-03-05 19:17:03.030116+03
state_change     | 2015-03-05 19:17:03.030118+03
waiting          | f
state            | active
backend_xid      | <NULL>
backend_xmin     | 1268
query            | copy (select * from generate_series(1, 100000000)) to stdout

$ strace -p 24688
Process 24688 attached
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=24610, si_uid=1000} ---
rt_sigreturn() = 44
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=24610, si_uid=1000} ---
rt_sigreturn() = 44
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0

(gdb) bt
#0  0x00007f0cd2bde88d in send () from /lib64/libc.so.6
#1  0x00000000005da809 in secure_write (port=<optimized out>, ptr=<optimized out>, len=<optimized out>) at be-secure.c:458
#2  0x00000000005e205b in internal_flush () at pqcomm.c:1324
#3  0x00000000005e21ad in internal_putbytes (s=0x1e982fa "372\n", s@entry=0x1e982f8 "20372\n", len=4) at pqcomm.c:1270
#4  0x00000000005e3342 in pq_putmessage (msgtype=msgtype@entry=100 'd', s=0x1e982f8 "20372\n", len=<optimized out>) at pqcomm.c:1467
#5  0x000000000055ccbb in CopySendEndOfRow (cstate=cstate@entry=0x1e97ee8) at copy.c:546
#6  0x000000000055d58a in CopyOneRowTo (cstate=cstate@entry=0x1e97ee8, tupleOid=tupleOid@entry=0, values=0x1ea6d20, nulls=0x1ea6d40 "") at copy.c:1939
#7  0x000000000055e195 in copy_dest_receive (slot=0x1ea5ff8, self=0x1ea1f10) at copy.c:4310
#8  0x00000000005af282 in ExecutePlan (dest=0x1ea1f10, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x1ea5cf0, estate=0x1ea5bd8) at execMain.c:1511
#9  standard_ExecutorRun (queryDesc=0x1ea1f68, direction=<optimized out>, count=0) at execMain.c:319
#10 0x000000000055decf in CopyTo (cstate=0x1e97ee8) at copy.c:1836
#11 DoCopyTo (cstate=cstate@entry=0x1e97ee8) at copy.c:1659
#12 0x0000000000561a97 in DoCopy (stmt=stmt@entry=0x1e62f30, queryString=0x1e61e48 "copy (select * from generate_series(1, 100000000)) to stdout", processed=processed@entry=0x7fffb48692f8) at copy.c:878
#13 0x00000000006b0509 in standard_ProcessUtility (parsetree=0x1e62f30, queryString=<optimized out>, context=<optimized out>, params=0x0, dest=<optimized out>, completionTag=<optimized out>) at utility.c:525
#14 0x00000000006ad3c1 in PortalRunUtility (portal=portal@entry=0x1e9dba8, utilityStmt=utilityStmt@entry=0x1e62f30, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1e632d8, completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:1187
#15 0x00000000006ae06a in PortalRunMulti (portal=portal@entry=0x1e9dba8, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1e632d8, altdest=altdest@entry=0x1e632d8, completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:1318
#16 0x00000000006aebff in PortalRun (portal=portal@entry=0x1e9dba8, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1e632d8, altdest=altdest@entry=0x1e632d8, completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:816
#17 0x00000000006ac6ed in exec_simple_query (query_string=0x1e61e48 "copy (select * from generate_series(1, 100000000)) to stdout") at postgres.c:1072
#18 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1dd3730, dbname=0x1dd3590 "sergey", username=<optimized out>) at postgres.c:4074
#19 0x000000000045eb58 in BackendRun (port=0x1e1bac0) at postmaster.c:4155
#20 BackendStartup (port=0x1e1bac0) at postmaster.c:3829
#21 ServerLoop () at postmaster.c:1597
#22 0x0000000000652279 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1dd2750) at postmaster.c:1244
#23 0x000000000045f9af in main (argc=3, argv=0x1dd2750) at main.c:228

git 2570e28 REL9_4_STABLE | PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit
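
The trace and backtrace above show the backend sitting in send() while the client is stopped. The Python sketch below illustrates the general socket behaviour behind that, not PostgreSQL code: a local socket pair stands in for the backend/psql connection, and fill_send_buffer is a name made up for this illustration. When the peer never reads, the kernel accepts only a bounded amount of data, and a blocking send() past that point would simply sleep.

```python
import socket


def fill_send_buffer(chunk_size=8192):
    """Write to a socket whose peer never reads.

    Returns (bytes_accepted, would_block). The socket is non-blocking here
    only so the sketch can observe the "would block" condition and return;
    a blocking socket would sleep inside send() at the same point.
    """
    writer, reader = socket.socketpair()  # reader plays the stopped client
    writer.setblocking(False)
    total = 0
    would_block = False
    try:
        while True:
            try:
                total += writer.send(b"x" * chunk_size)
            except BlockingIOError:
                # Kernel buffers are full: nothing more is accepted until
                # the peer reads some data.
                would_block = True
                break
    finally:
        writer.close()
        reader.close()
    return total, would_block


total, would_block = fill_send_buffer()
print("bytes accepted before blocking:", total)
```

The exact byte count depends on the platform's socket buffer sizes, so the sketch prints it rather than assuming a value.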
Re: BUG #12833: Cannot cancel query or terminate backend if it client is SIGSTOPed
From
Tom Lane
Date:
eshkinkot@gmail.com writes:
> I run this command in bash:
> $ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000))
> to stdout' & ( sleep 2; kill -STOP $!; )
> $ ps f --ppid $$
> PID TTY STAT TIME COMMAND
> 24773 pts/23 R+ 0:00 ps f --ppid 5021
> 24685 pts/23 T 0:00 ../bin/psql -X -At -c copy (select * from
> generate_series(1, 100000000)) to stdout
> Now psql is stopped and I try to cancel it backend with
> pg_cancel_backend and pg_terminate_backend, but it not canceled or stopped.

[ shrug... ] It'll probably terminate the query whenever the kernel
returns from send(). There aren't a lot of options here: the only
way we could get out of this without waiting for the client is a
catastrophic termination of the session, which is not really what
either of those operations authorizes. There's no way to do anything
less drastic without breaking protocol sync.

			regards, tom lane
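
The "whenever the kernel returns from send()" behaviour can be imitated in a few lines of Python. This is a loose analogy under stated assumptions, not PostgreSQL's implementation: handle_cancel, slow_client and run_demo are hypothetical names, SIGUSR1 stands in for the cancel signal, and a thread that sleeps before reading stands in for a client that is stopped and later resumed. The handler only records the request, so the interrupted send is restarted and the request is noticed only after the write finally completes.

```python
import signal
import socket
import threading
import time

cancel_pending = False


def handle_cancel(signum, frame):
    # Only note the request; do not raise, so the interrupted send is retried.
    global cancel_pending
    cancel_pending = True


def slow_client(sock, nbytes, delay):
    # Stand-in for a stopped client: read nothing for `delay` seconds.
    time.sleep(delay)
    remaining = nbytes
    while remaining > 0:
        data = sock.recv(65536)
        if not data:
            break
        remaining -= len(data)


def run_demo(delay=0.5):
    """Must run in the main thread (Python only allows signal.signal there)."""
    global cancel_pending
    cancel_pending = False
    old_handler = signal.signal(signal.SIGUSR1, handle_cancel)
    backend, client = socket.socketpair()
    payload = b"x" * (8 * 1024 * 1024)  # far larger than the socket buffer
    reader = threading.Thread(target=slow_client,
                              args=(client, len(payload), delay))
    # Deliver the "cancel" to this thread while it is blocked in sendall().
    canceller = threading.Timer(0.1, signal.pthread_kill,
                                args=(threading.get_ident(), signal.SIGUSR1))
    reader.start()
    canceller.start()
    start = time.monotonic()
    backend.sendall(payload)  # stays blocked until the client drains data
    blocked_for = time.monotonic() - start
    noticed = cancel_pending  # the request is only looked at here
    reader.join()
    canceller.join()
    backend.close()
    client.close()
    signal.signal(signal.SIGUSR1, old_handler)
    return noticed, blocked_for


noticed, blocked_for = run_demo()
print("cancel noticed:", noticed, "after blocking for", round(blocked_for, 2), "s")
```

The signal arrives about 0.1 s in, yet the write keeps waiting until the peer starts reading, which mirrors the restarted sendto() calls in the strace output of the report.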
Re: BUG #12833: Cannot cancel query or terminate backend if it client is SIGSTOPed
From
Andres Freund
Date:
On 2015-03-05 12:33:22 -0500, Tom Lane wrote:
> eshkinkot@gmail.com writes:
> > I run this command in bash:
> > $ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000))
> > to stdout' & ( sleep 2; kill -STOP $!; )
> >
> > $ ps f --ppid $$
> > PID TTY STAT TIME COMMAND
> > 24773 pts/23 R+ 0:00 ps f --ppid 5021
> > 24685 pts/23 T 0:00 ../bin/psql -X -At -c copy (select * from
> > generate_series(1, 100000000)) to stdout
> >
> > Now psql is stopped and I try to cancel it backend with
> > pg_cancel_backend and pg_terminate_backend, but it not canceled or stopped.

9.5 should allow sessions to be terminated, but not cancelled.
Unfortunately this is too big a change to be backported, so you'll have
to wait for that.

> [ shrug... ] It'll probably terminate the query whenever the kernel
> returns from send(). There aren't a lot of options here: the only
> way we could get out of this without waiting for the client is a
> catastrophic termination of the session, which is not really what
> either of those operations authorizes. There's no way to do anything
> less drastic without breaking protocol sync.

Well, terminate pretty much authorizes it, no? At least that's what we
decided in the "Escaping from blocked send() reprised." thread. If we
were blocked in a send() and asked to die, we'll now do so.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
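
For contrast, here is a rough Python sketch of the "asked to die while blocked in a send" case. It is an analogy only and does not describe how 9.5 implements this: TerminateRequested, handle_terminate and run_terminate_demo are hypothetical names, and SIGUSR2 stands in for the terminate signal. The handler raises, which aborts the pending write; the connection is then dropped without sending anything further, since the message stream is no longer in sync.

```python
import signal
import socket
import threading


class TerminateRequested(Exception):
    """Raised by the handler to abandon a write in progress."""


def handle_terminate(signum, frame):
    raise TerminateRequested()


def run_terminate_demo():
    """Must run in the main thread (Python only allows signal.signal there)."""
    old_handler = signal.signal(signal.SIGUSR2, handle_terminate)
    backend, client = socket.socketpair()  # client never reads: "stopped"
    backend.settimeout(5.0)  # safety net so the sketch cannot hang forever
    payload = b"x" * (8 * 1024 * 1024)  # far larger than the socket buffer
    killer = threading.Timer(0.1, signal.pthread_kill,
                             args=(threading.get_ident(), signal.SIGUSR2))
    terminated = False
    try:
        killer.start()
        backend.sendall(payload)  # blocks: the peer is not reading
    except TerminateRequested:
        # Give up mid-message; the session cannot be reused after this.
        terminated = True
    except socket.timeout:
        pass  # signal never arrived; report terminated=False
    finally:
        killer.join()
        backend.close()
        client.close()
        signal.signal(signal.SIGUSR2, old_handler)
    return terminated


terminated = run_terminate_demo()
print("session dropped while blocked in send:", terminated)
```

Abandoning the write is only acceptable because the whole session goes away; a plain cancel would need the connection to stay usable, which matches the distinction drawn between terminate and cancel in the message.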