BUG #12833: Cannot cancel query or terminate backend if its client is SIGSTOPed
The following bug has been logged on the website:
Bug reference: 12833
Logged by: Sergey Burladyan
Email address: eshkinkot@gmail.com
PostgreSQL version: 9.4.1
Operating system: Slackware 14.1
Description:
I run this command in bash:
$ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000))
to stdout' & ( sleep 2; kill -STOP $!; )
$ ps f --ppid $$
PID TTY STAT TIME COMMAND
24773 pts/23 R+ 0:00 ps f --ppid 5021
24685 pts/23 T 0:00 ../bin/psql -X -At -c copy (select * from
generate_series(1, 100000000)) to stdout
Now psql is stopped and I try to cancel its backend with
pg_cancel_backend and pg_terminate_backend, but it is not canceled or stopped.
A select from pg_stat_activity still shows it as active:
-[ RECORD 1 ]----+-------------------------------------------------------------
datid | 16384
datname | sergey
pid | 24688
usesysid | 10
usename | sergey
application_name | psql
client_addr | <NULL>
client_hostname | <NULL>
client_port | -1
backend_start | 2015-03-05 19:17:03.028235+03
xact_start | 2015-03-05 19:17:03.030116+03
query_start | 2015-03-05 19:17:03.030116+03
state_change | 2015-03-05 19:17:03.030118+03
waiting | f
state | active
backend_xid | <NULL>
backend_xmin | 1268
query | copy (select * from generate_series(1, 100000000)) to stdout
$ strace -p 24688
Process 24688 attached
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=24610, si_uid=1000} ---
rt_sigreturn() = 44
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=24610, si_uid=1000} ---
rt_sigreturn() = 44
sendto(8, "\nd\0\0\0\n19628\nd\0\0\0\n19629\nd\0\0\0\n1963"..., 8192, 0, NULL, 0
(gdb) bt
#0 0x00007f0cd2bde88d in send () from /lib64/libc.so.6
#1 0x00000000005da809 in secure_write (port=<optimized out>, ptr=<optimized
out>, len=<optimized out>) at be-secure.c:458
#2 0x00000000005e205b in internal_flush () at pqcomm.c:1324
#3 0x00000000005e21ad in internal_putbytes (s=0x1e982fa "372\n",
s@entry=0x1e982f8 "20372\n", len=4) at pqcomm.c:1270
#4 0x00000000005e3342 in pq_putmessage (msgtype=msgtype@entry=100 'd',
s=0x1e982f8 "20372\n", len=<optimized out>) at pqcomm.c:1467
#5 0x000000000055ccbb in CopySendEndOfRow (cstate=cstate@entry=0x1e97ee8)
at copy.c:546
#6 0x000000000055d58a in CopyOneRowTo (cstate=cstate@entry=0x1e97ee8,
tupleOid=tupleOid@entry=0, values=0x1ea6d20, nulls=0x1ea6d40 "") at
copy.c:1939
#7 0x000000000055e195 in copy_dest_receive (slot=0x1ea5ff8, self=0x1ea1f10)
at copy.c:4310
#8 0x00000000005af282 in ExecutePlan (dest=0x1ea1f10, direction=<optimized
out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT,
planstate=0x1ea5cf0, estate=0x1ea5bd8)
at execMain.c:1511
#9 standard_ExecutorRun (queryDesc=0x1ea1f68, direction=<optimized out>,
count=0) at execMain.c:319
#10 0x000000000055decf in CopyTo (cstate=0x1e97ee8) at copy.c:1836
#11 DoCopyTo (cstate=cstate@entry=0x1e97ee8) at copy.c:1659
#12 0x0000000000561a97 in DoCopy (stmt=stmt@entry=0x1e62f30,
queryString=0x1e61e48 "copy (select * from generate_series(1, 100000000)) to
stdout", processed=processed@entry=0x7fffb48692f8)
at copy.c:878
#13 0x00000000006b0509 in standard_ProcessUtility (parsetree=0x1e62f30,
queryString=<optimized out>, context=<optimized out>, params=0x0,
dest=<optimized out>,
completionTag=<optimized out>) at utility.c:525
#14 0x00000000006ad3c1 in PortalRunUtility (portal=portal@entry=0x1e9dba8,
utilityStmt=utilityStmt@entry=0x1e62f30, isTopLevel=isTopLevel@entry=1
'\001', dest=dest@entry=0x1e632d8,
completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:1187
#15 0x00000000006ae06a in PortalRunMulti (portal=portal@entry=0x1e9dba8,
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1e632d8,
altdest=altdest@entry=0x1e632d8,
completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:1318
#16 0x00000000006aebff in PortalRun (portal=portal@entry=0x1e9dba8,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
dest=dest@entry=0x1e632d8,
altdest=altdest@entry=0x1e632d8,
completionTag=completionTag@entry=0x7fffb4869650 "") at pquery.c:816
#17 0x00000000006ac6ed in exec_simple_query (query_string=0x1e61e48 "copy
(select * from generate_series(1, 100000000)) to stdout") at
postgres.c:1072
#18 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1dd3730,
dbname=0x1dd3590 "sergey", username=<optimized out>) at postgres.c:4074
#19 0x000000000045eb58 in BackendRun (port=0x1e1bac0) at postmaster.c:4155
#20 BackendStartup (port=0x1e1bac0) at postmaster.c:3829
#21 ServerLoop () at postmaster.c:1597
#22 0x0000000000652279 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x1dd2750) at postmaster.c:1244
#23 0x000000000045f9af in main (argc=3, argv=0x1dd2750) at main.c:228
git 2570e28 REL9_4_STABLE | PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu,
compiled by gcc (GCC) 4.8.2, 64-bit
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
eshkinkot@gmail.com writes:
> I run this command in bash:
> $ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000))
> to stdout' & ( sleep 2; kill -STOP $!; )
> $ ps f --ppid $$
> PID TTY STAT TIME COMMAND
> 24773 pts/23 R+ 0:00 ps f --ppid 5021
> 24685 pts/23 T 0:00 ../bin/psql -X -At -c copy (select * from
> generate_series(1, 100000000)) to stdout
> Now psql is stopped and I try to cancel its backend with
> pg_cancel_backend and pg_terminate_backend, but it is not canceled or stopped.
[ shrug... ] It'll probably terminate the query whenever the kernel
returns from send(). There aren't a lot of options here: the only
way we could get out of this without waiting for the client is a
catastrophic termination of the session, which is not really what
either of those operations authorizes. There's no way to do anything
less drastic without breaking protocol sync.
regards, tom lane
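
The blocked send() described above can be reproduced without PostgreSQL. The following is a minimal sketch in Python, not backend code: it assumes a local socket pair whose reading end never reads (standing in for the stopped psql), and it makes the writer non-blocking so that the point where a blocking send() would hang shows up as an error instead. The function name `bytes_until_blocked` and the 8192-byte chunk size (borrowed from the strace output) are illustrative choices.

```python
import socket


def bytes_until_blocked(chunk_size=8192):
    """Write to a peer that never reads; return the bytes accepted before stalling.

    Hypothetical helper for illustration only. With a blocking socket the
    final send() would simply not return, which is the state the backend
    is in while its client is stopped.
    """
    writer, reader = socket.socketpair()
    writer.setblocking(False)  # report "would block" instead of hanging
    chunk = b"d" * chunk_size
    accepted = 0
    try:
        while True:
            try:
                accepted += writer.send(chunk)
            except BlockingIOError:
                # Kernel buffers are full and nobody is draining them.
                return accepted
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print("bytes accepted before send() would block:", bytes_until_blocked())
```

The exact byte count depends on the kernel's socket buffer sizes, so no particular value should be expected.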
On 2015-03-05 12:33:22 -0500, Tom Lane wrote:
> eshkinkot@gmail.com writes:
> > I run this command in bash:
> > $ ../bin/psql -X -At -c 'copy (select * from generate_series(1, 100000000))
> > to stdout' & ( sleep 2; kill -STOP $!; )
> > $ ps f --ppid $$
> > PID TTY STAT TIME COMMAND
> > 24773 pts/23 R+ 0:00 ps f --ppid 5021
> > 24685 pts/23 T 0:00 ../bin/psql -X -At -c copy (select * from
> > generate_series(1, 100000000)) to stdout
> > Now psql is stopped and I try to cancel its backend with
> > pg_cancel_backend and pg_terminate_backend, but it is not canceled or stopped.
9.5 should allow sessions to be terminated, but not
cancelled. Unfortunately this is too big a change to be backported, so
you'll have to wait for that.
> [ shrug... ] It'll probably terminate the query whenever the kernel
> returns from send(). There aren't a lot of options here: the only
> way we could get out of this without waiting for the client is a
> catastrophic termination of the session, which is not really what
> either of those operations authorizes. There's no way to do anything
> less drastic without breaking protocol sync.
Well, terminate pretty much authorizes it, no? At least that's what we
decided in the "Escaping from blocked send() reprised." thread. If we
were blocked in a send() and asked to die we'll now do so.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
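
The idea of dying while blocked in send() can be illustrated in general terms. The following is a minimal sketch in Python, not PostgreSQL's implementation: it assumes a non-blocking socket and a flag that a terminate request sets, and it checks that flag whenever the socket is not writable. The names `send_all_or_die`, `Terminated`, and `terminate_requested`, as well as the polling interval, are hypothetical.

```python
import select
import socket
import threading


class Terminated(Exception):
    """Raised when a terminate request arrives while the send is stalled."""


def send_all_or_die(sock, data, terminate_requested, poll_interval=0.05):
    """Send all of data, but give up if terminate_requested becomes set.

    Hypothetical helper for illustration only. The socket is switched to
    non-blocking mode so that a stalled peer cannot trap us inside send().
    """
    sock.setblocking(False)
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        if terminate_requested.is_set():
            # The session is being dropped, so an unfinished message is
            # acceptable: nobody will try to resynchronize the stream.
            raise Terminated(f"gave up after {sent} of {len(view)} bytes")
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # Peer is not reading; wait briefly for writability, then
            # loop around to re-check the terminate flag.
            select.select([], [sock], [], poll_interval)
    return sent


if __name__ == "__main__":
    writer, reader = socket.socketpair()   # reader never reads
    flag = threading.Event()
    threading.Timer(0.2, flag.set).start()  # simulated terminate request
    try:
        send_all_or_die(writer, b"x" * (32 * 1024 * 1024), flag)
    except Terminated as exc:
        print("terminated:", exc)
```

The same trick does not carry over to a cancel request as easily: abandoning a partly sent message while keeping the session open would leave the client unable to tell where the next message begins, which is the protocol-sync concern raised earlier in the thread.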