FATAL: terminating connection due to administrator command
Hi all,
We're seeing the FATAL error message from the subject pop up in our logs
at regular intervals, but I haven't been able to pinpoint what is causing
it. I'm hoping for some insights here.
We run a PostgreSQL 11.9 server on CentOS 7, within a VMware environment:
PostgreSQL 11.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
The package was installed from the PGDG repository.
I'm not even sure I should be worried; there doesn't appear to be any
impact on the server's functioning, but it does say 'FATAL'.
What we're seeing are lines like these two instances:
2020-09-30 21:45:09.999 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "util_asdf"."v_something_something2"
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf ERROR: canceling
statement due to user request
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "dm_asdf"."asdf_func"
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf ERROR: canceling
statement due to user request
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "dm_asdf"."asdf_func"
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales ERROR:
canceling statement due to user request
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales
STATEMENT: select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30649] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30649] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30648] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30648] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30647] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30647] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30646] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30646] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30645] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30645] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30646) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30645) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30647) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30648) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30649) exited with exit code 1
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales ERROR:
canceling statement due to user request
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales
STATEMENT: select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30659] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30659] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30658] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30658] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30657] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30657] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30656] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30656] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30655] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30655] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30655) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30656) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30657) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30658) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30659) exited with exit code 1
2020-09-30 22:43:08.459 CEST [8055] 172.30.2.25 selfservice_prd ERROR:
schema "somethingelse" does not exist at character 71
Apparently, something is sending SIGTERM to our pg processes. I know that
I'm not doing that, certainly not at those hours, and I'm the one who set
up this system and its only DBA.
Advice I found on the Internet is to use SystemTap with a tap script, but
the scripts that I found just displayed the PIDs of processes without
telling me their names, which wasn't very useful for figuring out who was
responsible. So I made an attempt (I have no experience with stap) at
modifying one to print the process names of the signal's sender and target:
/*
 * killsnoop-nd.stp    Trace process signals.
 * For Linux, uses SystemTap (non-debuginfo).
 *
 * Copyright (C) 2015 Brendan Gregg.
 * (etc)
 */
global target;
global signal;

probe begin
{
    printf("%-6s %-12s %-5s %-6s %-12s\n", "FROM", "COMMAND", "SIG",
        "TO", "COMMAND");
}

probe nd_syscall.kill
{
    /* remember the kill() arguments for the calling thread */
    target[tid()] = uint_arg(1);    /* target PID */
    signal[tid()] = uint_arg(2);    /* signal number */
}

probe nd_syscall.kill.return
{
    /* report only SIGTERM (15) sent to an actual PID */
    if (signal[tid()] == 15 && target[tid()] != 0) {
        printf("%-6d %-12s %-5d %-6d %-12s\n",
            pid(), execname(),
            signal[tid()],
            target[tid()],
            pid2execname(target[tid()]));
    }
    delete target[tid()];
    delete signal[tid()];
}
The output of last night was:
FROM COMMAND SIG TO COMMAND
30068 systemd-udevd 15 14151 systemd-udevd
30068 systemd-udevd 15 14836 systemd-udevd
(...)
6482 postmaster 15 30649 postmaster
6482 postmaster 15 30648 postmaster
6482 postmaster 15 30647 postmaster
6482 postmaster 15 30646 postmaster
6482 postmaster 15 30645 postmaster
6482 postmaster 15 30659 postmaster
6482 postmaster 15 30658 postmaster
6482 postmaster 15 30657 postmaster
6482 postmaster 15 30656 postmaster
6482 postmaster 15 30655 postmaster
6482 postmaster 15 2065 postmaster
6482 postmaster 15 2064 postmaster
6482 postmaster 15 2063 postmaster
Several of these TO PIDs match those in the PG log.
$ ps aux | grep 6482
postgres 6482 0.0 0.5 51755776 1043872 ? Ss Sep28 3:25
/usr/pgsql-11/bin/postmaster -D /data/11/data
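Incidentally, one way to check from inside the database which of those
short-lived PIDs belong to parallel workers (an assumption on my part that
that's what they are) is the backend_type column that pg_stat_activity has
since PostgreSQL 10 — something like this, run while the query is active:

```sql
-- List current backends and their types; parallel workers show up with
-- backend_type = 'parallel worker' for as long as the query runs.
SELECT pid, backend_type, state, query
FROM pg_stat_activity
ORDER BY backend_type;
```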
Am I correct in concluding that postmaster is killing its own processes?
If so, what is going on here? And more importantly, what do I do about it?
Regards,
Alban Hertroys.
P.S. I'm mailing from my work account, as this is work related. You may
remember me from my private account.
Alban Hertroys
D: +31 (0)53 4 888 888 | T: +31 (0)53 4888 888 | E: alban.hertroys@apollovredestein.com
Apollo Vredestein B.V.| Ir. E.L.C. Schiffstraat 370, 7547 RD Enschede, The
Netherlands
Chamber of Commerce number: 34223268
The information contained in this e-mail is intended solely for the use of the
individual or entity to whom it is addressed. If you are not the intended
recipient, you are hereby notified that any disclosure, copying, distribution
or action in relation to the contents of this information is strictly
prohibited and may be unlawful and request you to delete this message and any
attachments and advise the sender by return e-mail. The confidentiality of this
message is not warranted. Apollo Vredestein and its subsidiaries rule out any
and every liability resulting from this or any other electronic transmission
Please consider the environment before printing this e-mail
On Thu, Oct 1, 2020 at 2:47 PM Alban Hertroys <
alban.hertroys@apollovredestein.com> wrote:
I am guessing that 6 background workers are started; one worker had the
result, hence killing the other 5 workers. Maybe some more pg experts
can comment. Anyway, an EXPLAIN of your query would help.
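For example, for the query from your log:

```sql
-- EXPLAIN shows the plan without running the query; a "Workers Planned"
-- line in the output indicates parallel workers will be used.
EXPLAIN SELECT count(*) FROM "dm_b2b"."prlwytzkofskiv1";
```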
Regards,
Seenu.
"Srinivasa T N" <seenutn@gmail.com> wrote on 01/10/2020 11:47:33:
I am guessing that 6 background workers are started; one worker had the
result, hence killing the other 5 workers. Maybe some more pg experts
can comment. Anyway, an EXPLAIN of your query would help.
I think you may have the right of it:
                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=3065970.74..3065970.75 rows=1 width=8)
   ->  Gather  (cost=3065970.21..3065970.72 rows=5 width=8)
         Workers Planned: 5
         ->  Partial Aggregate  (cost=3064970.21..3064970.22 rows=1 width=8)
               ->  Nested Loop Left Join  (cost=2772.30..2743631.23 rows=128535594 width=0)
                     Join Filter: ((avl.xxxxxxxxxx)::text <> ''::text)
                     ->  Parallel Hash Left Join  (cost=2772.01..943286.00 rows=5574292 width=13)
                           Hash Cond: (avl.xxxxxxxxxxxxxxxxx = (dc.xxxxxxxxxxxxx)::integer)
                           ->  Parallel Seq Scan on xxxxxxxxxxxxxxxxxxxx avl  (cost=0.00..596772.71 rows=5574171 width=21)
                           ->  Parallel Hash  (cost=2262.01..2262.01 rows=40800 width=8)
                                 ->  Parallel Index Only Scan using xxxxxxxxxxxxxxxxxxxxxxxxxxxx on xxxxxxxxxxxxxxxxxxxx dc  (cost=0.42..2
                     ->  Index Scan using ix_xxxxxxxxxxxxxxxxxxxxxxxxxxxx on xxxxxxxxxxxxxxxxxxx dm  (cost=0.29..0.31 rows=1 width=19)
                           Index Cond: ((avl.xxxxxxxxxx)::text = (xxxxxxxxxx)::text)
                           Filter: ((xxxxxxxxxx)::text <> ''::text)
(14 rows)
So, apparently these FATAL errors are just caused by parallel workers
being aborted because they're no longer needed. Good to know.
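For anyone who wants to verify this theory on their own system, a simple
check (a sketch, using the standard parallel-query setting) is to disable
parallelism for one session and see whether the FATAL messages stop:

```sql
-- With parallel query disabled for this session, no parallel workers are
-- launched, so there are none for the postmaster to terminate afterwards.
SET max_parallel_workers_per_gather = 0;
SELECT count(*) FROM "dm_b2b"."prlwytzkofskiv1";
```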
Regards,
Alban.