postmaster deadlock while logging after syslogger exited
Hello,
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it. The net result is that new connections to PostgreSQL
hang, and any query that causes PostgreSQL to log appears to hang, though
queries appear to work on existing connections as long as they do not cause
the backend to log anything. I don't believe the system would ever get out
of this state.
I ran into this with version 9.2.4 on SmartOS (illumos), but it doesn't
look to me like any of the relevant code has changed in 9.6 or 10, nor is
it platform-specific. Version 10.0 still:
- restarts the syslogger in response to a SIGCHLD signal (in "reaper")
- starts autovacuum workers in response to SIGUSR1 (in "sigusr1_handler")
- masks SIGCHLD at the beginning of the SIGUSR1 handler
- potentially logs from StartAutoVacWorker (and, of course, many other
places in the postmaster)
which I think means the problem is still possible. Am I missing
something? I couldn't find anything about this issue. Is this just a
rarely seen bug?
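The blocking behavior at the core of this is ordinary pipe semantics: a write to a full pipe blocks for as long as any read end remains open, and the postmaster itself keeps a read end open so it can hand the pipe to a restarted syslogger. A minimal sketch of that situation (Python, illustrative only, not PostgreSQL code; non-blocking writes are used so the demo reports "pipe full" instead of hanging the way the postmaster did):

```python
import os

# Create a pipe and keep the read end open, standing in for the copy
# the postmaster retains for a future syslogger restart.
r, w = os.pipe()

# Use a non-blocking write end so we can observe the "pipe full"
# condition instead of blocking forever.
os.set_blocking(w, False)

total = 0
try:
    while True:
        total += os.write(w, b"x" * 4096)
except BlockingIOError:
    # The pipe is full. With a blocking descriptor this write would
    # hang indefinitely: because a read end is still open, the kernel
    # waits for a reader rather than failing with EPIPE.
    pass
```

On Linux the loop typically stops after the default 64 KiB pipe capacity is consumed; the key point is that the failure mode is "wait", not "error".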
--------------------
Here's the raw data: my initial observation was that when I ran "psql" to
connect to the database, it would hang (for hours, it turned out). My
postmaster process was blocked here, writing to its own stderr pipe:
$ mdb core.41349
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
$C
fffffd7fffdf9df0 libc.so.1`__write+0xa()
fffffd7fffdfb220 write_pipe_chunks+0x142()
fffffd7fffdfb280 send_message_to_server_log+0x55b()
fffffd7fffdfb2a0 EmitErrorReport+0xe6()
fffffd7fffdfb390 errfinish+0x1ac()
fffffd7fffdfb3b0 StartAutoVacWorker+0x5c()
fffffd7fffdfb3e0 StartAutovacuumWorker+0x6f()
fffffd7fffdfb410 sigusr1_handler+0x185()
fffffd7fffdfb420 libc.so.1`__sighndlr+6()
fffffd7fffdfb4b0 libc.so.1`call_user_handler+0x1db(10, 0, fffffd7fffdfb520)
fffffd7fffdfb500 libc.so.1`sigacthandler+0x116(10, 0, fffffd7fffdfb520)
fffffd7fffdfb970 libc.so.1`__pollsys+0xa()
fffffd7fffdfba90 libc.so.1`pselect+0x1cb(5, fffffd7fffdfbaf0, 0, 0,
fffffd7fffdfbaa0, 0)
fffffd7fffdfbae0 libc.so.1`select+0x5a(5, fffffd7fffdfbaf0, 0, 0,
fffffd7fffdfdaf0)
fffffd7fffdffb40 ServerLoop+0xb5()
fffffd7fffdffbe0 PostmasterMain+0xec2()
fffffd7fffdffc00 main+0x23a()
fffffd7fffdffc10 _start+0x6c()
From inspecting kernel state, I found that the only other file descriptor
on the system that corresponds to the other side of this pipe is in the
postmaster itself, which it keeps open in case it needs to restart the
syslogger. Based on the code, I expected the syslogger subprocess to have
the pipe open, but that process appears to have exited due to a fatal error
(out of memory). (I know it exited because the process still exists in the
kernel -- it hasn't been reaped yet -- and I think it ran out of memory
based on a log message I found from around the time when the process
exited.) Then I expected the postmaster would have restarted it, but I
found that it currently has SIGCHLD both pending and masked:
$ pflags 41349
41349: /opt/postgresql/9.2.4/bin/postgres -D /manatee/pg/data
data model = _LP64 flags = ORPHAN|MSACCT|MSFORK
sigpend = 0x00028000,0x00000000,0x00000000
/1: flags = ASLEEP write(0x2,0xfffffd7fffdf9e10,0x65)
sigmask = 0xfebff047,0xfffffff7,0x000003ff
The signal mask may be system-specific, but I've decoded it and confirmed
that SIGCHLD is in "sigpend" as well as "sigmask". And that makes sense
because the stack above indicates we're in "sigusr1_handler", which masks
SIGCHLD when it starts.
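The sigpend word can be decoded mechanically: bit n set means signal number n+1 is pending. Assuming illumos/Solaris signal numbering (SIGUSR1 = 16, SIGCLD/SIGCHLD = 18), the reported value works out to exactly those two signals:

```python
sigpend = 0x00028000  # first word of "sigpend" from the pflags output

# Bit n set => signal number n + 1 is pending.
pending = [bit + 1 for bit in range(32) if sigpend & (1 << bit)]

# On illumos/Solaris: 16 = SIGUSR1, 18 = SIGCHLD (a.k.a. SIGCLD)
assert pending == [16, 18]
```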
If it wasn't clear from the above, the logging configuration looks like
this:
log_destination = 'stderr'
logging_collector = on
log_directory = '/var/pg/'
log_filename = 'postgresql.log'
Thanks,
Dave
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
... that process appears to have exited due to a fatal error
(out of memory). (I know it exited because the process still exists in the
kernel -- it hasn't been reaped yet -- and I think it ran out of memory
based on a log message I found from around the time when the process
exited.)
Could we see the exact log message(s) involved? It's pretty hard to
believe that the logger would have consumed much memory.
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Mon, Nov 6, 2017 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Pacheco <dap@joyent.com> writes:
... that process appears to have exited due to a fatal error
(out of memory). (I know it exited because the process still exists in the
kernel -- it hasn't been reaped yet -- and I think it ran out of memory
based on a log message I found from around the time when the process
exited.)
Could we see the exact log message(s) involved? It's pretty hard to
believe that the logger would have consumed much memory.
Thanks for the quick reply!
Based on kernel state about the dead but unreaped syslogger process, I
believe the process exited at 2017-10-27T23:46:21.258Z. Here are all of
the entries in the PostgreSQL log from 23:19:12 until the top of the next
hour:
https://gist.githubusercontent.com/davepacheco/c5541bb464532075f2da761dd990a457/raw/2ba242055aca2fb374e9118045a830d08c590e0a/gistfile1.txt
There's no log entry at exactly 23:46:21 or even immediately before that,
but there are a lot of "out of memory" errors and a FATAL one at 23:47:28.
Unfortunately, we haven't configured logging to include the pid, so I can't
be sure which messages came from the syslogger.
There are also many log entries for some very long SQL queries. I'm sure
that contributed to this problem by filling up the pipe. I was able to
extract the contents of the pipe while the system was hung, and it was more
of these giant query strings.
I think it's likely that this database instance was running in a container
with way too small a memory cap for the number of processes configured.
(This was a zone (a lightweight container) allocated with 2GB of memory and
configured with 512MB of shared_buffers and up to 200 connections.) I
expect that the system got to a persistent state of having basically no
memory available, at which point nearly any attempt to allocate memory
could fail. The syslogger itself may not have been using much memory.
So I'm not so much worried about the memory usage itself, but it would be
nice if this condition were handled better. Handling out-of-memory is
obviously hard, especially when it means being unable to fork, but even
crashing would have been better for our use-case. And of course, there are
other reasons that the syslogger could exit prematurely besides being low
on memory, and those might be more recoverable.
Thanks,
Dave
On Mon, Nov 6, 2017 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
Should I file a bug on this issue?
Thanks,
Dave
On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
I'm somewhat inclined to say that one has to live with this if the
system is so resource constrained that processes barely using memory
get killed.
We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.
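The EPIPE behavior this proposal relies on is standard: once every read end of a pipe has been closed, a write fails immediately instead of blocking. A toy illustration (Python, not the proposed patch itself):

```python
import os
import signal

r, w = os.pipe()
os.close(r)  # no reader anywhere, as if postmaster had closed its copy

# Ignore SIGPIPE so the failed write surfaces as an error return
# (EPIPE) instead of killing the process.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

try:
    os.write(w, b"log line\n")
    failed = False
except BrokenPipeError:
    # EPIPE: the writer can now detect that the syslogger is gone
    # and fall back to writing to the real stderr.
    failed = True

assert failed
```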
Greetings,
Andres Freund
On Fri, Nov 17, 2017 at 10:50 AM, Andres Freund <andres@anarazel.de> wrote:
On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
I'm somewhat inclined to say that one has to live with this if the
system is so resource constrained that processes barely using memory
get killed.
We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.
I don't have the code on top of my mind, but isn't a custom fd causing
a small penalty when redirection_done is switched to true because the
first process generating a message to the syslogger pipe needs to open
it first if not done yet? So you'd need proper locking to save from
race conditions. Or is the first redirected message always
generated by the postmaster or the syslogger? I don't recall that this
is actually true..
--
Michael
On 2017-11-17 11:09:56 +0900, Michael Paquier wrote:
On Fri, Nov 17, 2017 at 10:50 AM, Andres Freund <andres@anarazel.de> wrote:
On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
I'm somewhat inclined to say that one has to live with this if the
system is so resource constrained that processes barely using memory
get killed.
We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.
I don't have the code on top of my mind, but isn't a custom fd causing
a small penalty
Sure, there's some minor overhead because every process would need to
close another fd after forking.
when redirection_done is switched to true because the first process
generating a message to the syslogger pipe needs to open it first if
not done yet?
I can't follow. The syslogger pipe is created when the first syslogger
is started (before it's forked!). Which happens before other processes
are created, because they all need to inherit that file descriptor.
So you'd need proper locking to save from race conditions.
I completely fail to see why this'd be the case. All I'm talking about
is using another pipe between syslogger and postmaster than between
other-processes and syslogger.
Greetings,
Andres Freund
On Fri, Nov 17, 2017 at 11:14 AM, Andres Freund <andres@anarazel.de> wrote:
On 2017-11-17 11:09:56 +0900, Michael Paquier wrote:
when redirection_done is switched to true because the first process
generating a message to the syslogger pipe needs to open it first if
not done yet?
I can't follow. The syslogger pipe is created when the first syslogger
is started (before it's forked!). Which happens before other processes
are created, because they all need to inherit that file descriptor.
Ah, OK. I didn't recall this dependency. Sorry for the confusion.
--
Michael
Andres Freund <andres@anarazel.de> writes:
On 2017-11-06 15:35:03 -0500, Tom Lane wrote:
David Pacheco <dap@joyent.com> writes:
I ran into what appears to be a deadlock in the logging subsystem. It
looks like what happened was that the syslogger process exited because it
ran out of memory. But before the postmaster got a chance to handle the
SIGCLD to restart it, it handled a SIGUSR1 to start an autovacuum worker.
That also failed, and the postmaster went to log a message about it, but
it's blocked on the pipe that's normally connected to the syslogger,
presumably because the pipe is full because the syslogger is gone and
hasn't read from it.
Ugh.
I'm somewhat inclined to say that one has to live with this if the
system is so resource constrained that processes barely using memory
get killed.
David's report isn't too clear: did the syslogger process actually run
out of memory and exit of its own volition after an ENOMEM, or did it get
killed by the dreaded OOM killer? In either case, it's unclear whether
it was really using an excessive amount of memory. We have not heard
reports suggesting a memory leak in the syslogger, but maybe there is
one under unusual circumstances?
I think you're probably right that the real cause here is the OOM
killer just randomly seizing on the syslogger as a victim process;
although since the syslogger disconnects from shared memory, it's
not very clear why it would score high on the OOM killer's metrics.
The whole thing is definitely odd.
We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.
I think that's nonsense, unfortunately. If the postmaster had its
own pipe, that would reduce the risk of this deadlock because only
the postmaster would be filling that pipe, not the postmaster and
all its other children --- but it wouldn't eliminate the risk.
I doubt the increase in reliability would be enough to justify the
extra complexity and cost.
What might be worth thinking about is allowing the syslogger process to
inherit the postmaster's OOM-kill-proofness setting, instead of dropping
down to the same vulnerability as the postmaster's other child processes.
That presumes that this was an otherwise-unjustified OOM kill, which
I'm not quite sure of ... but it does seem like a situation that could
arise from time to time.
regards, tom lane
On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
We could work around a situation like that if we made postmaster use a
*different* pipe as stderr than the one we're handing to normal
backends. If postmaster created a new pipe and closed the read end
whenever forking a syslogger, we should get EPIPEs when writing after
syslogger died and could fall back to proper stderr or such.
I think that's nonsense, unfortunately.
Nice phrasing.
If the postmaster had its own pipe, that would reduce the risk of this
deadlock because only the postmaster would be filling that pipe, not
the postmaster and all its other children --- but it wouldn't
eliminate the risk.
The deadlock happens because postmaster is waiting for syslogger to accept
a message, and syslogger waits for postmaster to restart it. To resolve
the deadlock, postmaster needs to not wait for a dead syslogger, even if it
hasn't yet received & processed the SIGCLD - what other postmaster
children do or don't do doesn't matter for resolving that cycle. The
reason postmaster currently blocks on writing to the pipe, instead of
getting EPIPE, is because both ends of the pipe still
exist. Which in turn is the case because we need to be able to
restart syslogger without passing a new file descriptor to all
subprocesses. If postmaster instead uses a different pipe to write to,
it'll not block anymore, instead getting EPIPE, and can continue towards
starting a new syslogger. So I don't think the described deadlock
exists if we were to apply my proposed fix.
What this obviously would *not* guarantee is being able to start a new
syslogger, but it seems fairly impossible to guarantee that. So sure,
other processes would still block until syslogger has successfully
restarted - but it's a resolvable situation rather than a hard deadlock,
which the described situation appears to be.
Note that there's plenty of cases where you could run into this even
without being unable to fork new processes. You could, e.g., also run
into this while logging the exit of some other subprocess or such;
there's enough ereports in postmaster.
I doubt the increase in reliability would be enough to justify the
extra complexity and cost.
I'm doubtful about that too.
What might be worth thinking about is allowing the syslogger process to
inherit the postmaster's OOM-kill-proofness setting, instead of dropping
down to the same vulnerability as the postmaster's other child processes.
That presumes that this was an otherwise-unjustified OOM kill, which
I'm not quite sure of ... but it does seem like a situation that could
arise from time to time.
Hm. I'm a bit scared about that - it doesn't seem that inconceivable
that various backends log humongous multi-line messages, leading to
syslogger *actually* taking up a fair amount of memory. Note that we're
using plain stringinfos that ereport(ERROR) out of memory situations,
rather than failing more gracefully.
- Andres
Andres Freund <andres@anarazel.de> writes:
On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
What might be worth thinking about is allowing the syslogger process to
inherit the postmaster's OOM-kill-proofness setting, instead of dropping
down to the same vulnerability as the postmaster's other child processes.
Hm. I'm a bit scared about that - it doesn't seem that inconceivable
that various backends log humongous multi-line messages, leading to
syslogger *actually* taking up a fair amount of memory. Note that we're
using plain stringinfos that ereport(ERROR) out of memory situations,
rather than failing more gracefully.
True, but there's no hard limits on the postmaster's memory consumption
either ... and if the syslogger does get killed on such a basis, we have
at the least lost a bunch of log output. On the whole I think we'd be
better off trying to prevent OOM kills on the syslogger. (That doesn't
preclude other mitigation measures.)
regards, tom lane
On November 16, 2017 7:06:23 PM PST, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
On 2017-11-16 21:39:49 -0500, Tom Lane wrote:
What might be worth thinking about is allowing the syslogger process to
inherit the postmaster's OOM-kill-proofness setting, instead of dropping
down to the same vulnerability as the postmaster's other child processes.
Hm. I'm a bit scared about that - it doesn't seem that inconceivable
that various backends log humongous multi-line messages, leading to
syslogger *actually* taking up a fair amount of memory. Note that we're
using plain stringinfos that ereport(ERROR) out of memory situations,
rather than failing more gracefully.
True, but there's no hard limits on the postmaster's memory consumption
either ...
either ...
Is there a credible scenario where it'd allocate many gigabytes of memory?
and if the syslogger does get killed on such a basis, we
have at the least lost a bunch of log output. On the whole I think we'd be
better off trying to prevent OOM kills on the syslogger. (That doesn't
preclude other mitigation measures.)
It doesn't seem impossible to get into a situation where syslogger is the source of the OOM. Just enabling a lot of logging in a workload with many large query strings might do it. So making it less likely to be killed might make the problem worse...
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes:
It doesn't seem impossible to get into a situation where syslogger is
the source of the OOM. Just enabling a lot of logging in a workload with
many large query strings might do it. So making it less likely to be
killed might make the problem worse...
Hm, so that's another angle David didn't report on: is it possible that
his workload could have resulted in a very large volume of incomplete
in-progress log messages?
regards, tom lane
Responding to several points below.
Tom Lane wrote:
David's report isn't too clear: did the syslogger process actually run
out of memory and exit of its own volition after an ENOMEM, or did it get
killed by the dreaded OOM killer? In either case, it's unclear whether
it was really using an excessive amount of memory. We have not heard
reports suggesting a memory leak in the syslogger, but maybe there is
one under unusual circumstances?
I'm running on illumos, where there is no OOM killer.
I expect what happened is that the syslogger process attempted to allocate
memory, failed because the system was low, and explicitly exited. That's
consistent with an exited process, no core file generated, and the "FATAL"
"out of memory" entries in the log I posted.
Of course, that doesn't mean the syslogger was using much memory. It means
only that it attempted to allocate any memory in a context where it could
not handle failing.
Andres Freund wrote:
I'm somewhat inclined to say that one has to live with this if the
system is so resource constrainted that processes barely using memory
get killed.
Again, there's no OOM killer here, so the process has the opportunity to
respond differently. I can understand the database not functioning well when
the system is out of memory, but this half-functioning, half-deadlocked state
is quite disruptive. By comparison, while not ideal, if the syslogger had
raised SIGABRT instead, that would have been significantly better for our
use-case. Our process manager would have terminated all of the processes
associated with the database and restarted it.
Tom Lane wrote:
Hm, so that's another angle David didn't report on: is it possible that
his workload could have resulted in a very large volume of incomplete
in-progress log messages?
Yes. I mentioned in my Nov 6 mail that large log messages over a short
period appear to have been a major contributing factor:
Here are all of the entries in the PostgreSQL log from 23:19:12 until the
top of the next hour:
There are also many log entries for some very long SQL queries. I'm sure
that contributed to this problem by filling up the pipe. I was able to
extract the contents of the pipe while the system was hung, and it was
more of these giant query strings.
Andres Freund wrote:
Note that there's plenty of cases where you could run into this even
without being unable to fork new processes. You'd e.g. could also run
into this while logging the exit of some other subprocess or such,
there's enough ereports in postmaster.
This is a critical point. As far as I can tell, all that's necessary for
this deadlock to occur is:
- the syslogger is unable to make forward progress (e.g., because it
segfaulted)
- enough other processes write to the pipe that it fills up before the
postmaster can restart the syslogger
- the postmaster attempts to write to the log as part of any of its signal
handling operations
It seems to me that the same thing can happen if there were a bug in the
syslogger that resulted in a segfault around the same time that the
postmaster attempted to start an autovacuum worker, for example (if there
was also reasonably high log traffic).
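The second and third conditions are helped along by the window in which SIGCHLD is blocked: while it sits in the blocked set, a dead syslogger stays unreaped and unrestarted. The masking pattern itself is easy to reproduce (a sketch of the generic POSIX mechanism, not the postmaster's actual code):

```python
import signal

# Block SIGCHLD, as sigusr1_handler does on entry, saving the old mask.
old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})

# Query the current mask: SIGCHLD delivery is now deferred, so a child
# (like the syslogger) exiting here would leave SIGCHLD pending rather
# than invoking the reaper.
current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
assert signal.SIGCHLD in current

# Restore the previous mask, as the handler does before returning.
signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Any blocking operation performed inside that window, like the stuck log write in the stack trace, therefore prevents the pending SIGCHLD from ever being handled.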
Thanks,
Dave
Hi,
On 2017-11-20 10:13:57 -0800, David Pacheco wrote:
I expect what happened is that the syslogger process attempted to allocate
memory, failed because the system was low, and explicitly exited. That's
consistent with an exited process, no core file generated, and the "FATAL"
"out of memory" entries in the log I posted.
Well unfortunately we're presumably not going to see output from
syslogger's death itself, chicken and egg etc.
Tom Lane wrote:
Hm, so that's another angle David didn't report on: is it possible that
his workload could have resulted in a very large volume of incomplete
in-progress log messages?
Yes. I mentioned in my Nov 6 mail that large log messages over a
short period appear to have been a major contributing factor:
So it's actually quite possible that syslogger did use a fair amount of
memory. Could you show its memory usage, just via ps?
This is a critical point. As far as I can tell, all that's necessary
for this deadlock to occur is:
- the syslogger is unable to make forward progress (e.g., because it
segfaulted)
This specific case I don't care that much about, but ...
- enough other processes write to the pipe that it fills up before the
postmaster can restart the syslogger
- the postmaster attempts to write to the log as part of any of its signal
handling operations
It seems to me that the same thing can happen if there were a bug in the
syslogger that resulted in a segfault around the same time that the
postmaster attempted to start an autovacuum worker, for example (if there
was also reasonably high log traffic).
Greetings,
Andres Freund
Thanks again for looking at this issue.
On Mon, Nov 20, 2017 at 10:34 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2017-11-20 10:13:57 -0800, David Pacheco wrote:
I expect what happened is that the syslogger process attempted to
allocate memory, failed because the system was low, and explicitly exited.
That's consistent with an exited process, no core file generated, and the
"FATAL" "out of memory" entries in the log I posted.
Well unfortunately we're presumably not going to see output from
syslogger's death itself, chicken and egg etc.
If that's true, I think that's a separate bug. If a process receives
ENOMEM from an attempt to allocate memory, it can still write out a log
message to the already-open file descriptor indicating that it's exiting
because of that failure. For all I know, that's what happened here, as
there are FATAL "out of memory" entries in the log. (I'd be able to tell
for sure if we had configured logging to include pids in the log messages.)
Tom Lane wrote:
Hm, so that's another angle David didn't report on: is it possible that
his workload could have resulted in a very large volume of incomplete
in-progress log messages?
Yes. I mentioned in my Nov 6 mail that large log messages over a
short period appear to have been a major contributing factor:
So it's actually quite possible that syslogger did use a fair amount of
memory. Could you show its memory usage, just via ps?
I don't have information from the problematic case because the process had
already exited. Right now, for the same database (actually a different
instance, because this incident triggered a failover):
$ ps -opid,rss,vsz,args -p 37627
PID RSS VSZ COMMAND
37627 2980 14968 /opt/postgresql/9.2.4/bin/postgres -D /manatee/pg/data
I'm not sure what we can infer from that, as this is a different system,
and the workload that generates the very large query strings only runs
occasionally. Those strings are also not logged unless something's gone
wrong.
This is a critical point. As far as I can tell, all that's necessary
for this deadlock to occur is:
- the syslogger is unable to make forward progress (e.g., because it
segfaulted)
This specific case I don't care that much about, but ...
I understand if the community isn't interested in fixing this case if other
users aren't seeing it much, but surely it's still a bug that this unusual
case can result in a deadlock?
-- Dave
Hi,
On 2017-11-20 11:12:08 -0800, David Pacheco wrote:
This is a critical point. As far as I can tell, all that's necessary
for this deadlock to occur is:
- the syslogger is unable to make forward progress (e.g., because it
segfaulted)
This specific case I don't care that much about, but ...
I understand if the community isn't interested in fixing this case if other
users aren't seeing it much, but surely it's still a bug that this unusual
case can result in a deadlock?
I care about the deadlock issue if triggered by out of memory
cases. That's something that can be expected during operation. Being
fully resistant against segfaults in arbitrary places however is not
unlikely to introduce more bugs than cases where it saves the day.
Greetings,
Andres Freund
On Mon, Nov 20, 2017 at 11:12 AM, David Pacheco <dap@joyent.com> wrote:
I understand if the community isn't interested in fixing this case if
other users aren't seeing it much, but surely it's still a bug that this
unusual case can result in a deadlock?
I've filed bug 14945 to cover this issue:
/messages/by-id/20171204201055.27108.18283@wrigleys.postgresql.org
While the underlying cause in this case was likely operator error (i.e.,
too little memory provisioned), the deadlock seems clearly like an issue
that could be hit by others and requires operator intervention to restore
service. I understand it might still be sufficiently rare, and the fix
sufficiently complex, that the community doesn't end up prioritizing it.
Thanks,
Dave
Hi,
On 2017-11-20 11:12:08 -0800, David Pacheco wrote:
$ ps -opid,rss,vsz,args -p 37627
PID RSS VSZ COMMAND
37627 2980 14968 /opt/postgresql/9.2.4/bin/postgres -D /manatee/pg/data
I'm not sure what we can infer from that, as this is a different system,
and the workload that generates the very large query strings only runs
occasionally. Those strings are also not logged unless something's gone
wrong.
FWIW, I'd like to see a report of this around the time the issue
occurred before doing anything further here.
Greetings,
Andres Freund
On Mon, Dec 4, 2017 at 12:23 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2017-11-20 11:12:08 -0800, David Pacheco wrote:
$ ps -opid,rss,vsz,args -p 37627
PID RSS VSZ COMMAND
37627 2980 14968 /opt/postgresql/9.2.4/bin/postgres -D /manatee/pg/data
I'm not sure what we can infer from that, as this is a different system,
and the workload that generates the very large query strings only runs
occasionally. Those strings are also not logged unless something's gone
wrong.
FWIW, I'd like to see a report of this around the time the issue
occurred before doing anything further here.
This failure begins when this process exits, so the best you could get is
memory in use immediately before it exited. I obviously can't get that now
for the one instance I saw weeks ago, but maybe PostgreSQL could log
information about current memory usage when it's about to exit because of
ENOMEM? That way if anybody hits a similar condition in the future, the
data will be available to answer your question.
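Capturing that number is cheap: the allocation-failure path could report the process's own peak resident set size on the way out. A rough sketch of the kind of figure that would help (Python; on illumos and Linux, getrusage reports ru_maxrss in kilobytes, while macOS reports bytes):

```python
import resource

# Peak resident set size of the current process, as reported by
# getrusage(RUSAGE_SELF) - the sort of datum worth logging just
# before exiting on ENOMEM.
usage = resource.getrusage(resource.RUSAGE_SELF)
peak = usage.ru_maxrss  # kilobytes on Linux/illumos, bytes on macOS

assert peak > 0
```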
That said, I think the deadlock itself is pretty well explained by the data
we have already.
-- Dave