Found this in the server log on MAC OSX

Started by Sean Shannyabout 22 years ago10 messagesgeneral
Jump to latest
#1Sean Shanny
shannyconsulting@earthlink.net

To all,

I restarted postgres this morning to make a configuration file change
take effect and when I looked at the serverlog to make sure things were
ok I found the prior shutdown message(*in bold*). This shut down was
not initiated by me or anyone else who has access to the machine. Any
ideas as to what might be causing this?

OSX Server 10.3.2 running on a dual 2.ghz G5 with 8GB memory attached to
a XRaid 3.5TB system via Fibre channel. We are running 7.4.1. The
LOG: unexpected EOF on client connection messages are from our JDBC
implementation, they are benign.

Thanks.

--sean

LOG: unexpected EOF on client connection
*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal 9
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2004-02-23 07:50:15 EST
LOG: checkpoint record is at D0/B2FD5AB4
LOG: redo record is at D0/B2FD5AB4; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 31473487; next OID: 3103140043
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: record with zero length at D0/B2FD5AF4
LOG: redo is not required
LOG: database system is ready*
LOG: shutting down
LOG: database system is shut down
LOG: database system was shut down at 2004-02-23 08:12:28 EST
LOG: checkpoint record is at D0/B2FD5B34
LOG: redo record is at D0/B2FD5B34; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 31473487; next OID: 3103140043
LOG: database system is ready

#2Ed L.
pgsql@bluepolka.net
In reply to: Sean Shanny (#1)
Re: Found this in the server log on MAC OSX

On Monday February 23 2004 8:43, Sean Shanny wrote:

I restarted postgres this morning to make a configuration file change
take effect and when I looked at the serverlog to make sure things were
ok I found the prior shutdown message(*in bold*). This shut down was
not initiated by me or anyone else who has access to the machine. Any
ideas as to what might be causing this?

LOG: unexpected EOF on client connection
*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal 9

Looks like it got a SIGTERM, which is what you might get if someone shutdown
OSX (osx pls gurus correct me). Maybe your machine rebooted? Logging
pids and timestamps might help tell the tale next time ...

#3Sean Shanny
shannyconsulting@earthlink.net
In reply to: Ed L. (#2)
Re: Found this in the server log on MAC OSX

Ed,

If I continue to see the behavior I will turn the logging stuff on as
you suggested. The machine uptime is show 3+ days so it did not reboot.

Thanks.

--sean

Ed L. wrote:

Show quoted text

On Monday February 23 2004 8:43, Sean Shanny wrote:

I restarted postgres this morning to make a configuration file change
take effect and when I looked at the serverlog to make sure things were
ok I found the prior shutdown message(*in bold*). This shut down was
not initiated by me or anyone else who has access to the machine. Any
ideas as to what might be causing this?

LOG: unexpected EOF on client connection
*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal 9

Looks like it got a SIGTERM, which is what you might get if someone shutdown
OSX (osx pls gurus correct me). Maybe your machine rebooted? Logging
pids and timestamps might help tell the tale next time ...

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#2)
Re: Found this in the server log on MAC OSX

"Ed L." <pgsql@bluepolka.net> writes:

On Monday February 23 2004 8:43, Sean Shanny wrote:

*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal 9

Looks like it got a SIGTERM, which is what you might get if someone shutdown
OSX (osx pls gurus correct me).

Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic
mechanism in OS X that would issue SIGKILL against a Postgres backend.
Certainly Postgres itself would not. Some Linux kernels issue SIGKILL
to get out of out-of-memory situations, but I believe OS X to be better
behaved than that.

My private opinion is that Sean is mistaken and that the above trace
shows someone manually (and not very competently) shutting down the
database. First they tried a SIGTERM against the postmaster (or
equivalently "pg_ctl stop") and after getting tired of waiting for
the clients to shut down, they did SIGKILL against the backends.
It's too bad we have no timestamps in this log, as the intervals
between the above-recorded entries would be very revealing.

regards, tom lane

#5Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#4)
Re: Found this in the server log on MAC OSX

On Monday February 23 2004 10:50, Tom Lane wrote:

"Ed L." <pgsql@bluepolka.net> writes:

On Monday February 23 2004 8:43, Sean Shanny wrote:

*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal
9

Looks like it got a SIGTERM, which is what you might get if someone
shutdown OSX (osx pls gurus correct me).

Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic
mechanism in OS X that would issue SIGKILL against a Postgres backend.
Certainly Postgres itself would not. Some Linux kernels issue SIGKILL
to get out of out-of-memory situations, but I believe OS X to be better
behaved than that.

Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request
an indication of a SIGTERM? I'm just speculating about what happened, but
isn't that what you'd see during a system shutdown? The kernel sending
SIGTERMs?

#6Sean Shanny
shannyconsulting@earthlink.net
In reply to: Tom Lane (#4)
Re: Found this in the server log on MAC OSX

To all,

It looks like it is the fault of the shutdown script <said Sean with
sheepish look on my face>

We have to take a look a the scripts, found them on the web, to figure
out what they are doing to cause this.

Sorry for the clutter in your mailboxes.

--sean

waiting for postmaster to shut
down................................................................ failed
pg_ctl: postmaster does not shut down
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 3:35.75: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill:
192.168.34.19: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.56: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill:
192.168.35.16: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: in: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: transaction:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:00.71: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: buffer: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.78: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: collector:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no
such pid

Tom Lane wrote:

Show quoted text

"Ed L." <pgsql@bluepolka.net> writes:

On Monday February 23 2004 8:43, Sean Shanny wrote:

*LOG: received smart shutdown request *
*FATAL: the database system is shutting down
FATAL: the database system is shutting down
LOG: server process (PID 4691) was terminated by signal 9
LOG: terminating any other active server processes
LOG: statistics collector process (PID 361) was terminated by signal 9

Looks like it got a SIGTERM, which is what you might get if someone shutdown
OSX (osx pls gurus correct me).

Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic
mechanism in OS X that would issue SIGKILL against a Postgres backend.
Certainly Postgres itself would not. Some Linux kernels issue SIGKILL
to get out of out-of-memory situations, but I believe OS X to be better
behaved than that.

My private opinion is that Sean is mistaken and that the above trace
shows someone manually (and not very competently) shutting down the
database. First they tried a SIGTERM against the postmaster (or
equivalently "pg_ctl stop") and after getting tired of waiting for
the clients to shut down, they did SIGKILL against the backends.
It's too bad we have no timestamps in this log, as the intervals
between the above-recorded entries would be very revealing.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#5)
Re: Found this in the server log on MAC OSX

"Ed L." <pgsql@bluepolka.net> writes:

Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request
an indication of a SIGTERM? I'm just speculating about what happened, but
isn't that what you'd see during a system shutdown? The kernel sending
SIGTERMs?

Yes, the trace is sort of consistent with the idea of a system shutdown:
you'd see SIGTERMs issued, followed some time later by SIGKILL.
I thought Sean had said that the machine did not shut down during this
interval, and so mentally eliminated that theory --- but based on his
latest comment I guess that is what happened after all.

So that does leave me with a question: why didn't it work more cleanly?
Our signal responses are designed around the assumption that during
shutdown the kernel will send SIGTERM to *all* the Postgres processes.
Backends interpret that as an immediate shutdown and should exit quickly
enough to avoid getting SIGKILL'd later. It looks like either the
postmaster was sent SIGTERM but the backends weren't, or the interval
between SIGTERM and SIGKILL was unreasonably short. I don't think I
believe the latter; the last time I checked this on Darwin, it seemed to
be using the traditional 20-second grace period.

Another question: if that was a shutdown we were looking at, how did the
postmaster live long enough to record the final log lines? It shoulda
gotten SIGKILL'd at the same time as its children.

In short, there's something pretty odd about the way these signals are
being passed around. It looks something like a standard system shutdown
sequence, but not enough like it.

regards, tom lane

#8Sean Shanny
shannyconsulting@earthlink.net
In reply to: Tom Lane (#7)
Re: Found this in the server log on MAC OSX

Tom,

Let me clarify.... I was meant shutdown in the context of issuing a stop
against postgres not shutting down the OS. Sorry if I am confusing things.

The scripts we are using to issue start, stop etc for postgres seem to
be causing the issue. I changed the config to use timestamps in the log
and the act of stopping and starting the server caused the same error to
occur. :-(

From the scripts we are using:

StartService ()
{
if [ "${POSTGRES:=-YES-}" = "-YES-" ]; then

ConsoleMessage "Starting PostgreSQL database server"
su - postgres -c '/usr/local/pgsql/bin/pg_ctl start -D
/usr/local/pgsql/data -l /usr/local/pgsql/data/logfile -o -i'

fi
}

StopService()
{
ConsoleMessage "Stopping PostgreSQL database services"
/usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
x=`/bin/ps axc | /usr/bin/grep postgres`
if /bin/test "$x"
then
set $x
kill -9 $x
fi
}

Thanks.

--sean

Tom Lane wrote:

Show quoted text

"Ed L." <pgsql@bluepolka.net> writes:

Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request
an indication of a SIGTERM? I'm just speculating about what happened, but
isn't that what you'd see during a system shutdown? The kernel sending
SIGTERMs?

Yes, the trace is sort of consistent with the idea of a system shutdown:
you'd see SIGTERMs issued, followed some time later by SIGKILL.
I thought Sean had said that the machine did not shut down during this
interval, and so mentally eliminated that theory --- but based on his
latest comment I guess that is what happened after all.

So that does leave me with a question: why didn't it work more cleanly?
Our signal responses are designed around the assumption that during
shutdown the kernel will send SIGTERM to *all* the Postgres processes.
Backends interpret that as an immediate shutdown and should exit quickly
enough to avoid getting SIGKILL'd later. It looks like either the
postmaster was sent SIGTERM but the backends weren't, or the interval
between SIGTERM and SIGKILL was unreasonably short. I don't think I
believe the latter; the last time I checked this on Darwin, it seemed to
be using the traditional 20-second grace period.

Another question: if that was a shutdown we were looking at, how did the
postmaster live long enough to record the final log lines? It shoulda
gotten SIGKILL'd at the same time as its children.

In short, there's something pretty odd about the way these signals are
being passed around. It looks something like a standard system shutdown
sequence, but not enough like it.

regards, tom lane

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sean Shanny (#8)
Re: Found this in the server log on MAC OSX

Sean Shanny <shannyconsulting@earthlink.net> writes:

StopService()
{
ConsoleMessage "Stopping PostgreSQL database services"
/usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
x=`/bin/ps axc | /usr/bin/grep postgres`
if /bin/test "$x"
then
set $x
kill -9 $x
fi
}

Gack. Whoever wrote this should be shot. Take out everything after the
"pg_ctl stop" line. You might want to add "-m immediate" to the pg_ctl
command as a less dangerous way of encouraging a fast shutdown...

regards, tom lane

#10David Garamond
lists@zara.6.isreserved.com
In reply to: Tom Lane (#9)
Re: Found this in the server log on MAC OSX

Tom Lane wrote:

StopService()
{
ConsoleMessage "Stopping PostgreSQL database services"
/usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
x=`/bin/ps axc | /usr/bin/grep postgres`
if /bin/test "$x"
then
set $x
kill -9 $x
fi
}

Gack. Whoever wrote this should be shot. Take out everything after the
"pg_ctl stop" line. You might want to add "-m immediate" to the pg_ctl
command as a less dangerous way of encouraging a fast shutdown...

BTW, I remember vaguely that the Cpanel 3 scripts (Cpanel is a
webhosting control panel product) used to do something like this for
every daemon... the horror...

--
dave