Problems Restarting PostgreSQL Daemon

Started by Rich Shepard, over 17 years ago, 29 messages, general
#1Rich Shepard
rshepard@appl-ecosys.com

My server is rebooted infrequently, usually after a kernel upgrade and
on very rare occasions when something causes it to hang. After rebooting I
always have serious issues getting postgresql running again, even though the
startup script is part of the boot sequence. Yesterday was one of those
highly unusual hangs, and I cannot restart the service. I'd like to
understand why.

When I run the Slackware script, '/etc/rc.d/rc.postgresql start' (script
attached), I'm shown a process ID and told the daemon is already running.
For example:

Starting PostgreSQL
15342
PostgreSQL daemon already running

However, there is no process ID 15342, and no postgres running. I manually
removed /tmp/.s.PGSQL.5432 and its log file. Also -- apparently in error --
the .pid file. Makes no difference.

Perhaps there's an error in the script that I'm not seeing (I didn't write
it). Regardless, if I learn why there's a problem I can fix the script and
avoid this delay and hassle restarting postgres after the daemon's been shut
down.

TIA,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

Attachments:

rc.postgresql (text/plain; charset=US-ASCII)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rich Shepard (#1)
Re: Problems Restarting PostgreSQL Daemon

Rich Shepard <rshepard@appl-ecosys.com> writes:

My server is rebooted infrequently, usually after a kernel upgrade and
on very rare occasions when something causes it to hang. After rebooting I
always have serious issues getting postgresql running again, even though the
startup script is part of the boot sequence. Yesterday was one of those
highly unusual hangs, and I cannot restart the service. I'd like to
understand why.

When I run the Slackware script, '/etc/rc.d/rc.postgresql start' (script
attached), I'm shown a process ID and told the daemon is already running.

The short answer is probably "don't use Slackware's startup script".
Some distros have PG start scripts that have had the bugs beaten out
of them, and others not so much.

Perhaps there's an error in the script that I'm not seeing (I didn't write
it). Regardless, if I learn why there's a problem I can fix the script and
avoid this delay and hassle restarting postgres after the daemon's been shut
down.

Have you read the script to see what condition causes it to issue the
mentioned error? I'd imagine that it's looking at some other lockfile
than you think.

regards, tom lane

#3Rich Shepard
rshepard@appl-ecosys.com
In reply to: Tom Lane (#2)
Re: Problems Restarting PostgreSQL Daemon

On Tue, 22 Jul 2008, Tom Lane wrote:

The short answer is probably "don't use Slackware's startup script". Some
distros have PG start scripts that have had the bugs beaten out of them,
and others not so much.

Excellent advice, Tom. I'll take it.

Have you read the script to see what condition causes it to issue the
mentioned error? I'd imagine that it's looking at some other lockfile
than you think.

I tried following the logic, and it appears the issue now is 'invalid data
in PID file "/var/lib/pgsql/data/postmaster.pid" '. If I delete that file,
is it automatically recreated? I'm using /usr/bin/pg_ctl as user postgres.

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#4Jeff Soules
soules@gmail.com
In reply to: Rich Shepard (#3)
Re: Problems Restarting PostgreSQL Daemon

I tried following the logic, and it appears the issue now is 'invalid data
in PID file "/var/lib/pgsql/data/postmaster.pid" '. If I delete that file,
is it automatically recreated?

Why not just move it and rename it? If it's recreated, great; if not,
you still have the corrupted file on hand to try to fix, no?

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rich Shepard (#3)
Re: Problems Restarting PostgreSQL Daemon

Rich Shepard <rshepard@appl-ecosys.com> writes:

I tried following the logic, and it appears the issue now is 'invalid data
in PID file "/var/lib/pgsql/data/postmaster.pid" '. If I delete that file,
is it automatically recreated? I'm using /usr/bin/pg_ctl as user postgres.

If you're certain there's no postmaster running, it's safe to remove
postmaster.pid. However you really shouldn't have to; the postmaster
is generally able to figure out whether a pidfile is live or not.

The "invalid data" bit is interesting though. It looks like pg_ctl
would produce that error if the pidfile exists but is empty when it
looks. This seems like a race condition hazard, though the odds of
hitting it are tiny. What's in the file exactly?

regards, tom lane
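
The live-or-stale question discussed above can be approximated from the shell. What follows is only a minimal sketch, not what the postmaster or pg_ctl does internally. It assumes the first line of postmaster.pid holds the postmaster's PID, and the helper name pidfile_state is made up for illustration. Run it as root or as postgres, because kill -0 also fails when the caller is not allowed to signal the process.

```shell
#!/bin/sh
# Sketch: classify a postmaster.pid file as missing, invalid, stale, or live.
# pidfile_state is a hypothetical helper name, not part of PostgreSQL.
pidfile_state() {
    pidfile=$1
    [ -e "$pidfile" ] || { echo missing; return; }
    # Assumption: the first line of the file is the postmaster's PID.
    pid=$(head -n 1 "$pidfile")
    case $pid in
        # Empty or non-numeric content, e.g. a file created with touch.
        ''|*[!0-9]*) echo invalid; return ;;
    esac
    # kill -0 sends no signal; it only tests whether the process exists
    # and whether the caller may signal it.
    if kill -0 "$pid" 2>/dev/null; then
        echo live
    else
        echo stale
    fi
}

# Example call, using the path quoted in the thread.
pidfile_state /var/lib/pgsql/data/postmaster.pid
```

An empty file, such as one created with touch, is reported as invalid, which matches the condition under which pg_ctl is said above to complain about invalid data.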

#6Rich Shepard
rshepard@appl-ecosys.com
In reply to: Tom Lane (#5)
Re: Problems Restarting PostgreSQL Daemon

On Tue, 22 Jul 2008, Tom Lane wrote:

If you're certain there's no postmaster running, it's safe to remove
postmaster.pid. However you really shouldn't have to; the postmaster is
generally able to figure out whether a pidfile is live or not.

Tom,

I thought the postmaster knew what was current and what needed to be
replaced, but the process ID in the pidfile did not exist.

The "invalid data" bit is interesting though. It looks like pg_ctl would
produce that error if the pidfile exists but is empty when it looks. This
seems like a race condition hazard, though the odds of hitting it are
tiny. What's in the file exactly?

I deleted the .pid, but still could not get the postmaster running. Then I
'touched' the name so I had an empty file. Made no difference. While pg_ctl
tells me the server is starting, there is no /tmp/.s.PGSQL*, no pidfile, and
no postmaster process.

In the past I've managed to start the postmaster daemon manually, but
today I seem to have it FUBARed.

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#7Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Rich Shepard (#1)
Re: Problems Restarting PostgreSQL Daemon

On 23/07/2008, Rich Shepard <rshepard@appl-ecosys.com> wrote:

When I run the Slackware script, '/etc/rc.d/rc.postgresql start' (script
attached), I'm shown a process ID and told the daemon is already running.
For example:

Since there are no official Slackware postgres packages
I'd like to ask where that script came from :) and how you
installed postgres in the first place. Happy to communicate
off the list if you prefer that.

TIA,

Rich

Cheers,
Andrej

#8Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#7)
Re: Problems Restarting PostgreSQL Daemon

On Wed, 23 Jul 2008, Andrej Ricnik-Bay wrote:

Since there are no official Slackware postgres packages I'd like to ask
where that script came from :) and how you installed postgres in the first
place. Happy to communicate off the list if you prefer that.

Andrej,

Unless others consider this topic to be not appropriate for the list, I
don't mind a public conversation. I thought that I attached the script to my
original message; regardless, here's the attribution:

# PostgreSQL startup script for Slackware Linux
# Copyright 2007 Adis Nezirovic <adis _at_ linux.org.ba>
# Licensed under GNU GPL v2

I upgraded postgres manually, not creating and using a Slackware package.
It worked just fine until yesterday's reboot.

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rich Shepard (#6)
Re: Problems Restarting PostgreSQL Daemon

Rich Shepard <rshepard@appl-ecosys.com> writes:

On Tue, 22 Jul 2008, Tom Lane wrote:

The "invalid data" bit is interesting though. It looks like pg_ctl would
produce that error if the pidfile exists but is empty when it looks. This
seems like a race condition hazard, though the odds of hitting it are
tiny. What's in the file exactly?

I deleted the .pid, but still could not get the postmaster running. Then I
'touched' the name so I had an empty file. Made no difference. While pg_ctl
tells me the server is starting, there is no /tmp/.s.PGSQL*, no pidfile, and
no postmaster process.

Sounds to me like the postmaster tries to start and fails. Look into
the postmaster log. (If the log is going to /dev/null, send it
someplace else...)

regards, tom lane
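
One way to make sure the server's output ends up somewhere readable is to pass a log file to pg_ctl and print its tail when startup fails. The following is a sketch under stated assumptions: the data directory and log path are the ones mentioned in the thread, the function name start_with_log and the PG_CTL override variable are hypothetical, and it should be run as the postgres user.

```shell
#!/bin/sh
# Sketch: start the server with its output captured in a log file,
# then show the end of that log if startup does not succeed.
# start_with_log and the PG_CTL variable are illustrative names only.
PGDATA=${PGDATA:-/var/lib/pgsql/data}   # data directory from the thread
PGLOG=${PGLOG:-/var/log/postgresql}     # log file named in the thread
PG_CTL=${PG_CTL:-pg_ctl}                # assumes pg_ctl is on the PATH

start_with_log() {
    # -D selects the data directory, -l appends server output to a file,
    # -w makes pg_ctl wait for startup to finish before returning.
    if "$PG_CTL" -D "$PGDATA" -l "$PGLOG" -w start; then
        echo "postmaster started"
    else
        echo "postmaster failed to start; last log lines:" >&2
        tail -n 20 "$PGLOG" >&2
        return 1
    fi
}

# Usage (as the postgres user):
#   start_with_log
```

With -w, a failed start should surface as a non-zero exit status rather than a bare "server starting" message, so the log excerpt is shown at the moment it is needed.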

#10Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Rich Shepard (#8)
Re: Problems Restarting PostgreSQL Daemon

On 23/07/2008, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Andrej,

Hi Rich,

Unless others consider this topic to be not appropriate for the list, I
don't mind a public conversation. I thought that I attached the script to
my original message; regardless, here's the attribution:

You did - my bad. I usually ignore attachments on mailing-lists,
and did so with yours.

I upgraded postgres manually, not creating and using a Slackware package.
It worked just fine until yesterday's reboot.

Now there's an interesting piece of information :) How long
ago did you upgrade it?

From which version of pg to which version did you upgrade,

and how did you go about it? Chances are indeed that the
postmaster's logfile (/var/log/postgres) may hold crucial
information as Tom suggested.

Thanks,

Rich

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :} Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

#11Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#10)
Re: Problems Restarting PostgreSQL Daemon

On Wed, 23 Jul 2008, Andrej Ricnik-Bay wrote:

Now there's an interesting piece of information :) How long
ago did you upgrade it?

Andrej,

A month ago; June 17th to be exact.

From which version of pg to which version did you upgrade,

From 8.1.13 to 8.3.3.

and how did you go about it? Chances are indeed that the postmaster's
logfile (/var/log/postgres) may hold crucial information as Tom suggested.

Well, after digging myself into a hole, I received help here and climbed
out. It was working last week (when I made some entries into my accounting
system and viewed the local version of our web site). However, ...

... something broke during the reboot. From /var/log/postgresql:

FATAL: database files are incompatible with server
DETAIL: The database cluster was initialized with PG_CONTROL_VERSION 812,
but the server was compiled with PG_CONTROL_VERSION 833.
HINT: It looks like you need to initdb.

I still have the old pgsql (8.1.13) in a non-standard directory. I
had run initdb after cleaning up the upgrade. Should I do so again?

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#12Joshua D. Drake
jd@commandprompt.com
In reply to: Rich Shepard (#11)
Re: Problems Restarting PostgreSQL Daemon

On Tue, 2008-07-22 at 18:05 -0700, Rich Shepard wrote:

On Wed, 23 Jul 2008, Andrej Ricnik-Bay wrote:

Now there's an interesting piece of information :) How long
ago did you upgrade it?

... something broke during the reboot. From /var/log/postgresql:

FATAL: database files are incompatible with server
DETAIL: The database cluster was initialized with PG_CONTROL_VERSION 812,
but the server was compiled with PG_CONTROL_VERSION 833.
HINT: It looks like you need to initdb.

I still have the old pgsql (8.1.13) in a non-standard directory. I
had run initdb after cleaning up the upgrade. Should I do so again?

It looks to me like your init script just isn't pointing to the 8.3.3
data directory. If you are unsure you can do this:

find / -name PG_VERSION

You likely have 2 or 3 of them. Find the one that says 8.3 and make sure
your start up script points there.

Joshua D. Drake
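
The find suggestion above can be extended so that each hit is printed next to the version it records. This is a rough sketch: the helper name list_pg_versions is hypothetical, and the filter relies on the fact that a cluster's top-level data directory contains a base subdirectory, which hides the per-database PG_VERSION files that find would otherwise also report.

```shell
#!/bin/sh
# Sketch: list cluster data directories under a root together with the
# major version each one was initialized with.
# list_pg_versions is a hypothetical helper name.
list_pg_versions() {
    root=${1:-/}
    find "$root" -name PG_VERSION 2>/dev/null | while IFS= read -r f; do
        dir=$(dirname "$f")
        # Keep only top-level data directories; each database directory
        # under base/ carries its own PG_VERSION file as well.
        [ -d "$dir/base" ] || continue
        printf '%s %s\n' "$(cat "$f")" "$f"
    done
}

# Usage: search a subtree, or pass / to search everywhere.
#   list_pg_versions /var/lib
```

Each output line starts with the version string, so the 8.3 cluster can be picked out at a glance and compared with the directory the startup script passes to pg_ctl.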

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#13Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#10)
Re: Problems Restarting PostgreSQL Daemon

On Wed, 23 Jul 2008, Andrej Ricnik-Bay wrote:

Now there's an interesting piece of information :) How long
ago did you upgrade it?

Andrej,

I found the thread in the archives for June of this year.

Re-reading the posted results of running initdb I tried a different
approach to starting the server. Instead of using pg_ctl I used 'postgres -D
/var/lib/pgsql/data &' (while logged in as user postgres, of course.) That
cleaned up a bad shutdown (when I had to reboot the system after it hung),
fixed the missing socket, and replaced the .pid. So, it's up and running
once again.

My question is how best to modify the startup script so the postmaster
fires up when the system is rebooted. I don't see an option to 'su' to
specify the postgres user's password so I can script this. Have you any
recommendation?

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#14Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Rich Shepard (#13)
Re: Problems Restarting PostgreSQL Daemon

On 27/07/2008, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Andrej,

Hi Rich,

I found the thread in the archives for June of this year.

Re-reading the posted results of running initdb I tried a different
approach to starting the server. Instead of using pg_ctl I used 'postgres -D
/var/lib/pgsql/data &' (while logged in as user postgres, of course.) That
cleaned up a bad shutdown (when I had to reboot the system after it hung),
fixed the missing socket, and replaced the .pid. So, it's up and running
once again.

My question is how best to modify the startup script so the postmaster
fires up when the system is rebooted. I don't see an option to 'su' to
specify the postgres user's password so I can script this. Have you any
recommendation?

Since Slackware doesn't use the SysV style of init by default, the
easiest way for you to achieve an automatic start-up of postgres
on reboot would be to add something like

    if [ -x /etc/rc.d/rc.postgres ]; then
        /etc/rc.d/rc.postgres start
    fi

to your /etc/rc.d/rc.local

Thanks,

Rich

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :} Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rich Shepard (#13)
Re: Problems Restarting PostgreSQL Daemon

Rich Shepard <rshepard@appl-ecosys.com> writes:

My question is how best to modify the startup script so the postmaster
fires up when the system is rebooted. I don't see an option to 'su' to
specify the postgres user's password so I can script this.

Startup scripts invariably run as root, so 'su' isn't going to ask
for a password...

regards, tom lane
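
To illustrate the point about su: a boot-time script already runs as root, so it can hand the start command to the postgres account without any prompt. The sketch below is not the Slackware script from this thread. The function names and the PGUSER_OS variable are hypothetical, while the pg_ctl, data directory, and log paths are the ones mentioned earlier in the thread.

```shell
#!/bin/sh
# Sketch: how a script running as root can start the server as the
# postgres user. root may switch to any account without a password.
# start_cmd, start_as_postgres and PGUSER_OS are illustrative names.
PGDATA=${PGDATA:-/var/lib/pgsql/data}   # data directory from the thread
PGLOG=${PGLOG:-/var/log/postgresql}     # log file named in the thread
PG_CTL=${PG_CTL:-/usr/bin/pg_ctl}       # pg_ctl path from the thread
PGUSER_OS=${PGUSER_OS:-postgres}        # operating-system account

start_cmd() {
    # Build the command line that su hands to the postgres user's shell.
    # Assumes none of the paths contain spaces.
    echo "$PG_CTL -D $PGDATA -l $PGLOG start"
}

start_as_postgres() {
    # "su - USER -c CMD" runs CMD in a login shell for USER.
    su - "$PGUSER_OS" -c "$(start_cmd)"
}

# Usage (as root, for example from /etc/rc.d/rc.local):
#   start_as_postgres
```

Keeping the command in its own function makes it easy to print and compare against a command line that is known to work when typed by hand.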

#16Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Tom Lane (#15)
Re: Problems Restarting PostgreSQL Daemon

On 27/07/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Startup scripts invariably run as root, so 'su' isn't going to ask
for a password...

And it's nothing to worry about because the script he's using
is su-ing to the postgres user anyway ...

regards, tom lane

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :} Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

#17Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#14)
Re: Problems Restarting PostgreSQL Daemon

On Sun, 27 Jul 2008, Andrej Ricnik-Bay wrote:

    if [ -x /etc/rc.d/rc.postgres ]; then
        /etc/rc.d/rc.postgres start
    fi
to your /etc/rc.d/rc.local

Well, that's the problem, Andrej. I have that script, and it worked fine
with postgres-6.x through -8.1, but failed to correctly start the postmaster
after the system reboot.

I can try twiddling with the script; it calls pg_ctl, and that should
work, but apparently something broke last week.

Thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#18Rich Shepard
rshepard@appl-ecosys.com
In reply to: Tom Lane (#15)
Re: Problems Restarting PostgreSQL Daemon

On Sat, 26 Jul 2008, Tom Lane wrote:

Startup scripts invariably run as root, so 'su' isn't going to ask for a
password...

Tom,

That occurred to me after I wrote the message. Think that I'll tune the
script to use a command that I know is working with 8.3.3.

Many thanks,

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#19Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Rich Shepard (#17)
Re: Problems Restarting PostgreSQL Daemon

On 27/07/2008, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Well, that's the problem, Andrej. I have that script, and it worked fine
with postgres-6.x through -8.1, but failed to correctly start the
postmaster after the system reboot.

I thought we had established that this issue was caused by
the current instance pointing at the old install's data directory?

I can try twiddling with the script; it calls pg_ctl, and that should
work, but apparently something broke last week.

That should be quite easy to tweak, really ... my current script
(slightly modified from the one in contrib/startup-scripts) is attached...
You may need to change the dirs in the script yet a bit.

Thanks,

Rich

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :} Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

Attachments:

rc.postgres (application/octet-stream)
#20Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#19)
Re: Problems Restarting PostgreSQL Daemon

On Sun, 27 Jul 2008, Andrej Ricnik-Bay wrote:

I thought we had established that this issue was caused by the current
instance pointing at the old install's data directory?

No, that wasn't the problem.

If I use 'postgres -D /var/lib/pgsql/data &' the postmaster starts
correctly and everything runs as intended. If I use '/etc/rc.d/rc.postgresql
start' I get error messages about the postmaster already running and an
invalid .pid.

That should be quite easy to tweak, really ... my current script (slightly
modified from the one in contrib/startup-scripts) is attached... You may
need to change the dirs in the script yet a bit.

Thank you. I think that for some reason using pg_ctl to start the
postmaster is no longer working here. As I have time, I'll look into why.

Rich

--
Richard B. Shepard, Ph.D. | Integrity Credibility
Applied Ecosystem Services, Inc. | Innovation
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

#21Andrej Ricnik-Bay
andrej.groups@gmail.com
In reply to: Rich Shepard (#20)
#22Rich Shepard
rshepard@appl-ecosys.com
In reply to: Andrej Ricnik-Bay (#21)
#23Yi Zhao
yi.zhao@alibaba-inc.com
In reply to: Rich Shepard (#1)
#24Craig Ringer
craig@2ndquadrant.com
In reply to: Yi Zhao (#23)
#25Yi Zhao
yi.zhao@alibaba-inc.com
In reply to: Craig Ringer (#24)
#26Craig Ringer
craig@2ndquadrant.com
In reply to: Yi Zhao (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#26)
#28Yi Zhao
yi.zhao@alibaba-inc.com
In reply to: Tom Lane (#27)
#29Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Yi Zhao (#28)