sick DB - ??
Postgres 7.1.2, FreeBSD 3.4
Box got sick, had to bounce it. Postgres wasn't brought down in a
graceful fashion.
The restart didn't bring the DB back properly, so as the postgres user I
did the following:
/usr/local/pgsql/bin/postmaster -d5 start
It dumps the initial environment variables, and then prints nothing more.
CPU is pegged at 100%. No reporting, no information as to what's happening.
Solutions? Is the DB badly corrupted? Where do I go from here?
thanks,
--pete
As a followup - the line from top:
1641 postgres 105 0 2684K 1384K CPU1 0 8:26 99.02% 99.02% postgres
As you can see, it's barely taking up any RAM - the process is going nuts
right off the bat..
Followup ^2 -
The reason this happened was that for whatever reason (we're still
investigating), /tmp was writeable only by root.
I only noticed this when using initdb to create a new data directory.
postmaster offered no suggestion that there was a problem here, even when
running at -d5.
chmod 777 /tmp fixed everything.
My best guess (I don't know how postmaster operates internally, and I
didn't run any of the system-level diagnostic tools to check) is that if
postmaster fails to open a pipe/tmpfile, then rather than checking the
error properly it changes the filename and tries again ad infinitum.
Perhaps printing an error code (especially at debug level 5) would help?
thanks,
--pete
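Since postmaster stayed silent about the unwritable /tmp, a quick check run as the postgres user can surface this kind of problem before starting the server. The following is only a minimal sketch: the helper name `check_dir_writable` is made up for illustration, and it tests access for whichever user runs the script, so it assumes you run it as the same user that starts postmaster.

```python
import os
import stat


def check_dir_writable(path):
    """Return (ok, detail): can the current user create files in `path`?"""
    try:
        st = os.stat(path)
    except OSError as exc:
        return False, f"cannot stat {path}: {exc.strerror}"
    mode = stat.filemode(st.st_mode)  # e.g. 'drwxrwxrwt'
    if not stat.S_ISDIR(st.st_mode):
        return False, f"{path} is not a directory ({mode})"
    # Creating a file needs both write and search (execute) permission
    # on the directory itself.
    if not os.access(path, os.W_OK | os.X_OK):
        return False, f"{path} is not writable by uid {os.getuid()} ({mode})"
    return True, f"{path} is writable ({mode})"


if __name__ == "__main__":
    # Check /tmp and, if it is set in the environment, $PGDATA.
    for directory in ("/tmp", os.environ.get("PGDATA", "")):
        if directory:
            print(check_dir_writable(directory)[1])
```

Running it as root would hide the problem described above, because root could still write to /tmp.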
Pete Leonard <pete@hero.com> writes:
restart didn't bring the DB back properly, so as the postgres user, did
the following:
/usr/local/pgsql/bin/postmaster -d5 start
it dumps the initial environment variables, and then returns nothing. CPU
is pegged at 100%. No reporting, no information as to what's happening.
This is kind of a random guess, but we recently noticed that 7.1 has a
bug whereby the postmaster can go into an infinite loop at startup if
the $PGDATA directory is not writable. Check permissions. It might
also be a good idea to remove the old postmaster.pid file by hand.
regards, tom lane
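Before removing postmaster.pid by hand, it is worth confirming that no postmaster is still running under the recorded PID. Below is a rough sketch of such a check. It assumes the first line of the file holds the PID as a decimal number, which you should verify against your own file, and the function name `pid_file_status` is hypothetical.

```python
import os


def pid_file_status(path):
    """Classify a pid file as 'missing', 'unreadable', 'stale', or 'live'."""
    try:
        with open(path) as fh:
            first = fh.readline().strip()
    except FileNotFoundError:
        return "missing"
    except OSError:
        return "unreadable"
    # Zero and negative values have special meanings for kill(), so only
    # accept a positive decimal integer.
    if not first.isdigit() or int(first) <= 0:
        return "unreadable"
    try:
        os.kill(int(first), 0)  # signal 0 checks existence, sends nothing
    except (ProcessLookupError, OverflowError):
        return "stale"
    except PermissionError:
        return "live"  # the process exists but belongs to another user
    return "live"
```

Only a "stale" result suggests the file is safe to delete. A "live" result means some process has that PID, though not necessarily a postmaster, so check with ps before doing anything.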
On Wed, Jul 18, 2001 at 09:36:38AM -0700, Pete Leonard wrote:
chmod 777 /tmp fixed everything.
That should be 1777.
mrc
--
Mike Castle dalgoda@ix.netcom.com www.netcom.com/~dalgoda/
We are all of us living in the shadow of Manhattan. -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
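The difference between 777 and 1777 is the sticky bit: with it set on a world-writable directory, users can only remove or rename their own files there. A small sketch of how one might check a mode for both properties follows; the helper name `tmp_mode_problems` is invented for this example.

```python
import os
import stat


def tmp_mode_problems(mode):
    """List what is wrong with a /tmp-style directory mode, if anything."""
    problems = []
    perms = stat.S_IMODE(mode)  # keep only the permission bits
    if perms & 0o777 != 0o777:
        problems.append("not writable by everyone")
    if not perms & stat.S_ISVTX:
        problems.append("sticky bit missing")
    return problems


if __name__ == "__main__":
    # An empty list means the mode looks like a conventional 1777 /tmp.
    print(tmp_mode_problems(os.stat("/tmp").st_mode))
```

For the situation in this thread, the mode set by chmod 777 would be reported as missing the sticky bit, and a root-only /tmp would be reported as not writable by everyone.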
Pete Leonard <pete@hero.com> writes:
The reason this happened was that for whatever reason (we're still
investigating), /tmp was writeable only by root.
Ah. Hadn't thought about it before, but the
infinite-loop-on-nonwritable-$PGDATA bug would also trigger for
nonwritable /tmp. (The bug was actually in CreateLockFile, which is
used both to create a lockfile in $PGDATA and one in /tmp. Sigh.)
This is fixed in current sources. If we were going to do a 7.1.3
then I'd backpatch the fix into the REL7_1 branch, but at this point
I suspect there won't be a 7.1.3 --- we'll probably go into 7.2 beta
in another five or six weeks, so there's not much point.
regards, tom lane
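To illustrate the class of bug discussed here, the sketch below shows a lock-file routine that retries only when the file already exists and lets every other error (such as a permission failure on the directory) surface immediately, with a bounded number of attempts. This is not the PostgreSQL source and does not claim to mirror CreateLockFile; the names `create_lock_file` and `_owner_is_alive`, and the one-PID-per-file format, are assumptions made for the example.

```python
import os


def _owner_is_alive(path):
    """True if the PID recorded in the lock file belongs to a running process."""
    try:
        with open(path) as fh:
            pid = int(fh.readline().strip())
    except (OSError, ValueError):
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence, sends nothing
    except (ProcessLookupError, OverflowError):
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True


def create_lock_file(path, max_tries=100):
    """Create `path` exclusively and record our PID in it."""
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            # The only error worth retrying: clear a stale lock, try again.
            if _owner_is_alive(path):
                raise RuntimeError(f"{path} is held by a running process")
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass  # someone else removed it first
            continue
        # Any other OSError (for example EACCES on an unwritable
        # directory) is not caught above and propagates to the caller.
        with os.fdopen(fd, "w") as fh:
            fh.write(f"{os.getpid()}\n")
        return
    raise RuntimeError(f"gave up creating {path} after {max_tries} tries")
```

With this shape, an unwritable directory produces an immediate PermissionError instead of a silent spin, and even an unexpected race cannot loop forever because of the retry cap.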