postmaster failure with 2-23 snapshot

Started by Brian P Millett almost 27 years ago, 17 messages
#1Brian P Millett
bpm@ec-group.com

uname -a
SunOS vlad 5.7 Generic_106541-01 sun4u sparc SUNW,Ultra-5_10
gcc -v
Reading specs from
/opt/gnu/lib/gcc-lib/sparc-sun-solaris2.7/egcs-2.91.60/specs
gcc version egcs-2.91.60 19981201 (egcs-1.1.1 release)

I just compiled the snapshot using this command to configure pgsql:

configure --prefix=/opt/pgsql \
--with-template=solaris_sparc_gcc \
--with-tcl \
--with-perl \
--with-tclconfig=/opt/tcl/lib \
--with-includes=/opt/tcl/include

Everything compiles fine, but when I try to run the postmaster I get the
following:

vlad: postmaster -i
IpcMemoryCreate: shmget failed (Invalid argument) key=5432001,
size=1137426, permission=600
FATAL 1: ShmemCreate: cannot create region

Thought it might help with the development.
Thanks.

--
Brian Millett
Enterprise Consulting Group "Heaven can not exist,
(314) 205-9030 If the family is not eternal"
bpm@ec-group.com F. Ballard Washburn

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brian P Millett (#1)
Re: [HACKERS] postmaster failure with 2-23 snapshot

vlad: postmaster -i
IpcMemoryCreate: shmget failed (Invalid argument) key=5432001,
size=1137426, permission=600

I think shmget returns that error code when the requested size is
larger than the system limit on shared memory block size. Check
your kernel parameters (SHMMAX and friends).

You might find that starting the postmaster with a smaller value
of -N is an easier answer than reconfiguring your kernel.

regards, tom lane

#3Daryl W. Dunbar
daryl@www.com
In reply to: Tom Lane (#2)
RE: [HACKERS] postmaster failure with 2-23 snapshot

Here is what I added to my /etc/system on Solaris 7:

set shmsys:shminfo_shmmax=16777216
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=128
set shmsys:shminfo_shmseg=51
*
set semsys:seminfo_semmap=128
set semsys:seminfo_semmni=128
set semsys:seminfo_semmns=8192
set semsys:seminfo_semmnu=8192
set semsys:seminfo_semmsl=64
set semsys:seminfo_semopm=32
set semsys:seminfo_semume=32

Of course, this is way more than you need to run 64 backends. It
will accommodate thousands of semaphores, but not much more than 128
backends, due to the shared memory needs. You might want to run
sysdef to see the defaults first and then pick your tunables.

DwD
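
To illustrate the settings listed above, here is a minimal Python sketch that reads "set module:variable=value" lines of that shape into a dictionary. The names parse_etc_system and SAMPLE are invented for this example, and the parser only handles the plain decimal form shown in this message; lines starting with "*" are skipped as comments.

```python
def parse_etc_system(text):
    """Collect 'set module:variable=value' lines into a dict of ints."""
    tunables = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '*' comment lines.
        if not line or line.startswith("*"):
            continue
        if not line.startswith("set "):
            continue
        name, sep, value = line[4:].partition("=")
        if not sep:
            continue
        tunables[name.strip()] = int(value.strip())
    return tunables


# The settings quoted in the message above.
SAMPLE = """\
set shmsys:shminfo_shmmax=16777216
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=128
set shmsys:shminfo_shmseg=51
*
set semsys:seminfo_semmap=128
set semsys:seminfo_semmni=128
set semsys:seminfo_semmns=8192
set semsys:seminfo_semmnu=8192
set semsys:seminfo_semmsl=64
set semsys:seminfo_semopm=32
set semsys:seminfo_semume=32
"""

settings = parse_etc_system(SAMPLE)
shmmax_mb = settings["shmsys:shminfo_shmmax"] // (1024 * 1024)
print("shmmax is", shmmax_mb, "MB")  # prints: shmmax is 16 MB
```

A dictionary like this makes it easy to compare a planned configuration against the sizes the postmaster reports when it fails.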

-----Original Message-----
From: owner-pgsql-hackers@postgreSQL.org
[mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane
Sent: Tuesday, February 23, 1999 7:04 PM
To: Brian P Millett
Cc: pgsql-hackers
Subject: Re: [HACKERS] postmaster failure with 2-23 snapshot

vlad: postmaster -i
IpcMemoryCreate: shmget failed (Invalid argument) key=5432001,
size=1137426, permission=600

I think shmget returns that error code when the requested size is
larger than the system limit on shared memory block size. Check
your kernel parameters (SHMMAX and friends).

You might find that starting the postmaster with a smaller value
of -N is an easier answer than reconfiguring your kernel.

regards, tom lane

#4Brian P Millett
bpm@ec-group.com
In reply to: Daryl W. Dunbar (#3)
Re: [HACKERS] postmaster failure with 2-23 snapshot

"Daryl W. Dunbar" wrote:

Here is what I added to my /etc/system on Solaris 7:

set shmsys:shminfo_shmmax=16777216
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=128
set shmsys:shminfo_shmseg=51
*
set semsys:seminfo_semmap=128
set semsys:seminfo_semmni=128
set semsys:seminfo_semmns=8192
set semsys:seminfo_semmnu=8192
set semsys:seminfo_semmsl=64
set semsys:seminfo_semopm=32
set semsys:seminfo_semume=32

Thanks for the quick reply. Yes, I looked at /etc/system, and I did have

set semsys:seminfo_semmap=128
set semsys:seminfo_semmni=128
set semsys:seminfo_semmns=8192
set semsys:seminfo_semmnu=8192
set semsys:seminfo_semmsl=64
set semsys:seminfo_semopm=32
set semsys:seminfo_semume=32

BUT I didn't have
set shmsys:shminfo_shmmax=16777216
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmseg=51

That was it.

Thanks!

--
Brian Millett
Enterprise Consulting Group "Heaven can not exist,
(314) 205-9030 If the family is not eternal"
bpm@ec-group.com F. Ballard Washburn

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brian P Millett (#4)
Re: [HACKERS] postmaster failure with 2-23 snapshot

Brian P Millett <bpm@ec-group.com> writes:

BUT I didn't have
set shmsys:shminfo_shmmax=16777216
That was it.

I'll bet the default value of SHMMAX on your kernel is 1MB.
You said 6.4.x worked for you, right?

A stock version of 6.4.x creates a shared memory segment of about
830K if you don't alter the default -B setting. Thanks to some
changes I made recently in the memory space estimation stuff,
the current CVS sources will try to make a shm segment of about
1100K with the default -B and -N settings.

If 1MB is a popular SHMMAX default, it might be a good idea to
trim down the safety margin a little bit so we come out short of
1MB at the default settings ...

regards, tom lane
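
The arithmetic behind this message can be checked with a short Python sketch. It assumes, as Tom guesses above, that the default SHMMAX is 1 MB (that value is not confirmed anywhere in the thread), and it takes "about 830K" to mean 830 * 1024 bytes. The requested size 1137426 comes from the error message in the first post. The helper name fits is made up for the example.

```python
# Assumption: default SHMMAX of 1 MB, as guessed in the message above.
SHMMAX_GUESS = 1024 * 1024


def fits(requested_bytes, shmmax=SHMMAX_GUESS):
    """True if a segment of this size is within the SHMMAX limit."""
    return requested_bytes <= shmmax


requests = {
    "6.4.x, about 830K": 830 * 1024,
    "2-23 snapshot, from the error message": 1137426,
}

for label, size in requests.items():
    if fits(size):
        verdict = "fits under 1 MB"
    else:
        verdict = "exceeds 1 MB by %d bytes" % (size - SHMMAX_GUESS)
    print("%s: %d bytes, %s" % (label, size, verdict))

# With the 16 MB limit from the /etc/system settings, the request fits.
print(fits(1137426, shmmax=16777216))  # prints: True
```

Under that assumption the snapshot's request overshoots the limit by 88850 bytes, which is why trimming the safety margin a little could bring the default back under 1 MB.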

#6Tom Ivar Helbekkmo
tih@nhh.no
In reply to: Tom Lane (#5)
Re: [HACKERS] postmaster failure with 2-23 snapshot

Tom Lane <tgl@sss.pgh.pa.us> writes:

A stock version of 6.4.x creates a shared memory segment of about
830K if you don't alter the default -B setting. Thanks to some
changes I made recently in the memory space estimation stuff,
the current CVS sources will try to make a shm segment of about
1100K with the default -B and -N settings.

Have there also been changes to the semaphore usage over the last 10
days? A February 15th snapshot is fine on my systems, as long as I
apply the patches that appeared here yesterday to get Kerberos going,
but after 'cvs update' yesterday (February 23rd), the postmaster is
refusing to start, claiming that semget() failed to allocate a block
of 16 semaphores. The default maximum here is 60 semaphores, so I
guess it must have allocated at least 44 of them before the failure.

This is under NetBSD/i386-1.3I.

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"

#7Tom Ivar Helbekkmo
tih@nhh.no
In reply to: Tom Lane (#5)
Re: [HACKERS] postmaster failure with 2-23 snapshot

I wrote:

but after 'cvs update' yesterday (February 23rd), the postmaster is
refusing to start, claiming that semget() failed to allocate a block
of 16 semaphores. The default maximum here is 60 semaphores, so I
guess it must have allocated at least 44 of them before the failure.

Looking more closely into it, the postmaster is trying to allocate 64
semaphores in four groups of 16, so I built a new kernel with a higher
limit, and it's now OK.

This is as it should be, I hope? It's not a case of something being
misconfigured now, using semaphores instead of some other facility?

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Ivar Helbekkmo (#7)
Re: [HACKERS] postmaster failure with 2-23 snapshot

Tom Ivar Helbekkmo <tih@nhh.no> writes:

Looking more closely into it, the postmaster is trying to allocate 64
semaphores in four groups of 16, so I built a new kernel with a higher
limit, and it's now OK.
This is as it should be, I hope? It's not a case of something being
misconfigured now, using semaphores instead of some other facility?

Yes, this is an intentional change --- I guess you haven't been reading
the hackers list very closely. The postmaster is now set up to grab
all the semaphores Postgres could need (for the specified number of
backend processes) immediately at postmaster startup. Failing then
for lack of semaphores seems a better idea than failing under load
when you try to start the N+1'st client, which is what used to happen.

There has been some discussion of reducing the default number-of-
backends limit to 32 so that a stock installation is less likely
to run out of semaphores.

regards, tom lane
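
The startup behavior described above can be modeled with a small Python sketch. It is a simplified simulation, not the postmaster's actual code: it assumes one semaphore per backend, allocated in sets of 16 as reported earlier in the thread, and it ignores any extra semaphores the real code might use. The function name simulate_startup is invented; the limits 60 (semmns) and 10 (semmni) are the NetBSD defaults reported in this thread.

```python
import math


def simulate_startup(max_backends, per_set, semmns, semmni):
    """Model grabbing all semaphore sets up front.

    Returns (sets_allocated, semaphores_allocated, ok).
    """
    sets_needed = math.ceil(max_backends / per_set)
    sets_allocated = 0
    sems_allocated = 0
    for _ in range(sets_needed):
        over_ids = sets_allocated + 1 > semmni
        over_sems = sems_allocated + per_set > semmns
        if over_ids or over_sems:
            # The next set would exceed a kernel limit: fail at startup.
            return sets_allocated, sems_allocated, False
        sets_allocated += 1
        sems_allocated += per_set
    return sets_allocated, sems_allocated, True


# 64 backends against a limit of 60 semaphores: the fourth set fails.
print(simulate_startup(64, 16, semmns=60, semmni=10))  # (3, 48, False)

# A default of 32 backends would fit under the same limits.
print(simulate_startup(32, 16, semmns=60, semmni=10))  # (2, 32, True)
```

In this model three sets (48 semaphores) are allocated before the fourth request fails, and the proposed default of 32 backends stays within a 60-semaphore kernel.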

#9The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#8)
Re: [HACKERS] postmaster failure with 2-23 snapshot

On Thu, 25 Feb 1999, Tom Lane wrote:

Tom Ivar Helbekkmo <tih@nhh.no> writes:

Looking more closely into it, the postmaster is trying to allocate 64
semaphores in four groups of 16, so I built a new kernel with a higher
limit, and it's now OK.
This is as it should be, I hope? It's not a case of something being
misconfigured now, using semaphores instead of some other facility?

Yes, this is an intentional change --- I guess you haven't been reading
the hackers list very closely. The postmaster is now set up to grab
all the semaphores Postgres could need (for the specified number of
backend processes) immediately at postmaster startup. Failing then
for lack of semaphores seems a better idea than failing under load
when you try to start the N+1'st client, which is what used to happen.

There has been some discussion of reducing the default number-of-
backends limit to 32 so that a stock installation is less likely
to run out of semaphores.

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

I just looked at a sys/sysconfig.h under Solaris, and it appears they have
an "undocumented function" that does this...but I can't seem to find
anything right off...

For that matter, being able to do a configure check to see if semaphores
are even compiled into the system or not (ala FreeBSD) might be nice
too...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#10Brian P Millett
bpm@ec-group.com
In reply to: The Hermit Hacker (#9)
Re: [HACKERS] postmaster failure with 2-23 snapshot

The Hermit Hacker wrote:

On Thu, 25 Feb 1999, Tom Lane wrote:

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

Marc, I don't know if this is what you want, but on Solaris I can see
the current setup using "sysdef". At the end of the output, I have this:

*
* IPC Semaphores
*
128 entries in semaphore map (SEMMAP)
128 semaphore identifiers (SEMMNI)
8192 semaphores in system (SEMMNS)
8192 undo structures in system (SEMMNU)
64 max semaphores per id (SEMMSL)
32 max operations per semop call (SEMOPM)
32 max undo entries per process (SEMUME)
32767 semaphore maximum value (SEMVMX)
16384 adjust on exit max value (SEMAEM)
*
* IPC Shared Memory
*
16777216 max shared memory segment size (SHMMAX)
1 min shared memory segment size (SHMMIN)
100 shared memory identifiers (SHMMNI)
51 max attached shm segments per process (SHMSEG)
*
* Time Sharing Scheduler Tunables
*
60 maximum time sharing user priority (TSMAXUPRI)
SYS system class name (SYS_NAME)

--
Brian Millett
Enterprise Consulting Group "Heaven can not exist,
(314) 205-9030 If the family is not eternal"
bpm@ec-group.com F. Ballard Washburn
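
As a rough illustration of how output like the above could be checked by a script, here is a Python sketch that pulls the "value description (NAME)" lines into a dictionary and compares two of them with numbers from this thread: 64 semaphores (the postmaster's startup request reported earlier) and 1137426 bytes (the segment size in the original error message). The names parse_sysdef and SYSDEF_TAIL are made up, and the sample is only an excerpt of the listing above.

```python
import re

# A leading integer, some description, then a NAME in parentheses.
LINE = re.compile(r"^\s*(\d+)\s+.*\((\w+)\)\s*$")


def parse_sysdef(text):
    """Map tunable names such as SEMMNS to their integer values."""
    limits = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            limits[match.group(2)] = int(match.group(1))
    return limits


SYSDEF_TAIL = """\
*
* IPC Semaphores
*
128 entries in semaphore map (SEMMAP)
128 semaphore identifiers (SEMMNI)
8192 semaphores in system (SEMMNS)
64 max semaphores per id (SEMMSL)
*
* IPC Shared Memory
*
16777216 max shared memory segment size (SHMMAX)
1 min shared memory segment size (SHMMIN)
SYS system class name (SYS_NAME)
"""

limits = parse_sysdef(SYSDEF_TAIL)
enough_sems = limits["SEMMNS"] >= 64
enough_shm = limits["SHMMAX"] >= 1137426
print("semaphores ok:", enough_sems)   # prints: semaphores ok: True
print("shared memory ok:", enough_shm)  # prints: shared memory ok: True
```

Lines without a leading number, such as the comment lines and the SYS_NAME entry, are simply ignored by the pattern.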

#11Ross J. Reedstrom
reedstrm@rice.edu
In reply to: The Hermit Hacker (#9)
Re: [HACKERS] postmaster failure with 2-23 snapshot

The Hermit Hacker wrote:

On Thu, 25 Feb 1999, Tom Lane wrote:

Yes, this is an intentional change --- I guess you haven't been reading
the hackers list very closely. The postmaster is now set up to grab
all the semaphores Postgres could need (for the specified number of
backend processes) immediately at postmaster startup. Failing then
for lack of semaphores seems a better idea than failing under load
when you try to start the N+1'st client, which is what used to happen.

There has been some discussion of reducing the default number-of-
backends limit to 32 so that a stock installation is less likely
to run out of semaphores.

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

<snip comment on undocumented solaris call>

Perhaps on startup the postmaster can allocate the max number of
semaphores, thus preserving the 'fail now, not later' behavior, but then
release all but a smaller block (say, 16)? Kind of an amalgam of the
new and old allocation strategies. That way, other programs that
potentially need a large number of sems can have them, if postgresql
isn't using them right now, without needing a reconfigured kernel.

Hmm, that does open a window for failure if there are sufficient sems at
startup, but not later, when the high load kicks in. Perhaps a
configuration flag for "greedy semaphore allocation"? This lets the
DBA decide what should fail under the high-load, scarce-sems condition.
If the db is mission critical, it's worth reconfiguring, and letting it
have all the sems. Even if "non-greedy", still do the test, and fail if
there aren't enough potential sems for the max number of backends, though
(don't plant the timebomb, basically).

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005
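
The trade-off in this proposal can be sketched with a toy Python model. Everything here is hypothetical: SemaphorePool stands in for the kernel's system-wide semaphore limit, startup models the suggested "test, then release all but a small block" behavior, and the numbers (a pool of 128, another program wanting 100) are chosen only to show the failure window.

```python
class SemaphorePool:
    """Toy stand-in for a system-wide semaphore limit (hypothetical)."""

    def __init__(self, total):
        self.free = total

    def reserve(self, count):
        if count > self.free:
            raise RuntimeError("out of semaphores")
        self.free -= count

    def release(self, count):
        self.free += count


def startup(pool, max_needed, greedy, keep=16):
    """Reserve everything up front; if not greedy, hand most of it back."""
    pool.reserve(max_needed)  # fail now, not later
    if greedy:
        return max_needed
    pool.release(max_needed - keep)
    return keep


# Non-greedy: startup succeeds, but another program can starve us later.
pool = SemaphorePool(128)
held = startup(pool, 64, greedy=False)
pool.reserve(100)  # some other program takes 100 semaphores
try:
    pool.reserve(16)  # the database now needs a second block of 16
    failed_under_load = False
except RuntimeError:
    failed_under_load = True
print(held, failed_under_load)  # prints: 16 True

# Greedy: the database keeps all 64, so the other program fails instead.
pool = SemaphorePool(128)
held_greedy = startup(pool, 64, greedy=True)
try:
    pool.reserve(100)
    other_failed = False
except RuntimeError:
    other_failed = True
print(held_greedy, other_failed)  # prints: 64 True
```

The flag decides which side fails when semaphores are scarce, which is the choice the message suggests leaving to the DBA.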

#12Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Tom Lane (#8)
Re: [HACKERS] postmaster failure with 2-23 snapshot

Tom Ivar Helbekkmo <tih@nhh.no> writes:

Looking more closely into it, the postmaster is trying to allocate 64
semaphores in four groups of 16, so I built a new kernel with a higher
limit, and it's now OK.
This is as it should be, I hope? It's not a case of something being
misconfigured now, using semaphores instead of some other facility?

Yes, this is an intentional change --- I guess you haven't been reading
the hackers list very closely. The postmaster is now set up to grab
all the semaphores Postgres could need (for the specified number of
backend processes) immediately at postmaster startup. Failing then
for lack of semaphores seems a better idea than failing under load
when you try to start the N+1'st client, which is what used to happen.

There has been some discussion of reducing the default number-of-
backends limit to 32 so that a stock installation is less likely
to run out of semaphores.

Tom, better lower that limit soon. People are having trouble with the
snapshots.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Bruce Momjian
maillist@candle.pha.pa.us
In reply to: The Hermit Hacker (#9)
Re: [HACKERS] postmaster failure with 2-23 snapshot

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

I just looked at a sys/sysconfig.h under Solaris, and it appears they have
an "undocumented function" that does this...but I can't seem to find
anything right off...

For that matter, being able to do a configure check to see if semaphores
are even compiled into the system or not (ala FreeBSD) might be nice
too...

None of the commercial db's do that, so I assume there is no portable
way. We will lower the limit so it will pass most/all kernels, and help
people who need to up it. Perhaps an FAQ for kernels.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#14The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#13)
Re: [HACKERS] postmaster failure with 2-23 snapshot

On Thu, 25 Feb 1999, Bruce Momjian wrote:

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

I just looked at a sys/sysconfig.h under Solaris, and it appears they have
an "undocumented function" that does this...but I can't seem to find
anything right off...

For that matter, being able to do a configure check to see if semaphores
are even compiled into the system or not (ala FreeBSD) might be nice
too...

None of the commercial db's do that, so I assume there is no portable
way. We will lower the limit so it will pass most/all kernels, and help
people who need to up it. Perhaps an FAQ for kernels.

None of the commercial db's use configure and source code :)

Even if it's a header file that we can check for a default setting?

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#14)
Re: [HACKERS] postmaster failure with 2-23 snapshot

The Hermit Hacker <scrappy@hub.org> writes:

None of the commercial db's use configure and source code :)

Even if it's a header file that we can check for a default setting?

AFAIK there's no *portable* way of finding out what the kernel's
configuration parameters are --- it's possible to find out, on most
flavors of Unix, but the place to look differs from platform to
platform.

I think our best bet is just to trim Postgres' default settings enough
so that an unmodified installation will run on most platforms. People
who really want more backends or more buffers will have had to learn how
to adjust their kernel params anyway...

regards, tom lane

#16Tom Ivar Helbekkmo
tih@nhh.no
In reply to: The Hermit Hacker (#9)
Re: [HACKERS] postmaster failure with 2-23 snapshot

The Hermit Hacker <scrappy@hub.org> writes:

Is there any way (sysctl?) of determining the max # of semaphores
configured into a system?

On NetBSD (default configuration; I had to change it for PostgreSQL):

athene:tih> ipcs -S
seminfo:
semmap: 30 (# of entries in semaphore map)
semmni: 10 (# of semaphore identifiers)
semmns: 60 (# of semaphores in system)
semmnu: 30 (# of undo structures in system)
semmsl: 60 (max # of semaphores per id)
semopm: 100 (max # of operations per semop call)
semume: 10 (max # of undo entries per process)
semusz: 100 (size in bytes of undo structure)
semvmx: 32767 (semaphore maximum value)
semaem: 16384 (adjust on exit max value)

athene:tih> ipcs -Q
msginfo:
msgmax: 16384 (max characters in a message)
msgmni: 40 (# of message queues)
msgmnb: 2048 (max characters in a message queue)
msgtql: 40 (max # of messages in system)
msgssz: 8 (size of a message segment)
msgseg: 2048 (# of message segments in system)

athene:tih> ipcs -M
shminfo:
shmmax: 4194304 (max shared memory segment size)
shmmin: 1 (min shared memory segment size)
shmmni: 128 (max number of shared memory identifiers)
shmseg: 32 (max shared memory segments per process)
shmall: 1024 (max amount of shared memory in pages)

For that matter, being able to do a configure check to see if
semaphores are even compiled into the system or not (ala FreeBSD)
might be nice too...

Again, on NetBSD:

athene:tih> sysctl -a | grep sysv
kern.sysvmsg = 1
kern.sysvsem = 1
kern.sysvshm = 1

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"
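
A configure-style check along the lines discussed could start from output like the sysctl listing above. The Python sketch below only parses text in that exact "kern.sysvNAME = 1" shape, as shown for NetBSD in this message; it does not call sysctl itself, and the function name sysv_facilities is invented for the example. Other platforms report this differently, as noted elsewhere in the thread.

```python
def sysv_facilities(sysctl_output):
    """Map 'msg', 'sem', 'shm' to True/False from kern.sysv* lines."""
    prefix = "kern.sysv"
    flags = {}
    for line in sysctl_output.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip()
        if sep and name.startswith(prefix):
            flags[name[len(prefix):]] = value.strip() == "1"
    return flags


# The output quoted in the message above.
SAMPLE = """\
kern.sysvmsg = 1
kern.sysvsem = 1
kern.sysvshm = 1
"""

flags = sysv_facilities(SAMPLE)
print(flags)  # prints: {'msg': True, 'sem': True, 'shm': True}

# Both semaphores and shared memory are needed for the postmaster to start.
usable = flags.get("sem", False) and flags.get("shm", False)
print("usable:", usable)  # prints: usable: True
```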

#17Bruce Momjian
maillist@candle.pha.pa.us
In reply to: The Hermit Hacker (#14)
Re: [HACKERS] postmaster failure with 2-23 snapshot

None of the commercial db's use configure and source code :)

Even if it's a header file that we can check for a default setting?

If it is possible, let's do it. I just suspect it is not.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026