open file counts in 8.1.2?
We're trying to make sense of the number of open files on an
HP-UX 11.23 system that's getting several new 8.1.2 clusters,
and in particular why the numbers appear to be significantly
larger than our 7.4 clusters on similar hardware. Would there
be anything particular to 8.1.2 over 7.4 that would lead to a
larger number of open files?
Ed
"Ed L." <pgsql@bluepolka.net> writes:
We're trying to make sense of the number of open files on an
HP-UX 11.23 system that's getting several new 8.1.2 clusters,
and in particular why the numbers appear to be significantly
larger than our 7.4 clusters on similar hardware. Would there
be anything particular to 8.1.2 over 7.4 that would lead to a
larger number of open files?
This is much too handwavy to provide an intelligent comment on.
Get a copy of "lsof" and find out exactly which processes have
how many files open, then we'll have some idea what's going on...
regards, tom lane
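A minimal sketch of the per-process tally suggested above, assuming the lsof output has been saved as text and that the PID is the second whitespace-separated column (as in lsof's default layout). The function name and the sample rows are made up for illustration.

```python
from collections import Counter

def count_open_files(lsof_text):
    """Count lsof rows per PID, skipping the header and blank lines."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # The header row has "PID" in column 2, which is not a number.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        counts[int(fields[1])] += 1
    return counts

# Hypothetical sample rows, not taken from a real system.
sample = """COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 4023 dba 49u REG 64,0x10001 16384 7435 /db1
postgres 4023 dba 62u REG 64,0x10001 8192 7673 /db1
postgres 4100 dba 12u REG 64,0x10001 8192 7001 /db1
"""

per_pid = count_open_files(sample)
for pid, n in per_pid.most_common():
    print(pid, n)
```

Sorting by count (most_common) puts the backends holding the most descriptors first, which is usually the interesting end of the list.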
On Tuesday March 14 2006 10:25 am, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
We're trying to make sense of the number of open files on an
HP-UX 11.23 system that's getting several new 8.1.2
clusters, and in particular why the numbers appear to be
significantly larger than our 7.4 clusters on similar
hardware. Would there be anything particular to 8.1.2 over
7.4 that would lead to a larger number of open files?
This is much too handwavy to provide an intelligent comment
on. Get a copy of "lsof" and find out exactly which processes
have how many files open, then we'll have some idea what's
going on...
We have 3 clusters with 24K, 34K, and 47K open files according to
lsof. These same clusters have 164, 179, and 210 active
connections, respectively. Their schemas, counting the number
of user and system entries in pg_class as a generously rough
measure of potential open files, contain roughly 2000 entries
each. Those open files seem pretty plausible, they're just much
higher than what we see on the older systems.
Ed
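As a rough worked example, the figures above can be turned into an average number of open files per connection. This is only a sketch: it treats "24K" as 24,000 and so on, and it assumes every file lsof reported belongs to a connection's backend, which overstates the per-backend figure slightly.

```python
# Totals reported by lsof and active connections, per cluster (from the post above).
open_files = [24_000, 34_000, 47_000]
connections = [164, 179, 210]

# Average open files per connection, rounded to the nearest whole file.
per_connection = [round(f / c) for f, c in zip(open_files, connections)]
print(per_connection)  # [146, 190, 224]
```

On these numbers each connection holds a couple of hundred files on average, far fewer than the roughly 2000 pg_class entries per cluster.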
On Tuesday March 14 2006 10:31 am, Ed L. wrote:
On Tuesday March 14 2006 10:25 am, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
We're trying to make sense of the number of open files on
an HP-UX 11.23 system that's getting several new 8.1.2
clusters, and in particular why the numbers appear to be
significantly larger than our 7.4 clusters on similar
hardware. Would there be anything particular to 8.1.2
over 7.4 that would lead to a larger number of open files?
This is much too handwavy to provide an intelligent comment
on. Get a copy of "lsof" and find out exactly which
processes have how many files open, then we'll have some
idea what's going on...
We have 3 clusters with 24K, 34K, and 47K open files according
to lsof. These same clusters have 164, 179, and 210 active
connections, respectively. Their schemas, counting the number
of user and system entries in pg_class as a generously rough
measure of potential open files, contain roughly 2000 entries
each. Those open files seem pretty plausible, they're just
much higher than what we see on the older systems.
One lsof curiosity is that one cluster seems to have its
partition directory listing open about 10K times, including
many times by the same backend process:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 4023 db1dba 49u REG 64,0x10001 16384 7435 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 62u REG 64,0x10001 8192 7673 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 68u REG 64,0x10001 16384 7601 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 78u REG 64,0x10001 16384 7379 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 79u REG 64,0x10001 16384 7380 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 135u REG 64,0x10001 352256 7305 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 136u REG 64,0x10001 262144 7640 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 137u REG 64,0x10001 262144 7642 /db1 (/dev/vgdb1/lvol1)
postgres 4023 db1dba 138u REG 64,0x10001 262144 7643 /db1 (/dev/vgdb1/lvol1)
Ed
"Ed L." <pgsql@bluepolka.net> writes:
We have 3 clusters with 24K, 34K, and 47K open files according to
lsof. These same clusters have 164, 179, and 210 active
connections, respectively. Their schemas, counting the number
of user and system entries in pg_class as a generously rough
measure of potential open files, contain roughly 2000 entries
each. Those open files seem pretty plausible, they're just much
higher than what we see on the older systems.
Hm. AFAICT from the CVS logs, 7.4.2 and later should have about the
same behavior as 8.1.* in this regard. What version is the older
installation exactly?
You can always reduce max_files_per_process if you want more
conservative behavior.
regards, tom lane
"Ed L." <pgsql@bluepolka.net> writes:
One lsof curiosity is that one cluster seems to have its
partition directory listing open about 10K times, including
many times by the same backend process:
Nah, that's just an lsof aberration on HPUX --- it doesn't always tell
the truth about files' names. Notice the NODEs are all different, so
these really are different files. You could use ls -i if you want to
determine what they actually are.
regards, tom lane
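The inode lookup described above (the job ls -i does by hand) could be sketched as follows. This assumes the NODE column from lsof holds inode numbers and that the search stays on one filesystem, since inode numbers are only unique per filesystem. The function name is hypothetical, and the demo uses a temporary directory rather than a real data directory.

```python
import os
import tempfile

def paths_for_inodes(root, inodes):
    """Walk root and return {inode: [paths]} for the requested inode numbers."""
    wanted = set(inodes)
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                ino = os.lstat(path).st_ino
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if ino in wanted:
                found.setdefault(ino, []).append(path)
    return found

# Demo on a throwaway directory containing one empty file.
with tempfile.TemporaryDirectory() as demo_root:
    demo_path = os.path.join(demo_root, "16384")
    open(demo_path, "w").close()
    demo_inode = os.stat(demo_path).st_ino
    demo_result = paths_for_inodes(demo_root, [demo_inode])
    print(demo_result)
```

Against a real cluster, root would be the mount point shown by lsof (here /db1) and the inode list would come from the NODE column.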
On Tuesday March 14 2006 10:46 am, Tom Lane wrote:
"Ed L." <pgsql@bluepolka.net> writes:
We have 3 clusters with 24K, 34K, and 47K open files
according to lsof. These same clusters have 164, 179, and
210 active connections, respectively. Their schemas,
counting the number of user and system entries in pg_class
as a generously rough measure of potential open files,
contain roughly 2000 entries each. Those open files seem
pretty plausible, they're just much higher than what we see
on the older systems.
Hm. AFAICT from the CVS logs, 7.4.2 and later should have
about the same behavior as 8.1.* in this regard. What version
is the older installation exactly?
They are machines each with a mix of 7.3.4, 7.4.6, and 7.4.8.
I'm working on lsof comparison to find specific diffs. It would
seem the factors driving number of open files are current
connections, # of relations, indices, etc. Am I correct about
that?
You can always reduce max_files_per_process if you want more
conservative behavior.
Ah, thanks. I'm not particularly worried about this since the
numbers on the new system somewhat make sense to me. But others
here are concerned, so I'm trying to explain/justify/understand
better. If we want to handle 16 clusters on this one box, each
with 300 max_connections and 2000 relations, would it be
ball-park reasonable to say that worst case we might have 300
backends with ~2000 open file descriptors each (300 * 2000 =
600K open files per cluster, 600K * 16 clusters = 10M open
files)? Increasing the kernel parameter 'nfiles' (max total
open files on system) to something like 10M seems to make some
of the ITRC HP gurus gasp. (I suspect we'll hit I/O limits long
before open files become an issue.)
Ed
"Ed L." <pgsql@bluepolka.net> writes:
If we want to handle 16 clusters on this one box, each
with 300 max_connections and 2000 relations, would it be
ball-park reasonable to say that worst case we might have 300
backends with ~2000 open file descriptors each (300 * 2000 =
600K open files per cluster, 600K * 16 clusters = 10M open
files)?
No, an individual backend should never exceed max_files_per_process open
files (1000 by default). It will feel free to go up that high, though,
if it has reason to touch that many database files over its lifetime.
1000 is probably much higher than you really need for reasonable
performance; I'd be inclined to cut it to a couple hundred at most if
you need to sustain large numbers of backends. I dunno what sort of
penalties the kernel might have for millions of open files but there
probably are some ...
regards, tom lane
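The upper bound implied by this reply can be worked out directly. The sketch below assumes the box runs 16 clusters at a full 300 backends each, and it counts backends only (no postmaster, helper processes, or other programs). The figure of 200 for "a couple hundred" is an illustrative assumption, and the function name is made up.

```python
def worst_case_open_files(clusters, max_connections, max_files_per_process):
    """Upper bound on backend file descriptors: each backend is capped
    at max_files_per_process, whatever the number of relations."""
    return clusters * max_connections * max_files_per_process

default_cap = worst_case_open_files(16, 300, 1000)  # default setting
reduced_cap = worst_case_open_files(16, 300, 200)   # assumed "couple hundred"

print(default_cap)  # 4800000
print(reduced_cap)  # 960000
```

With the default cap the bound is about 4.8M rather than the roughly 10M estimated from 2000 relations per backend, and lowering max_files_per_process to 200 brings it under 1M.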
I try to build 8.1.3 with:
./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam
--enable-thread-safety
It fails the openssl test, saying openssl/ssl.h is unavailable. Digging
deeper, I find that it is because the test program with
#include <openssl/ssl.h>
is failing because it can't include krb5.h.
Based on another post, I tried adding "--with-krb5". That explicitly
aborted with it unable to find krb5.h. I then tried:
./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam
--enable-thread-safety --with-krb5 --with-includes=/usr/kerberos/include
Now it gets past both the openssl and kerberos, but bites the dust with:
configure: error:
*** Thread test program failed. Your platform is not thread-safe.
*** Check the file 'config.log'for the exact reason.
***
*** You can use the configure option --enable-thread-safety-force
*** to force threads to be enabled. However, you must then run
*** the program in src/tools/thread and add locking function calls
*** to your applications to guarantee thread safety.
If I remove the --with-krb5, it works. Why does enabling Kerberos break
threads?
I haven't been able to find any issues in the archives with krb5 and
threads. Am I missing something here?
Wes
Wes,
Did you try running ./configure without "--enable-thread-safety"? I recently
compiled PostgreSQL 8.0.1 on Solaris and _needed_ --enable-thread-safety
strictly for building Slony-I against PostgreSQL with that feature enabled.
What is the reason you are compiling this _with_ the feature?
If it's necessary, then you may need to pass --with-includes= and/or
--with-libs= with additional directories, such as
/usr/include:/usr/include/sys, wherever the thread .h files are for your OS.
This configure attempt could be failing because it can't locate the
correct thread headers and/or libraries.
Wes wrote:
I try to build 8.1.3 with:
./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam
--enable-thread-safety
It fails the openssl test, saying openssl/ssl.h is unavailable. Digging
deeper, I find that it is because the test program with
#include <openssl/ssl.h>
is failing because it can't include krb5.h.
Based on another post, I tried adding "--with-krb5". That explicitly
aborted with it unable to find krb5.h. I then tried:
./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam
--enable-thread-safety --with-krb5 --with-includes=/usr/kerberos/include
Now it gets past both the openssl and kerberos, but bites the dust with:
configure: error:
*** Thread test program failed. Your platform is not thread-safe.
*** Check the file 'config.log'for the exact reason.
***
*** You can use the configure option --enable-thread-safety-force
*** to force threads to be enabled. However, you must then run
*** the program in src/tools/thread and add locking function calls
*** to your applications to guarantee thread safety.
If I remove the --with-krb5, it works. Why does enabling Kerberos break
threads?
I haven't been able to find any issues in the archives with krb5 and
threads. Am I missing something here?
Wes
On 3/14/06 2:55 PM, "Louis Gonzales" <louis.gonzales@linuxlouis.net> wrote:
Did you try running ./configure without "--enable-thread-safety"? I recently
compiled PostgreSQL 8.0.1 on Solaris and _needed_ --enable-thread-safety
strictly for building Slony-I against PostgreSQL with that feature enabled.
What is the reason you are compiling this _with_ the feature?
If it's necessary, then you may need to pass --with-includes= and/or
--with-libs= with additional directories, such as
/usr/include:/usr/include/sys, wherever the thread .h files are for your OS.
This configure attempt could be failing because it can't locate the
correct thread headers and/or libraries.
Why would I not want to specify enable-thread-safety? I want to be able to
write threaded programs.
--enable-thread-safety works fine until I enable --with-krb5, so it is
finding the thread libraries.
Wes