Still problems with memory swapping and server load
Hi!
I'm still "being hosed over big time" as Curt Sampson put it. It's still
the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
a humble 8MB of swap in use (I expect that's just the empty swap with
nothing in it but some system overhead). Then after a short time, memory
usage climbs slowly but continuously until it hits the physical RAM ceiling
and spills into swap - with not very nice results for the database.
Swap usage sometimes amounts to 200MB or more.
I altered the postgresql.conf settings as recommended:
max_connections = 128
shared_buffers = 32768
sort_mem = 8192 (16384 or 32768 didn't help either)
wal_files = 32
wal_buffers = 32
fsync = false
Everything else is commented out or left at its default setting.
This is what top gives me:
2:41pm up 6 days, 3:51, 3 users, load average: 4.41, 2.71, 1.89
114 processes: 105 sleeping, 9 running, 0 zombie, 0 stopped
CPU0 states: 60.2% user, 14.3% system, 0.0% nice, 24.5% idle
CPU1 states: 81.0% user, 14.5% system, 0.0% nice, 4.0% idle
CPU2 states: 77.0% user, 12.3% system, 0.0% nice, 10.2% idle
CPU3 states: 71.1% user, 13.3% system, 0.0% nice, 15.0% idle
Mem: 1029400K av, 1023264K used, 6136K free, 0K shrd, 7176K buff
Swap: 2097136K av, 49656K used, 2047480K free, 819052K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
6848 postgres 16 0 247M 247M 246M R 93.6 24.6 4:06 postmaster
6928 postgres 9 0 13836 13M 13016 S 25.8 1.3 0:13 postmaster
6926 postgres 9 0 14572 14M 13756 S 23.8 1.4 0:13 postmaster
6920 postgres 10 0 14296 13M 13476 R 21.1 1.3 0:13 postmaster
6874 postgres 9 0 49408 48M 43168 S 19.8 4.7 3:57 postmaster
6911 postgres 9 0 66840 65M 65728 S 19.4 6.4 0:01 postmaster
6917 postgres 9 0 9108 9104 8204 R 19.4 0.8 0:13 postmaster
6875 postgres 11 0 41564 40M 35324 R 18.7 4.0 3:31 postmaster
6927 postgres 10 0 14148 13M 13328 R 17.4 1.3 0:12 postmaster
6889 postgres 9 0 24844 24M 23632 S 15.8 2.4 0:17 postmaster
6903 postgres 12 0 204M 204M 203M R 10.2 20.3 0:27 postmaster
6837 postgres 9 0 251M 251M 250M S 9.8 25.0 0:37 postmaster
5929 postgres 15 0 940 884 668 R 8.9 0.0 8:23 top
6934 root 16 0 976 976 732 R 8.0 0.0 0:07 top
6852 postgres 9 0 227M 227M 226M R 7.8 22.6 0:12 postmaster
6897 postgres 9 0 14988 14M 13948 S 6.0 1.4 0:01 postmaster
6838 postgres 9 0 18364 17M 17304 S 5.6 1.7 0:04 postmaster
6845 postgres 9 0 52344 51M 50916 S 3.6 5.0 0:09 postmaster
6834 postgres 9 0 25456 24M 24356 S 3.0 2.4 0:26 postmaster
6894 postgres 9 0 247M 247M 246M S 2.1 24.6 0:27 postmaster
6907 postgres 9 0 16020 15M 14992 S 1.8 1.5 0:03 postmaster
6904 postgres 9 0 16604 16M 15528 S 1.0 1.6 0:13 postmaster
4799 root 9 0 1820 1444 1300 S 0.1 0.1 0:07 sshd
6893 postgres 9 0 18396 17M 17332 S 0.1 1.7 0:07 postmaster
6916 postgres 9 0 8940 8936 8020 S 0.1 0.8 0:08 postmaster
cat /proc/sys/kernel/shmmax gives 323380838, and ipcs -m returns:
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 32768 root 600 1056768 3 dest
0x00000000 98305 root 600 33554432 3 dest
0x00000000 131074 wwwrun 600 368644 3 dest
0x0052e2c1 2457603 postgres 600 274554880 54
0x07021999 229380 root 644 1104 2
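As a sanity check on those numbers (my sketch, not from the thread): the 274554880-byte segment owned by postgres is almost exactly what shared_buffers accounts for, assuming PostgreSQL's default 8 KB block size.

```shell
# Sanity check: the big postgres segment in ipcs vs. shared_buffers.
# Assumes the default 8 KB block size (BLCKSZ).
buffers=32768
blcksz=8192
echo $(( buffers * blcksz ))              # 268435456 bytes = 256 MB of buffers
echo $(( 274554880 - buffers * blcksz )) # ~6 MB of other shared structures
```

So the shared segment itself appears to be sized exactly as configured; any further growth has to be private backend memory.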
Any ideas? Some other way out than "get more RAM"? How effective can I
expect more RAM to be in my situation - wouldn't it just fill up and
overflow into swap like the first 1GB? How come system processing time
gets so high? I followed the other suggestions and tracked slow queries
using logging - now there's hardly anything left with >0.5s execution
time, but the backends still seem to run amok.
As always, very grateful for any help.
Regards,
Markus
On Wed, 26 Jun 2002, Markus Wollny wrote:
> the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
> ~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
> a humble 8MB swap being used (I expect that's just the empty swap with
> nothing in it but some system overhead). Then after a short time, memory
> usage climbs slow but continuously until it hits physical RAM ceiling
> and starts using swap - with not very nice results for the database.
> Swap sometimes amounts to 200MB or more.
Also use "vmstat", "systat vmstat" or whatever your system's
equivalent is to see just how much swapping you're doing. I wouldn't
be surprised to see some unused programs being pushed out to swap
as you do a lot of I/O, but if you're pushing stuff out to swap
and bringing it back in on a regular basis, you've still got
problems.
Also, remember, your OS may consider reading a program binary when
you run a program to be "page in" activity, so don't get too worried
about that, unless you also see page-out activity.
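A concrete way to follow this suggestion on Linux (a sketch; column names differ between vmstat implementations):

```shell
# Sample paging activity every second, five times. The "si"/"so" columns
# are KB/s swapped in from and out to disk; sustained non-zero "so" means
# live pages are being pushed out, not just binaries being paged in.
vmstat 1 5
```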
> max_connections = 128
> shared_buffers = 32768
> sort_mem = 8192 (16384 or 32768 didn't help either)
> wal_files = 32
> wal_buffers = 32
> fsync = false
That looks good. Nothing should be using terribly much memory now.
> Mem: 1029400K av, 1023264K used, 6136K free, 0K shrd, 7176K buff
Ok, with only 7176K allocated to buffers, you've definitely got
some programs eating up your RAM, I'd say. Looking at your postmasters
below, sorted by size:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
6837 postgres 9 0 251M 251M 250M S 9.8 25.0 0:37 postmaster
6894 postgres 9 0 247M 247M 246M S 2.1 24.6 0:27 postmaster
6848 postgres 16 0 247M 247M 246M R 93.6 24.6 4:06 postmaster
6852 postgres 9 0 227M 227M 226M R 7.8 22.6 0:12 postmaster
6903 postgres 12 0 204M 204M 203M R 10.2 20.3 0:27 postmaster
6911 postgres 9 0 66840 65M 65728 S 19.4 6.4 0:01 postmaster
6845 postgres 9 0 52344 51M 50916 S 3.6 5.0 0:09 postmaster
6874 postgres 9 0 49408 48M 43168 S 19.8 4.7 3:57 postmaster
6875 postgres 11 0 41564 40M 35324 R 18.7 4.0 3:31 postmaster
6834 postgres 9 0 25456 24M 24356 S 3.0 2.4 0:26 postmaster
6889 postgres 9 0 24844 24M 23632 S 15.8 2.4 0:17 postmaster
6893 postgres 9 0 18396 17M 17332 S 0.1 1.7 0:07 postmaster
6838 postgres 9 0 18364 17M 17304 S 5.6 1.7 0:04 postmaster
6904 postgres 9 0 16604 16M 15528 S 1.0 1.6 0:13 postmaster
6907 postgres 9 0 16020 15M 14992 S 1.8 1.5 0:03 postmaster
6897 postgres 9 0 14988 14M 13948 S 6.0 1.4 0:01 postmaster
6926 postgres 9 0 14572 14M 13756 S 23.8 1.4 0:13 postmaster
6920 postgres 10 0 14296 13M 13476 R 21.1 1.3 0:13 postmaster
6927 postgres 10 0 14148 13M 13328 R 17.4 1.3 0:12 postmaster
6928 postgres 9 0 13836 13M 13016 S 25.8 1.3 0:13 postmaster
6917 postgres 9 0 9108 9104 8204 R 19.4 0.8 0:13 postmaster
6916 postgres 9 0 8940 8936 8020 S 0.1 0.8 0:08 postmaster
4799 root 9 0 1820 1444 1300 S 0.1 0.1 0:07 sshd
6934 root 16 0 976 976 732 R 8.0 0.0 0:07 top
5929 postgres 15 0 940 884 668 R 8.9 0.0 8:23 top
Some of your backends are getting pretty darn big. I wonder what
they're doing? It can't be sort memory at this point. But as you
can see, those five 200-250MB backends are killing you.
If you can figure out what they're doing, and put a stop to that
memory usage, that would help you. Alternatively, perhaps just
dropping another 1-2 GB of RAM in the machine would fix your problem.
Also, for this kind of thing, it's better to provide a "ps aux" or
"ps -ef" than a top, unless you're sure that the display above shows
all of the processes.
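For reference, something like this gives the full picture being asked for (the --sort flag is GNU ps; the BSD variant is an assumption on my part):

```shell
# Full process listing, largest resident sets first, so nothing that top
# truncates off-screen can hide. On BSD-ish systems: ps aux | sort -rnk6
ps aux --sort=-rss | head -n 20
```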
cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
"Markus Wollny" <Markus.Wollny@computec.de> writes:
> I'm still "being hosed over big time" as Curt Sampson put it. It's still
> the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
> ~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
> a humble 8MB swap being used (I expect that's just the empty swap with
> nothing in it but some system overhead). Then after a short time, memory
> usage climbs slow but continuously until it hits physical RAM ceiling
> and starts using swap - with not very nice results for the database.
It sort of looks like you are seeing a memory-leak problem. I thought
we'd largely eliminated that class of trouble in recent releases, but
maybe there's still one or two left. Can you identify the exact query
or queries that cause individual backends' memory usage to grow?
regards, tom lane
On Wed, Jun 26, 2002 at 02:50:56PM +0200, Markus Wollny wrote:
> Hi!
> I'm still "being hosed over big time" as Curt Sampson put it. It's still
> the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
> ~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
> a humble 8MB swap being used (I expect that's just the empty swap with
> nothing in it but some system overhead). Then after a short time, memory
> usage climbs slow but continuously until it hits physical RAM ceiling
> and starts using swap - with not very nice results for the database.
> Swap sometimes amounts to 200MB or more.
Well, there's your problem. As soon as you swap, all performance goes out
the window. Looking below, there are five processes using over 200MB of RAM
each. Look up the PIDs in the log file to see what queries they are running.
Sounds like you have one whopping big query dragging everything down.
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 6848 postgres 16 0 247M 247M 246M R 93.6 24.6 4:06 postmaster
> 6903 postgres 12 0 204M 204M 203M R 10.2 20.3 0:27 postmaster
> 6837 postgres 9 0 251M 251M 250M S 9.8 25.0 0:37 postmaster
> 6852 postgres 9 0 227M 227M 226M R 7.8 22.6 0:12 postmaster
> 6894 postgres 9 0 247M 247M 246M S 2.1 24.6 0:27 postmaster
> Any ideas? Some other way out than "get more RAM"? How effective can I
> expect more RAM to be in my situation - wouldn't it just clutter up and
> overflow into swap like the first 1GB? How come system-processingtime
> gets so high? I followed the other suggestions and tracked slow queries
> using logging - now there's hardly anything left with >0.5s execution
> time elapsed, but the backends still seem to run amok.
Well, that first postmaster there seems to have been going for 4 minutes
already. Postgres has various stats functions to allow you to work out what
query each backend is currently executing. Are you sure you don't have an
unconstrained join or something?
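The stats functions mentioned here can be queried like this (a sketch for PostgreSQL 7.2; it assumes stats_start_collector and stats_command_string are both enabled in postgresql.conf, otherwise current_query stays empty):

```shell
# Map a fat PID from top to the statement its backend is running.
# 6848 is the hot backend from the top listing above.
psql -d template1 -c \
  "SELECT procpid, current_query FROM pg_stat_activity WHERE procpid = 6848;"
```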
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
There are 10 kinds of people in the world, those that can do binary
arithmetic and those that can't.
Hi!
I don't know exactly how to find the offending queries. All I was able
to come up with is to check the top output, nail down the PID and then scan
the logfile for some queries - but of course there are lots of
queries using this very PID one after another. How do I determine the
details? Right now all I could see was that all the queries were using
the begin|declare sql_cursor|fetch|close sql_cursor|end pattern induced
by the ODBC driver, I presume. How do I pinpoint the specific offender?
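One low-tech way to pinpoint it (my sketch, not advice from the thread): sample the suspect backend's resident size once a second and line the timestamps up against the query log, so the statement running when the size jumps identifies itself. The PID and log file name below are placeholders for your setup.

```shell
PID=6848   # a suspect backend from top
while kill -0 "$PID" 2>/dev/null; do
    # Second field of /proc/<pid>/statm is resident memory in pages.
    rss_pages=$(awk '{print $2}' "/proc/$PID/statm")
    echo "$(date '+%H:%M:%S') rss_pages=$rss_pages"
    sleep 1
done
# Afterwards, correlate the timestamps with the statements logged for
# that PID, e.g.: grep "\[$PID\]" postgresql.log
```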
Regards,
Markus
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, June 26, 2002 16:59
To: Markus Wollny
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Still problems with memory swapping and server load

> "Markus Wollny" <Markus.Wollny@computec.de> writes:
>> I'm still "being hosed over big time" as Curt Sampson put it. It's still
>> the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
>> ~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
>> a humble 8MB swap being used (I expect that's just the empty swap with
>> nothing in it but some system overhead). Then after a short time, memory
>> usage climbs slow but continuously until it hits physical RAM ceiling
>> and starts using swap - with not very nice results for the database.
>
> It sort of looks like you are seeing a memory-leak problem. I thought
> we'd largely eliminated that class of trouble in recent releases, but
> maybe there's still one or two left. Can you identify the exact query
> or queries that cause individual backends' memory usage to grow?
>
> regards, tom lane
Hi,
-- Markus Wollny <Markus.Wollny@computec.de> wrote:
> Mem: 1029400K av, 1023264K used, 6136K free, 0K shrd, 7176K buff
you are still using too much RAM; there are only 7 MB left for OS caching,
which is really not enough!
> shared_buffers = 32768
256 MB shared memory; this might be OK, if you don't run other large
applications (e.g. a big mod_perl-enabled Apache).
> sort_mem = 8192 (16384 or 32768 didn't help either)
Each sort can take up to 8 MB RAM; if you have queries which need
e.g. 3 big sorts each, and 30 of them run in parallel, that takes ~720 MB RAM.
Try to reduce a) the memory consumption of other applications, and b) the
memory consumption of postgres: if you have a lot of big sorts, try to reduce
them in your application or reduce sort_mem; also, 20000 shared buffers is
usually enough.
Also, sort buffers spilling to disk should NOT be on a RAID 5 array
(slower writes).
In my experience, the OS should have at least 200 MB of cache on a 1 GB
machine.
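These suggestions amount to a postgresql.conf fragment roughly like the following; the sort_mem value is an illustrative assumption of mine, not a number given in the thread.

```
shared_buffers = 20000   # ~156 MB instead of 256 MB; leaves RAM for OS caching
sort_mem = 4096          # halve per-sort memory if the app runs many big sorts
```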
Ciao
Alvar, Just my ideas ... :-)
--
// Unterschreiben! http://www.odem.org/informationsfreiheit/
// Internet am Telefon: http://www.teletrust.info/
// Das freieste Medium? http://www.odem.org/insert_coin/
// Blaster: http://www.assoziations-blaster.de/
-- Curt Sampson <cjs@cynic.net> wrote:
> Some of your backends are getting pretty darn big. I wonder what
> they're doing? It can't be sort memory at this point. But as you
> can see, those five 200-250MB backends are killing you.
no, not really: they use shared memory:
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 6837 postgres 9 0 251M 251M 250M S 9.8 25.0 0:37 postmaster
> 6894 postgres 9 0 247M 247M 246M S 2.1 24.6 0:27 postmaster
> 6848 postgres 16 0 247M 247M 246M R 93.6 24.6 4:06 postmaster
> 6852 postgres 9 0 227M 227M 226M R 7.8 22.6 0:12 postmaster
> 6903 postgres 12 0 204M 204M 203M R 10.2 20.3 0:27 postmaster
> 6911 postgres 9 0 66840 65M 65728 S 19.4 6.4 0:01 postmaster
> 6845 postgres 9 0 52344 51M 50916 S 3.6 5.0 0:09 postmaster
> 6874 postgres 9 0 49408 48M 43168 S 19.8 4.7 3:57 postmaster
> 6875 postgres 11 0 41564 40M 35324 R 18.7 4.0 3:31 postmaster
> 6834 postgres 9 0 25456 24M 24356 S 3.0 2.4 0:26 postmaster
> 6889 postgres 9 0 24844 24M 23632 S 15.8 2.4 0:17 postmaster
> 6893 postgres 9 0 18396 17M 17332 S 0.1 1.7 0:07 postmaster
> [...]
So it seems that some other processes eat the memory.
Ciao
Alvar
--
// Unterschreiben! http://www.odem.org/informationsfreiheit/
// Internet am Telefon: http://www.teletrust.info/
// Das freieste Medium? http://www.odem.org/insert_coin/
// Blaster: http://www.assoziations-blaster.de/
Alvar Freude <alvar@a-blast.org> writes:
> -- Curt Sampson <cjs@cynic.net> wrote:
>> Some of your backends are getting pretty darn big. I wonder what
>> they're doing? It can't be sort memory at this point. But as you
>> can see, those five 200-250MB backends are killing you.
> no, not really: they use shared memory:
No, because all the backends will be accessing the *exact same* shared
memory; it's impossible for some to be using more than others. Since we
see process sizes ranging from 17M to 251M, it's a pretty safe bet that
the latter guys are actually chewing a lot of private space. I don't
know what the "SHARE" column means in that top display, but I wouldn't
trust it.
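This suspicion can be cross-checked without trusting top's SHARE column: on Linux, /proc/<pid>/statm reports resident and shared pages straight from the kernel, so private memory is roughly their difference (a sketch; the PID is a placeholder and a 4 KB page size is assumed - check with getconf PAGESIZE).

```shell
PID=6837   # one of the ~250 MB backends from the listing above
# Fields of /proc/<pid>/statm: total size, resident, shared, ... (in pages).
read size resident shared rest < "/proc/$PID/statm"
echo "private ~ $(( (resident - shared) * 4 / 1024 )) MB"
```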
regards, tom lane
On Wed, 26 Jun 2002, Alvar Freude wrote:
>> sort_mem = 8192 (16384 or 32768 didn't help either)
> Each sort can take up to 8 MB RAM; if you have some queries which needs
> e.g. 3 big sorts and have 30 from this in parallel, this takes ~720 MB RAM.
Not quite. As I mentioned earlier, postgres 7.2 actually allocates
between 2.5 and three times that amount of memory. So in your
example, you'd have 30 * 3 * 8 MB * 2.5, or more like 1.8 GB of memory. :-)
cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC