Is it better to use OS cache or max out memory usage of PostgreSQL?
Hello all,
I think I've figured out that posting to the newsgroup doesn't work. :-)
Which is more efficient: to set up PostgreSQL to use as much system RAM
as possible (for example, allocating 800MB for PostgreSQL on a 1GB
system), or to keep PostgreSQL memory usage to something like 25% of RAM
and depend on the OS (in this case, Linux) file system cache to optimize
memory usage?
At first glance, it would seem that depending on the OS file system
cache would not be the way to go... If PostgreSQL needs data and that
data is not in PostgreSQL's shared memory, then PostgreSQL must go out
and retrieve the data from the filesystem (I'm guessing). If the OS has
cached the data, then the data is effectively copied from the OS cache
into PostgreSQL's memory... not only is there the overhead of copying
the data, but now there are two copies of it in memory. However, I have
been told in the past that it is indeed better to keep PostgreSQL memory
usage small and depend more on the OS file system cache. What is the
right balance?
Thanks in advance,
Andy
On Wed, May 15, 2002 at 10:17:43AM -0400, Andy DePue wrote:
> Hello all,
> I think I've figured out that posting to the newsgroup doesn't work. :-)
> Which is more efficient: to set up PostgreSQL to use as much system RAM
> as possible (for example, allocating 800MB for PostgreSQL on a 1GB
> system), or to keep PostgreSQL memory usage to something like 25% of RAM
> and depend on the OS (in this case, Linux) file system cache to optimize
> memory usage?
The usual suggestion is about 25% of your physical memory for shared
buffers. You can keep going up as long as you don't start paging,
but consider _all_ the cases your machines might be used under, and
not just the ideal conditions. If you start paging, you'll kill
performance. Consider what happens when you dump the database, for
instance, and you'll see that dedicating too much memory to shared
buffers can cause swapping occasionally. Tom Lane has argued that
you shouldn't use anything more than about 25% of physical memory for
shared buffers, either; see
<http://archives.postgresql.org/pgsql-general/2001-07/msg00464.php>
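For concreteness, here is a back-of-the-envelope sketch of what that 25%
rule of thumb works out to on the 1GB machine from the question.
shared_buffers is counted in 8kB pages by default; the figures are just
this thread's example, not a recommendation for any particular system:

```python
# Sketch of the "25% of physical RAM" rule of thumb for shared_buffers.
# shared_buffers is counted in 8kB pages by default, so 25% of 1GB of
# physical RAM comes out to 32768 buffers.

RAM_BYTES = 1 * 1024**3          # the 1GB example system from the question
PAGE_SIZE = 8 * 1024             # PostgreSQL's default block size (8kB)

target_bytes = RAM_BYTES // 4    # 25% of physical memory
shared_buffers = target_bytes // PAGE_SIZE

print(shared_buffers)            # -> 32768 (a value for postgresql.conf)
```

That would put shared_buffers = 32768 in postgresql.conf, versus roughly
four times that for the 800MB figure in the original question.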
> At first glance, it would seem that depending on the OS file system
> cache would not be the way to go... If PostgreSQL needs data and that
> data is not in PostgreSQL's shared memory, then PostgreSQL must go out
> and retrieve the data from the filesystem (I'm guessing). If the OS has
> cached the data, then the data is effectively copied from the OS cache
> into PostgreSQL's memory... not only is there the overhead of copying
> the data, but now there are two copies of it in memory. However, I have
> been told in the past that it is indeed better to keep PostgreSQL memory
> usage small and depend more on the OS file system cache.
This depends on the system you're using, the efficiency of its
filesystem cache, the likelihood that you'll be retrieving data in
memory, and other such variables. For instance, if you're mostly
writing into the database and not reading it, having a lot of stuff
in your cache isn't going to help. My early tests on Solaris 7
indicated to me that, given the filesystem cache I had, there was
almost nothing to be gained by having very large shared buffers.
Some recent blips have made me try some additional tests, and given
the most recent use patterns in the database, I'm now thinking that I
should increase the size of our shared buffers.
Tom Lane has argued that filesystem buffers ought to be very nearly
as fast as shared memory, at least after a reasonably large shared
buffer is configured (see
<http://archives.postgresql.org/pgsql-hackers/2001-11/msg00669.php>).
We've been doing some tests lately that suggest that copying data
from filesystem buffers to the shared buffer imposes a noticeable
penalty, at least on Solaris 7. But the same pattern doesn't show up
on FreeBSD; hence my suggestion that it's OS-sensitive.
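As a rough illustration of the copy penalty being discussed (this is not
a benchmark of PostgreSQL itself), one can compare re-reading a file the
OS has just cached against copying the same data already held in process
memory; the size of the gap between the two is exactly the kind of thing
that varies from one OS to another:

```python
# Rough sketch only: time a read served from the OS filesystem cache
# versus a copy of data already in process memory. The difference
# approximates the copy-from-OS-cache overhead discussed above; its
# magnitude depends heavily on the OS and filesystem.

import os
import tempfile
import time

SIZE = 16 * 1024 * 1024  # 16MB test file

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(SIZE))
    path = f.name

# Read once to warm the OS cache, and keep an in-process copy.
with open(path, "rb") as f:
    in_memory = f.read()

t0 = time.perf_counter()
with open(path, "rb") as f:       # served from the OS cache, but still copied
    from_cache = f.read()
cache_time = time.perf_counter() - t0

t0 = time.perf_counter()
local_copy = bytes(bytearray(in_memory))  # pure in-memory copy
copy_time = time.perf_counter() - t0

print(f"OS-cache read: {cache_time:.4f}s  in-memory copy: {copy_time:.4f}s")
os.remove(path)
```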
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110