GUC variable for setting number of local buffers

Started by Tom Lane almost 21 years ago, 6 messages
#1 Tom Lane
tgl@sss.pgh.pa.us

We've had a TODO item for some time about allowing the user to set the
size of the local buffer array that's used for accessing temporary
tables. The holdup has been that localbuf.c used very unscalable
algorithms (like linear search) and so a large local buffer set would
have terrible performance anyway. We wanted localbuf.c to duplicate the
shared buffer manager's search and replacement algorithms, which looked
like a lot of work.

However, the recent changes to make the shared buffer manager use a
clock sweep replacement algorithm made it trivial to have localbuf.c
do the same. I have just committed additional changes to make
localbuf.c use a hash table instead of linear search for lookup,
so it's now fully on par with the shared buffer manager as far
as algorithms go.
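
For readers unfamiliar with the technique, here is a minimal sketch of clock sweep victim selection. It is not the code in localbuf.c or bufmgr.c; every name in it (DemoBufDesc, demo_clock_sweep, NBUF, MAX_USAGE) is hypothetical, and the hash-table lookup mentioned above is left out. The idea is that a "hand" walks the buffer array in a circle, decrementing usage counts, and evicts the first unpinned buffer whose count has reached zero.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF      4   /* tiny pool, for illustration only */
#define MAX_USAGE 5   /* assumed cap on usage_count */

/* Hypothetical descriptor; not PostgreSQL's real buffer descriptor. */
typedef struct
{
    int tag;          /* which page is held; -1 means empty */
    int usage_count;  /* bumped on access, decayed by the sweep */
    int pin_count;    /* > 0 means in use, so not evictable */
} DemoBufDesc;

static DemoBufDesc demo_bufs[NBUF];
static int demo_clock_hand = 0;

/* Reset every buffer to empty, unused, unpinned. */
void
demo_init(void)
{
    for (int i = 0; i < NBUF; i++)
    {
        demo_bufs[i].tag = -1;
        demo_bufs[i].usage_count = 0;
        demo_bufs[i].pin_count = 0;
    }
    demo_clock_hand = 0;
}

/*
 * Return the index of a buffer to evict, or -1 if every buffer is pinned.
 * Each visit either skips a pinned buffer, decrements a nonzero usage
 * count, or returns the buffer as the victim.
 */
int
demo_clock_sweep(void)
{
    /* Enough steps to decay every count to zero and make one more pass. */
    int tries = NBUF * (MAX_USAGE + 1);

    while (tries-- > 0)
    {
        int          idx = demo_clock_hand;
        DemoBufDesc *buf = &demo_bufs[idx];

        demo_clock_hand = (demo_clock_hand + 1) % NBUF;

        if (buf->pin_count > 0)
            continue;           /* in use, skip it */
        if (buf->usage_count > 0)
        {
            buf->usage_count--; /* give it another chance */
            continue;
        }
        return idx;             /* unpinned and not recently used */
    }
    return -1;
}
```

A buffer that was touched recently survives one or more passes of the hand, so the sweep approximates LRU without maintaining a list.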

That means we can go ahead with providing a GUC variable to make the
array size user-selectable. I was thinking of calling it either
"local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
(to emphasize the fact that they're used for temporary tables).
Anyone have a preference, or a better alternative?

As far as semantics go, I was thinking of making the variable USERSET
but allowing it to change only as long as you haven't accessed any temp
tables in the current session. Under the hood, we'd postpone calling
InitLocalBuffer() until the first use of temp tables in a session,
at which time the local buffer descriptor array would be allocated,
and henceforth you couldn't change the array size anymore. This would
be enough flexibility to allow temp-table-intensive tasks to run with
a large local setting, without having to make every session do the same.
(It's conceivable that we could support on-the-fly resizing of the
array, but it seems unlikely to be worth the trouble and risk of bugs.)
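
The proposed semantics can be sketched roughly as follows. This is only an illustration under assumed names (demo_set_temp_buffers, demo_ensure_local_buffers, and the variables are hypothetical, not the real GUC machinery or InitLocalBuffer()): the setting may change freely until the descriptor array has been allocated, and is rejected afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the setting and the session state. */
static int  demo_num_temp_buffers = 1000;   /* user-settable value */
static bool demo_local_bufs_inited = false; /* array allocated yet? */
static int *demo_descriptors = NULL;        /* placeholder descriptor array */

/*
 * Try to change the setting. Returns true if the new value was accepted,
 * false if it is invalid or the array has already been allocated.
 */
bool
demo_set_temp_buffers(int newval)
{
    if (demo_local_bufs_inited)
        return false;           /* too late: temp tables already used */
    if (newval <= 0)
        return false;
    demo_num_temp_buffers = newval;
    return true;
}

/*
 * Called on the first temp-table access in a session. Allocates the
 * descriptor array at the current setting; later calls do nothing.
 */
bool
demo_ensure_local_buffers(void)
{
    if (demo_local_bufs_inited)
        return true;

    demo_descriptors = calloc((size_t) demo_num_temp_buffers, sizeof(int));
    if (demo_descriptors == NULL)
        return false;

    demo_local_bufs_inited = true;
    return true;
}
```

A session that wants a large setting would call the setter first and only then touch a temp table; any attempt to change the value after that point fails and leaves the original size in place.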

It's already true that the individual buffers, as opposed to the buffer
descriptors, are allocated only as needed; which makes the overhead
of a large local_buffers setting pretty small if you don't actually do
much with temp tables in a given session. So I was thinking about
making the default value fairly robust, maybe 1000 (as compared to
the historical value of 64...).
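
To make the overhead argument concrete, here is a small sketch of descriptors allocated up front with the data blocks allocated on first use. The names (DemoLocalPool, demo_pool_get_block, DEMO_BLOCK_SIZE) are hypothetical, and the 8192-byte block assumes the default 8 kB page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_BLOCK_SIZE 8192    /* assumes the default 8 kB page size */

/* Hypothetical descriptor: small, allocated for every slot up front. */
typedef struct
{
    char *block;                /* NULL until the slot is first used */
} DemoLocalBuf;

typedef struct
{
    DemoLocalBuf *descs;
    int           nbufs;
    int           nallocated;   /* number of blocks actually allocated */
} DemoLocalPool;

/* Allocate only the descriptor array. Returns 0 on success, -1 on failure. */
int
demo_pool_init(DemoLocalPool *pool, int nbufs)
{
    pool->descs = malloc((size_t) nbufs * sizeof(DemoLocalBuf));
    if (pool->descs == NULL)
        return -1;
    for (int i = 0; i < nbufs; i++)
        pool->descs[i].block = NULL;
    pool->nbufs = nbufs;
    pool->nallocated = 0;
    return 0;
}

/* Return the data block for slot i, allocating it on first use. */
char *
demo_pool_get_block(DemoLocalPool *pool, int i)
{
    assert(i >= 0 && i < pool->nbufs);
    if (pool->descs[i].block == NULL)
    {
        pool->descs[i].block = malloc(DEMO_BLOCK_SIZE);
        if (pool->descs[i].block == NULL)
            return NULL;
        pool->nallocated++;
    }
    return pool->descs[i].block;
}

/* Bytes currently consumed by data blocks (descriptors not counted). */
size_t
demo_pool_block_bytes(const DemoLocalPool *pool)
{
    return (size_t) pool->nallocated * DEMO_BLOCK_SIZE;
}
```

Under that 8 kB assumption, a session that fills all 1000 buffers would hold 1000 × 8192 = 8,192,000 bytes of blocks (about 7.8 MiB), against 64 × 8192 = 524,288 bytes (512 KiB) at the historical setting. A session that never touches a temp table pays only for the descriptor array.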

Comments?

regards, tom lane

#2 Marc G. Fournier
scrappy@postgresql.org
In reply to: Tom Lane (#1)
Re: GUC variable for setting number of local buffers

On Sat, 19 Mar 2005, Tom Lane wrote:

That means we can go ahead with providing a GUC variable to make the
array size user-selectable. I was thinking of calling it either
"local_buffers" (in contrast to "shared_buffers") or "temp_buffers" (to
emphasize the fact that they're used for temporary tables). Anyone have
a preference, or a better alternative?

temp_buffers sounds more descriptive ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#3 Markus Bertheau
twanger@bluetwanger.de
In reply to: Tom Lane (#1)
Re: GUC variable for setting number of local buffers

On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:

It's already true that the individual buffers, as opposed to the buffer
descriptors, are allocated only as needed; which makes the overhead
of a large local_buffers setting pretty small if you don't actually do
much with temp tables in a given session. So I was thinking about
making the default value fairly robust, maybe 1000 (as compared to
the historical value of 64...).

Why does the dba need to set that variable at all then?

--
Markus Bertheau <twanger@bluetwanger.de>

#4 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Markus Bertheau (#3)
Re: GUC variable for setting number of local buffers

Markus Bertheau wrote:

On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:

It's already true that the individual buffers, as opposed to the buffer
descriptors, are allocated only as needed; which makes the overhead
of a large local_buffers setting pretty small if you don't actually do
much with temp tables in a given session. So I was thinking about
making the default value fairly robust, maybe 1000 (as compared to
the historical value of 64...).

Why does the dba need to set that variable at all then?

It is like sort_mem: it is local memory, but it is limited so that a
single backend does not exhaust all the RAM on the machine.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#5 Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#1)
Re: GUC variable for setting number of local buffers

On Sat, 2005-03-19 at 12:57 -0500, Tom Lane wrote:

That means we can go ahead with providing a GUC variable to make the
array size user-selectable. I was thinking of calling it either
"local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
(to emphasize the fact that they're used for temporary tables).
Anyone have a preference, or a better alternative?

Comments?

All of that is good news...

Currently, we already have a GUC that describes the amount of memory we
can use for a backend, work_mem. Would it not be possible to continue to
use that setting and resize the temp_buffers area as needed so that
work_mem was not exceeded - and so we need not set local_temp_buffers?

It will become relatively hard to judge how to set work_mem and
local_temp_buffers for larger queries, and almost impossible to do that
in a multi-user system. To do that, we would need some additional
feedback that could be interpreted so as to judge how large to set
these. Perhaps EXPLAIN ANALYZE could report local buffer and memory
usage? It would be much better if we could decide how best to use
work_mem according to the query plan that is just about to be executed,
then set all areas accordingly. After all, not all queries would use
both limits simultaneously.

This is, of course, a nice problem to have. :-)

If we must have a GUC, local_temp_buffers works better for me.
local_buffers is my second choice because it matches the terminology
used everywhere in the code and also because temp_buffers sounds like it
is a global setting, which it would not be.

Best Regards, Simon Riggs

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Bertheau (#3)
Re: GUC variable for setting number of local buffers

Markus Bertheau <twanger@bluetwanger.de> writes:

It's already true that the individual buffers, as opposed to the buffer
descriptors, are allocated only as needed; which makes the overhead
of a large local_buffers setting pretty small if you don't actually do
much with temp tables in a given session. So I was thinking about
making the default value fairly robust, maybe 1000 (as compared to
the historical value of 64...).

Why does the dba need to set that variable at all then?

Because you do have to have a limit. You want the thing trying to keep
all of a large temp table in core?

regards, tom lane