Thread: GUC variable for setting number of local buffers

From
Tom Lane
Date:
We've had a TODO item for some time about allowing the user to set the
size of the local buffer array that's used for accessing temporary
tables.  The holdup has been that localbuf.c used very unscalable
algorithms (like linear search) and so a large local buffer set would
have terrible performance anyway.  We wanted localbuf.c to duplicate the
shared buffer manager's search and replacement algorithms, which looked
like a lot of work.

However, the recent changes to make the shared buffer manager use a
clock sweep replacement algorithm made it trivial to have localbuf.c
do the same.  I have just committed additional changes to make
localbuf.c use a hash table instead of linear search for lookup,
so it's now fully on par with the shared buffer manager as far
as algorithms go.
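
As a rough illustration of the clock sweep replacement mentioned above, here
is a minimal sketch in C.  It is not the code from localbuf.c: every name in
it (SketchBuf, sketch_clock_sweep, the pool size, the usage-count cap) is an
assumption made for the example, and the hash-table lookup is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NBUFFERS  4   /* hypothetical tiny pool, for illustration only */
#define SKETCH_MAX_USAGE 5   /* assumed cap on the per-buffer usage count */

typedef struct
{
    int tag;          /* which block is cached here; -1 means empty */
    int usage_count;  /* raised on access, lowered by the sweep */
    int pinned;       /* nonzero while in use; pinned buffers are skipped */
} SketchBuf;

static SketchBuf sketch_bufs[SKETCH_NBUFFERS];
static int       sketch_hand = 0;   /* current position of the clock hand */

/* Reset the pool to the empty state. */
static void
sketch_init(void)
{
    for (int i = 0; i < SKETCH_NBUFFERS; i++)
    {
        sketch_bufs[i].tag = -1;
        sketch_bufs[i].usage_count = 0;
        sketch_bufs[i].pinned = 0;
    }
    sketch_hand = 0;
}

/*
 * Advance the clock hand until an unpinned buffer with usage_count == 0
 * is found.  Each unpinned buffer passed over has its count decremented,
 * so recently used buffers survive a few more sweeps.
 * Returns the victim's index, or -1 if every buffer is pinned.
 */
static int
sketch_clock_sweep(void)
{
    /* Enough steps to drain the largest allowed count on every buffer. */
    int tries = SKETCH_NBUFFERS * (SKETCH_MAX_USAGE + 1);

    while (tries-- > 0)
    {
        int        idx = sketch_hand;
        SketchBuf *buf = &sketch_bufs[idx];

        sketch_hand = (sketch_hand + 1) % SKETCH_NBUFFERS;

        if (buf->pinned)
            continue;               /* in use, cannot be evicted */
        if (buf->usage_count > 0)
        {
            buf->usage_count--;     /* give it another chance */
            continue;
        }
        return idx;                 /* unpinned and cold: evict this one */
    }
    return -1;
}
```

The sweep needs no list maintenance on each access, only a counter bump,
which is why it is cheap to share between the two buffer managers.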

That means we can go ahead with providing a GUC variable to make the
array size user-selectable.  I was thinking of calling it either
"local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
(to emphasize the fact that they're used for temporary tables).
Anyone have a preference, or a better alternative?

As far as semantics go, I was thinking of making the variable USERSET
but allowing it to change only as long as you haven't accessed any temp
tables in the current session.  Under the hood, we'd postpone calling
InitLocalBuffer() until the first use of temp tables in a session,
at which time the local buffer descriptor array would be allocated,
and henceforth you couldn't change the array size anymore.  This would
be enough flexibility to allow temp-table-intensive tasks to run with
a large local setting, without having to make every session do the same.
(It's conceivable that we could support on-the-fly resizing of the
array, but it seems unlikely to be worth the trouble and risk of bugs.)
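
The proposed semantics (changeable until the first temp-table access, frozen
afterwards) could look roughly like the following sketch.  It is only an
illustration under assumed names: sketch_set_temp_buffers,
sketch_ensure_local_buffers and the descriptor struct are hypothetical and
are not PostgreSQL's GUC machinery or the actual InitLocalBuffer() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical descriptor; the real one carries far more state. */
typedef struct
{
    int block_no;   /* which block this slot caches; -1 means unused */
} SketchLocalDesc;

static int              sketch_temp_buffers = 1000;  /* the session setting */
static SketchLocalDesc *sketch_descs = NULL;         /* NULL until first use */
static int              sketch_ndescs = 0;

/*
 * Try to change the setting.  Succeeds only while the descriptor array
 * has not been allocated yet, i.e. before any temp-table access.
 */
static bool
sketch_set_temp_buffers(int newval)
{
    if (newval <= 0)
        return false;
    if (sketch_descs != NULL)
        return false;           /* array already sized: too late to change */
    sketch_temp_buffers = newval;
    return true;
}

/*
 * Called before every temp-table buffer access.  The first call allocates
 * the descriptor array at the currently configured size; later calls are
 * no-ops.
 */
static bool
sketch_ensure_local_buffers(void)
{
    if (sketch_descs != NULL)
        return true;

    sketch_descs = malloc((size_t) sketch_temp_buffers * sizeof(SketchLocalDesc));
    if (sketch_descs == NULL)
        return false;

    sketch_ndescs = sketch_temp_buffers;
    for (int i = 0; i < sketch_ndescs; i++)
        sketch_descs[i].block_no = -1;
    return true;
}
```

A session that never touches a temp table never pays for the array, and a
session that needs a large one can raise the setting before its first
temp-table access.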

It's already true that the individual buffers, as opposed to the buffer
descriptors, are allocated only as needed; which makes the overhead
of a large local_buffers setting pretty small if you don't actually do
much with temp tables in a given session.  So I was thinking about
making the default value fairly robust, maybe 1000 (as compared to
the historical value of 64...).
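
To make the overhead argument concrete, here is a minimal sketch of
allocate-on-demand buffers.  The names (LazyPool, lazy_pool_page) are
hypothetical, and 8192 bytes is assumed as the block size (PostgreSQL's
default); the real descriptor layout is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_BLCKSZ 8192   /* assumed block size (the PostgreSQL default) */

typedef struct
{
    void *page;              /* NULL until this buffer is first used */
} LazyDesc;

typedef struct
{
    LazyDesc *descs;         /* allocated up front, one per buffer */
    int       ndescs;
    size_t    bytes_in_pages; /* memory actually spent on buffer pages */
} LazyPool;

/* Allocate only the descriptor array; returns 0 on success, -1 on failure. */
static int
lazy_pool_init(LazyPool *pool, int nbuffers)
{
    if (nbuffers <= 0)
        return -1;
    pool->descs = calloc((size_t) nbuffers, sizeof(LazyDesc));
    if (pool->descs == NULL)
        return -1;
    pool->ndescs = nbuffers;
    pool->bytes_in_pages = 0;
    return 0;
}

/* Return the page for buffer idx, allocating it on first use. */
static void *
lazy_pool_page(LazyPool *pool, int idx)
{
    if (idx < 0 || idx >= pool->ndescs)
        return NULL;
    if (pool->descs[idx].page == NULL)
    {
        void *page = malloc(SKETCH_BLCKSZ);

        if (page == NULL)
            return NULL;
        pool->descs[idx].page = page;
        pool->bytes_in_pages += SKETCH_BLCKSZ;
    }
    return pool->descs[idx].page;
}

/* Release every allocated page and the descriptor array. */
static void
lazy_pool_free(LazyPool *pool)
{
    for (int i = 0; i < pool->ndescs; i++)
        free(pool->descs[i].page);
    free(pool->descs);
    pool->descs = NULL;
    pool->ndescs = 0;
    pool->bytes_in_pages = 0;
}
```

Under these assumptions, a setting of 1000 costs only the descriptor array
until pages are touched, and at most 1000 * 8192 bytes (about 8 MB) of page
memory if every buffer ends up in use.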

Comments?
        regards, tom lane


Re: GUC variable for setting number of local buffers

From
"Marc G. Fournier"
Date:
On Sat, 19 Mar 2005, Tom Lane wrote:

> That means we can go ahead with providing a GUC variable to make the 
> array size user-selectable.  I was thinking of calling it either 
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers" (to 
> emphasize the fact that they're used for temporary tables). Anyone have 
> a preference, or a better alternative?

temp_buffers sounds more descriptive ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664


Re: GUC variable for setting number of local buffers

From
Mark Kirkwood
Date:
Tom Lane wrote:
> That means we can go ahead with providing a GUC variable to make the
> array size user-selectable.  I was thinking of calling it either
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
> (to emphasize the fact that they're used for temporary tables).
> Anyone have a preference, or a better alternative?
>

"temp_buffers" (or even "temporary_buffers") makes it nice and clear 
what they are intended for.

cheers

Mark



Re: GUC variable for setting number of local buffers

From
Markus Bertheau
Date:
On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:

> It's already true that the individual buffers, as opposed to the buffer
> descriptors, are allocated only as needed; which makes the overhead
> of a large local_buffers setting pretty small if you don't actually do
> much with temp tables in a given session.  So I was thinking about
> making the default value fairly robust, maybe 1000 (as compared to
> the historical value of 64...).

Why does the dba need to set that variable at all then?

--
Markus Bertheau <twanger@bluetwanger.de>

Re: GUC variable for setting number of local buffers

From
Bruce Momjian
Date:
Markus Bertheau wrote:
> On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:
> 
> > It's already true that the individual buffers, as opposed to the buffer
> > descriptors, are allocated only as needed; which makes the overhead
> > of a large local_buffers setting pretty small if you don't actually do
> > much with temp tables in a given session.  So I was thinking about
> > making the default value fairly robust, maybe 1000 (as compared to
> > the historical value of 64...).
> 
> Why does the dba need to set that variable at all then?

It is like sort_mem: the memory is local to a backend, but it is limited
so that a single backend does not exhaust all the RAM on the machine.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: GUC variable for setting number of local buffers

From
Simon Riggs
Date:
On Sat, 2005-03-19 at 12:57 -0500, Tom Lane wrote:
> That means we can go ahead with providing a GUC variable to make the
> array size user-selectable.  I was thinking of calling it either
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
> (to emphasize the fact that they're used for temporary tables).
> Anyone have a preference, or a better alternative?

> Comments?

All of that is good news...

Currently, we already have a GUC that describes the amount of memory we
can use for a backend, work_mem. Would it not be possible to continue to
use that setting and resize the temp_buffers area as needed so that
work_mem was not exceeded - and so we need not set local_temp_buffers?

It will become relatively hard to judge how to set work_mem and
local_temp_buffers for larger queries, and almost impossible to do that
in a multi-user system. To do that, we would need some additional
feedback that could be interpreted so as to judge how large to set
these. Perhaps EXPLAIN ANALYZE could report local buffer and memory
usage? It would be much better if we could decide how best to use
work_mem according to the query plan that is just about to be executed,
then set all areas accordingly. After all, not all queries would use
both limits simultaneously.

This is, of course, a nice problem to have. :-)

If we must have a GUC, local_temp_buffers works better for me.
local_buffers is my second choice because it matches the terminology
used everywhere in the code and also because temp_buffers sounds like it
is a global setting, which it would not be.

Best Regards, Simon Riggs



Re: GUC variable for setting number of local buffers

From
Tom Lane
Date:
Markus Bertheau <twanger@bluetwanger.de> writes:
>> It's already true that the individual buffers, as opposed to the buffer
>> descriptors, are allocated only as needed; which makes the overhead
>> of a large local_buffers setting pretty small if you don't actually do
>> much with temp tables in a given session.  So I was thinking about
>> making the default value fairly robust, maybe 1000 (as compared to
>> the historical value of 64...).

> Why does the dba need to set that variable at all then?

Because you do have to have a limit.  You want the thing trying to keep
all of a large temp table in core?
        regards, tom lane