Thread: GUC variable for setting number of local buffers
We've had a TODO item for some time about allowing the user to set the size of the local buffer array that's used for accessing temporary tables. The holdup has been that localbuf.c used very unscalable algorithms (like linear search) and so a large local buffer set would have terrible performance anyway. We wanted localbuf.c to duplicate the shared buffer manager's search and replacement algorithms, which looked like a lot of work.

However, the recent changes to make the shared buffer manager use a clock sweep replacement algorithm made it trivial to have localbuf.c do the same. I have just committed additional changes to make localbuf.c use a hash table instead of linear search for lookup, so it's now fully on par with the shared buffer manager as far as algorithms go.

That means we can go ahead with providing a GUC variable to make the array size user-selectable. I was thinking of calling it either "local_buffers" (in contrast to "shared_buffers") or "temp_buffers" (to emphasize the fact that they're used for temporary tables). Anyone have a preference, or a better alternative?

As far as semantics go, I was thinking of making the variable USERSET but allowing it to change only as long as you haven't accessed any temp tables in the current session. Under the hood, we'd postpone calling InitLocalBuffer() until the first use of temp tables in a session, at which time the local buffer descriptor array would be allocated, and henceforth you couldn't change the array size anymore. This would be enough flexibility to allow temp-table-intensive tasks to run with a large local setting, without having to make every session do the same. (It's conceivable that we could support on-the-fly resizing of the array, but it seems unlikely to be worth the trouble and risk of bugs.)

It's already true that the individual buffers, as opposed to the buffer descriptors, are allocated only as needed; which makes the overhead of a large local_buffers setting pretty small if you don't actually do much with temp tables in a given session. So I was thinking about making the default value fairly robust, maybe 1000 (as compared to the historical value of 64...). Comments?

regards, tom lane
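The semantics proposed above (the setting can change until temp tables are first touched, the descriptor array is built at that point, and the individual buffers are allocated only as needed) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the localbuf.c code: every name here (`LocalBufDesc`, `set_num_temp_buffers`, `init_local_buffers`, `get_local_page`, `BLOCK_SIZE`) is hypothetical, and plain `malloc`/`calloc` stand in for whatever allocator the backend would really use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 8192			/* assumed page size for this sketch */

/* Hypothetical descriptor: small, and allocated for every slot up front. */
typedef struct LocalBufDesc
{
	int			block_no;		/* -1 while the slot is unused */
	char	   *page;			/* NULL until the slot is first used */
} LocalBufDesc;

static int	num_temp_buffers = 1000;	/* the proposed setting */
static LocalBufDesc *local_descs = NULL;	/* NULL = not yet initialized */
static int	pages_allocated = 0;

/* The setting may change only while the descriptor array does not exist. */
static bool
set_num_temp_buffers(int newval)
{
	if (local_descs != NULL || newval <= 0)
		return false;
	num_temp_buffers = newval;
	return true;
}

/* Runs on first temp-table access; from here on the size is frozen. */
static void
init_local_buffers(void)
{
	if (local_descs != NULL)
		return;
	local_descs = malloc(sizeof(LocalBufDesc) * (size_t) num_temp_buffers);
	if (local_descs == NULL)
		abort();
	for (int i = 0; i < num_temp_buffers; i++)
	{
		local_descs[i].block_no = -1;
		local_descs[i].page = NULL;
	}
}

/* Return the page for a slot, allocating its memory on first use only. */
static char *
get_local_page(int slot)
{
	init_local_buffers();
	assert(slot >= 0 && slot < num_temp_buffers);
	if (local_descs[slot].page == NULL)
	{
		local_descs[slot].page = calloc(1, BLOCK_SIZE);
		if (local_descs[slot].page == NULL)
			abort();
		pages_allocated++;
	}
	return local_descs[slot].page;
}
```

In this sketch a session that never touches temp tables pays nothing, and a session that touches only a few pages pays for the descriptor array plus those pages, which is why a large default looks cheap.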
On Sat, 19 Mar 2005, Tom Lane wrote:
> That means we can go ahead with providing a GUC variable to make the
> array size user-selectable. I was thinking of calling it either
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers" (to
> emphasize the fact that they're used for temporary tables). Anyone have
> a preference, or a better alternative?

temp_buffers sounds more descriptive ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
Tom Lane wrote:
> That means we can go ahead with providing a GUC variable to make the
> array size user-selectable. I was thinking of calling it either
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
> (to emphasize the fact that they're used for temporary tables).
> Anyone have a preference, or a better alternative?

"temp_buffers" (or even "temporary_buffers") makes it nice and clear what they are intended for.

cheers

Mark
On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:
> It's already true that the individual buffers, as opposed to the buffer
> descriptors, are allocated only as needed; which makes the overhead
> of a large local_buffers setting pretty small if you don't actually do
> much with temp tables in a given session. So I was thinking about
> making the default value fairly robust, maybe 1000 (as compared to
> the historical value of 64...).

Why does the DBA need to set that variable at all then?

--
Markus Bertheau <twanger@bluetwanger.de>
Markus Bertheau wrote:
> On Sat, 19/03/2005 at 12:57 -0500, Tom Lane wrote:
>
> > It's already true that the individual buffers, as opposed to the buffer
> > descriptors, are allocated only as needed; which makes the overhead
> > of a large local_buffers setting pretty small if you don't actually do
> > much with temp tables in a given session. So I was thinking about
> > making the default value fairly robust, maybe 1000 (as compared to
> > the historical value of 64...).
>
> Why does the dba need to set that variable at all then?

It is like sort_mem: it is local memory, but it is limited so that a single backend does not exhaust all the RAM on the machine.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Sat, 2005-03-19 at 12:57 -0500, Tom Lane wrote:
> That means we can go ahead with providing a GUC variable to make the
> array size user-selectable. I was thinking of calling it either
> "local_buffers" (in contrast to "shared_buffers") or "temp_buffers"
> (to emphasize the fact that they're used for temporary tables).
> Anyone have a preference, or a better alternative?

> Comments?

All of that is good news...

Currently, we already have a GUC that describes the amount of memory we can use for a backend, work_mem. Would it not be possible to continue to use that setting and resize the temp_buffers area as needed so that work_mem was not exceeded - and so we need not set local_temp_buffers?

It will become relatively hard to judge how to set work_mem and local_temp_buffers for larger queries, and almost impossible to do that in a multi-user system. To do that, we would need some additional feedback that could be interpreted so as to judge how large to set these. Perhaps to mention local buffer and memory usage in an EXPLAIN ANALYZE?

It would be much better if we could decide how best to use work_mem according to the query plan that is just about to be executed, then set all areas accordingly. After all, not all queries would use both limits simultaneously.

This is, of course, a nice problem to have. :-)

If we must have a GUC, local_temp_buffers works better for me. local_buffers is my second choice because it matches the terminology used everywhere in the code and also because temp_buffers sounds like it is a global setting, which it would not be.

Best Regards, Simon Riggs
Markus Bertheau <twanger@bluetwanger.de> writes:
>> It's already true that the individual buffers, as opposed to the buffer
>> descriptors, are allocated only as needed; which makes the overhead
>> of a large local_buffers setting pretty small if you don't actually do
>> much with temp tables in a given session. So I was thinking about
>> making the default value fairly robust, maybe 1000 (as compared to
>> the historical value of 64...).

> Why does the dba need to set that variable at all then?

Because you do have to have a limit. You want the thing trying to keep all of a large temp table in core?

regards, tom lane
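To make the point about the limit concrete, here is a toy clock sweep over a fixed-size buffer array: a scan of a temp table much larger than the array recycles the same few slots instead of pulling the whole table into memory. It is a sketch under assumptions, not the actual buffer manager: `BufDesc`, `read_block`, `NBUFFERS`, and `MAX_USAGE` are hypothetical names, and lookup is a linear scan purely to keep the example short (the thread describes a hash table for that part).

```c
#include <assert.h>
#include <stdbool.h>

#define NBUFFERS  4				/* tiny illustrative limit */
#define MAX_USAGE 5				/* assumed cap on the usage counter */

typedef struct BufDesc
{
	int			block_no;		/* -1 while the slot is empty */
	int			usage_count;	/* decremented as the clock hand passes */
} BufDesc;

static BufDesc bufs[NBUFFERS];
static int	next_victim = 0;	/* the clock hand */
static int	evictions = 0;
static bool initialized = false;

/* Return the slot holding block_no, evicting another block if needed. */
static int
read_block(int block_no)
{
	if (!initialized)
	{
		for (int i = 0; i < NBUFFERS; i++)
		{
			bufs[i].block_no = -1;
			bufs[i].usage_count = 0;
		}
		initialized = true;
	}

	/* Lookup (linear here only for brevity). A hit bumps the usage count. */
	for (int i = 0; i < NBUFFERS; i++)
	{
		if (bufs[i].block_no == block_no)
		{
			if (bufs[i].usage_count < MAX_USAGE)
				bufs[i].usage_count++;
			return i;
		}
	}

	/* Miss: advance the hand until it finds a slot with usage_count 0. */
	for (;;)
	{
		int			slot = next_victim;

		next_victim = (next_victim + 1) % NBUFFERS;
		if (bufs[slot].usage_count == 0)
		{
			if (bufs[slot].block_no != -1)
				evictions++;	/* a resident block is being replaced */
			bufs[slot].block_no = block_no;
			bufs[slot].usage_count = 1;
			return slot;
		}
		bufs[slot].usage_count--;
	}
}
```

However many blocks are read, at most `NBUFFERS` of them are resident at once; the setting is what bounds the memory a session can tie up this way.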