Thread: Auto-tuning work_mem and maintenance_work_mem
Josh Berkus suggested here that work_mem and maintenance_work_mem could
be auto-tuned like effective_cache_size:

	http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com

The attached patch implements this, closely matching the default values
for the default shared_buffers value:

	test=> SHOW shared_buffers;
	 shared_buffers
	----------------
	 128MB
	(1 row)

	test=> SHOW work_mem;
	 work_mem
	----------
	 1310kB
	(1 row)

	test=> SHOW maintenance_work_mem;
	 maintenance_work_mem
	----------------------
	 20971kB
	(1 row)

Previous defaults were 1MB and 16MB, but the new defaults don't match
exactly because our max_connections is a power of 10 (100), not a power
of 2 (128).  Of course, if shared_buffers is 10x larger, those defaults
become 10x larger.

FYI, I based maintenance_work_mem's default on shared_buffers, not on
work_mem, because it was too complex to change maintenance_work_mem when
someone changes work_mem.

I will work on auto-tuning temp_buffers next.  Any other suggestions?
wal_buffers is already auto-tuned.

--
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com
+ Everyone has their own god. +
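For reference, the SHOW values above are consistent with simple arithmetic. The sketch below is inferred only from the displayed numbers, not taken from the attached patch, so both formulas are assumptions that happen to fit the output:

```python
# Hypothetical reconstruction of the patch's auto-tuning arithmetic,
# inferred from the SHOW output only.  The displayed values match:
#   work_mem             = shared_buffers / max_connections
#   maintenance_work_mem = shared_buffers * 16 / 100
# (the 16/100 ratio mirroring the old 1MB : 16MB default ratio).

SHARED_BUFFERS_KB = 128 * 1024   # default shared_buffers = 128MB
MAX_CONNECTIONS = 100            # default max_connections

work_mem_kb = SHARED_BUFFERS_KB // MAX_CONNECTIONS
maintenance_work_mem_kb = SHARED_BUFFERS_KB * 16 // 100

print(work_mem_kb)               # 1310, matching "1310kB"
print(maintenance_work_mem_kb)   # 20971, matching "20971kB"
```

Because 131072 is not divisible by 100, both results truncate, which is why the new defaults land just under the old 1MB and 16MB values.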
On 2013-10-09 10:30:46 -0400, Bruce Momjian wrote: > Josh Berkus suggested here that work_mem and maintenance_work_mem could > be auto-tuned like effective_cache_size: > > http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com > > The attached patch implements this, closely matching the default values > for the default shared_buffers value: There imo is no correlation between correct values for shared_buffers and work_mem at all. They really are much more workload dependant than anything. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Oct 9, 2013 at 04:32:44PM +0200, Andres Freund wrote: > On 2013-10-09 10:30:46 -0400, Bruce Momjian wrote: > > Josh Berkus suggested here that work_mem and maintenance_work_mem could > > be auto-tuned like effective_cache_size: > > > > http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com > > > > The attached patch implements this, closely matching the default values > > for the default shared_buffers value: > > There imo is no correlation between correct values for shared_buffers > and work_mem at all. They really are much more workload dependant than > anything. Well, that is true, but the more shared_buffers you allocate, the more work_mem you _probably_ want to use. This is only a change of the default. Effectively, if every session uses one full work_mem, you end up with total work_mem usage equal to shared_buffers. We can try a different algorithm to scale up work_mem, but it seems wise to auto-scale it up to some extent based on shared_buffers. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 2013-10-09 10:35:28 -0400, Bruce Momjian wrote: > On Wed, Oct 9, 2013 at 04:32:44PM +0200, Andres Freund wrote: > > On 2013-10-09 10:30:46 -0400, Bruce Momjian wrote: > > > Josh Berkus suggested here that work_mem and maintenance_work_mem could > > > be auto-tuned like effective_cache_size: > > > > > > http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com > > > > > > The attached patch implements this, closely matching the default values > > > for the default shared_buffers value: > > > > There imo is no correlation between correct values for shared_buffers > > and work_mem at all. They really are much more workload dependant than > > anything. > > Well, that is true, but the more shared_buffers you allocate, the more > work_mem you _probably_ want to use. This is only a change of the > default. Not at all. There's lots of OLTP workloads where huge shared buffers are beneficial but you definitely don't want a huge work_mem. > Effectively, if every session uses one full work_mem, you end up with > total work_mem usage equal to shared_buffers. But that's not how work_mem works. It's limiting memory, per node in the query. So a complex query can use it several dozen times. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
2013/10/9 Bruce Momjian <bruce@momjian.us>

On Wed, Oct 9, 2013 at 04:32:44PM +0200, Andres Freund wrote:
> On 2013-10-09 10:30:46 -0400, Bruce Momjian wrote:
> > Josh Berkus suggested here that work_mem and maintenance_work_mem could
> > be auto-tuned like effective_cache_size:
> >
> > http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com
> >
> > The attached patch implements this, closely matching the default values
> > for the default shared_buffers value:
>
> There imo is no correlation between correct values for shared_buffers
> and work_mem at all. They really are much more workload dependant than
> anything.

Well, that is true, but the more shared_buffers you allocate, the more
work_mem you _probably_ want to use.  This is only a change of the
default.

Effectively, if every session uses one full work_mem, you end up with
total work_mem usage equal to shared_buffers.

We can try a different algorithm to scale up work_mem, but it seems wise
to auto-scale it up to some extent based on shared_buffers.

In my experience, an optimal value of work_mem depends on data and load,
so I prefer work_mem as an independent parameter.

maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4

Regards

Pavel

--
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com
+ Everyone has their own god. +
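Pavel's suggested dependency, maintenance_work_mem ~ work_mem * 1 * max_connection / 4, can be put in numbers; the inputs below are the stock defaults, used purely for illustration:

```python
# Sketch of Pavel's proposed relation between maintenance_work_mem and
# work_mem.  The "* 1" factor is kept from his message as written; the
# input values are the stock defaults and are assumptions.

work_mem_kb = 1024       # old default work_mem = 1MB
max_connections = 100    # default max_connections

maintenance_work_mem_kb = work_mem_kb * 1 * max_connections // 4
print(maintenance_work_mem_kb)   # 25600 kB, i.e. 25MB
```

That lands at 25MB, in the same range as the old 16MB default and the patch's computed 20971kB.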
On Wed, Oct 9, 2013 at 04:38:01PM +0200, Andres Freund wrote: > On 2013-10-09 10:35:28 -0400, Bruce Momjian wrote: > > On Wed, Oct 9, 2013 at 04:32:44PM +0200, Andres Freund wrote: > > > On 2013-10-09 10:30:46 -0400, Bruce Momjian wrote: > > > > Josh Berkus suggested here that work_mem and maintenance_work_mem could > > > > be auto-tuned like effective_cache_size: > > > > > > > > http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com > > > > > > > > The attached patch implements this, closely matching the default values > > > > for the default shared_buffers value: > > > > > > There imo is no correlation between correct values for shared_buffers > > > and work_mem at all. They really are much more workload dependant than > > > anything. > > > > Well, that is true, but the more shared_buffers you allocate, the more > > work_mem you _probably_ want to use. This is only a change of the > > default. > > Not at all. There's lots of OLTP workloads where huge shared buffers are > beneficial but you definitely don't want a huge work_mem. > > > Effectively, if every session uses one full work_mem, you end up with > > total work_mem usage equal to shared_buffers. > > But that's not how work_mem works. It's limiting memory, per node in the > query. So a complex query can use it several dozen times. True, but again, odds are all sessions are not going to use the full work_mem allocation, so I figured assuming each session uses one full work_mem was probably an over-estimate. You are saying that auto-tuning work_mem for typical workloads is not a win? I don't understand how that can be true. You can always change the default for atypical workloads. As a data point, users often set shared_buffers to 2GB, but the default work_mem of 1MB would mean you would use perhaps 100MB for all sorting --- that seems kind of small as a default. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote: > Effectively, if every session uses one full work_mem, you end up with > total work_mem usage equal to shared_buffers. > > We can try a different algorithm to scale up work_mem, but it seems wise > to auto-scale it up to some extent based on shared_buffers. > > > In my experience a optimal value of work_mem depends on data and load, so I > prefer a work_mem as independent parameter. But it still is an independent parameter. I am just changing the default. > maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4 That is kind of hard to do because we would have to figure out if the old maintenance_work_mem was set from a default computation or by the user. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 10:45:52AM -0400, Bruce Momjian wrote: > On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote: > > Effectively, if every session uses one full work_mem, you end up with > > total work_mem usage equal to shared_buffers. > > > > We can try a different algorithm to scale up work_mem, but it seems wise > > to auto-scale it up to some extent based on shared_buffers. > > > > > > In my experience a optimal value of work_mem depends on data and load, so I > > prefer a work_mem as independent parameter. > > But it still is an independent parameter. I am just changing the default. > > > maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4 > > That is kind of hard to do because we would have to figure out if the > old maintenance_work_mem was set from a default computation or by the > user. FYI, this auto-tuning is not for us, who understand the parameters and how they interact, but for the 90% of our users who would benefit from better defaults. It is true that there might now be cases where you would need to _reduce_ work_mem from its default, but I think the new computed default will be better for most users. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
2013/10/9 Bruce Momjian <bruce@momjian.us>
On Wed, Oct 9, 2013 at 10:45:52AM -0400, Bruce Momjian wrote:
> On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote:
> > Effectively, if every session uses one full work_mem, you end up with
> > total work_mem usage equal to shared_buffers.
> >
> > We can try a different algorithm to scale up work_mem, but it seems wise
> > to auto-scale it up to some extent based on shared_buffers.
> >
> >
> > In my experience a optimal value of work_mem depends on data and load, so I
> > prefer a work_mem as independent parameter.
>
> But it still is an independent parameter. I am just changing the default.
>
> > maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4
>
> That is kind of hard to do because we would have to figure out if the
> old maintenance_work_mem was set from a default computation or by the
> user.
FYI, this auto-tuning is not for us, who understand the parameters and
how they interact, but for the 90% of our users who would benefit from
better defaults. It is true that there might now be cases where you
would need to _reduce_ work_mem from its default, but I think the new
computed default will be better for most users.
then we should use as a base how much dedicated RAM is for PG, not shared_buffers.
Pavel
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On 10/09/2013 10:45 AM, Bruce Momjian wrote: > On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote: >> Effectively, if every session uses one full work_mem, you end up with >> total work_mem usage equal to shared_buffers. >> >> We can try a different algorithm to scale up work_mem, but it seems wise >> to auto-scale it up to some extent based on shared_buffers. >> >> >> In my experience a optimal value of work_mem depends on data and load, so I >> prefer a work_mem as independent parameter. > But it still is an independent parameter. I am just changing the default. > The danger with work_mem especially is that setting it too high can lead to crashing postgres or your system at some stage down the track, so autotuning it is kinda dangerous, much more dangerous than autotuning shared buffers. The assumption that each connection won't use lots of work_mem is also false, I think, especially in these days of connection poolers. I'm not saying don't do it, but I think we need to be quite conservative about it. A reasonable default might be (shared_buffers / (n * max_connections)) FSVO n, but I'm not sure what n should be. Instinct says something like 4, but I have no data to back that up. cheers andrew
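Andrew's rule of thumb above, (shared_buffers / (n * max_connections)), can be sketched for a few candidate n with the stock defaults; only n = 4 comes from his message, the other n values are assumptions for comparison:

```python
# Andrew's conservative rule of thumb:
#   work_mem = shared_buffers / (n * max_connections)
# Stock defaults: 128MB shared_buffers, 100 connections.

SHARED_BUFFERS_KB = 128 * 1024
MAX_CONNECTIONS = 100

results = {n: SHARED_BUFFERS_KB // (n * MAX_CONNECTIONS) for n in (2, 4, 8)}
print(results)   # {2: 655, 4: 327, 8: 163} -- work_mem in kB
```

With n = 4 this lands at about 327kB, below even today's 1MB default, which helps explain the fixed 768kB offset Bruce proposes later in the thread.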
On Wed, Oct 9, 2013 at 05:01:24PM +0200, Pavel Stehule wrote:
> FYI, this auto-tuning is not for us, who understand the parameters and
> how they interact, but for the 90% of our users who would benefit from
> better defaults.  It is true that there might now be cases where you
> would need to _reduce_ work_mem from its default, but I think the new
> computed default will be better for most users.
>
> then we should to use as base a how much dedicated RAM is for PG - not
> shared buffers.

Yes, that was Josh Berkus's suggestion, and we can switch to that,
though it requires a new GUC parameter, and then shared_buffers gets
tuned on that.

I went with shared_buffers because unlike the others, it is a fixed
allocation quantity, while the others are much more variable and harder
to set.  I figured we could keep our 25% estimate of shared_buffers and
everything else would fall in line.

--
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com
+ Everyone has their own god. +
2013/10/9 Bruce Momjian <bruce@momjian.us>

On Wed, Oct 9, 2013 at 05:01:24PM +0200, Pavel Stehule wrote:
> FYI, this auto-tuning is not for us, who understand the parameters and
> how they interact, but for the 90% of our users who would benefit from
> better defaults.  It is true that there might now be cases where you
> would need to _reduce_ work_mem from its default, but I think the new
> computed default will be better for most users.
>
> then we should to use as base a how much dedicated RAM is for PG - not shared
> buffers.

Yes, that was Josh Berkus's suggestion, and we can switch to that,
though it requires a new GUC parameter, and then shared_buffers gets
tuned on that.

I went with shared_buffers because unlike the others, it is a fixed
allocation quantity, while the others are much more variable and harder
to set.  I figured we could keep our 25% estimate of shared_buffers and
everything else would fall in line.

I understand, but your proposal changes the logic in the opposite
direction.  Maybe it is better to wait for the new GUC parameter, and
then implement this feature, so it is logical and simply understandable.
Pavel
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Wed, Oct 9, 2013 at 11:06:07AM -0400, Andrew Dunstan wrote:
> On 10/09/2013 10:45 AM, Bruce Momjian wrote:
> > On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote:
> > >  Effectively, if every session uses one full work_mem, you end up with
> > >  total work_mem usage equal to shared_buffers.
> > >
> > >  We can try a different algorithm to scale up work_mem, but it seems wise
> > >  to auto-scale it up to some extent based on shared_buffers.
> > >
> > > In my experience a optimal value of work_mem depends on data and load, so I
> > > prefer a work_mem as independent parameter.
> > But it still is an independent parameter.  I am just changing the default.
>
> The danger with work_mem especially is that setting it too high can
> lead to crashing postgres or your system at some stage down the
> track, so autotuning it is kinda dangerous, much more dangerous than
> autotuning shared buffers.

Good point.

> The assumption that each connection won't use lots of work_mem is
> also false, I think, especially in these days of connection poolers.

OK, makes sense because the sessions last longer.

> I'm not saying don't do it, but I think we need to be quite
> conservative about it.  A reasonable default might be (shared_buffers
> / (n * max_connections)) FSVO n, but I'm not sure what n should be.
> Instinct says something like 4, but I have no data to back that up.

I am fine with '4' --- worked as an effective_cache_size multiplier.  ;-)

I think we should try to hit the existing defaults, which would mean we
would use this computation:

	(shared_buffers / 4) / max_connections + 768k / BUFSZ

This would give us for a default 128MB shared buffers and 100
max_connections:

	(16384 / 4) / 100 + (768 * 1024) / 8192

which gives us 136, and that is 136 * 8192 or 1088k, close to 1MB.

For 10x shared buffers, 163840, it gives a work_mem of 4040k, rather
than the 10M I was computing in the original patch.

How is that?
-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
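Bruce's computation above can be checked mechanically in 8kB buffer pages. This is a sketch: naming the block size BLCKSZ and using floor division are assumptions, though they reproduce his 136-page and 505-page results:

```python
# Bruce's proposed formula, in 8kB buffer pages:
#   work_mem_pages = (shared_buffers / 4) / max_connections + 768kB / block size
# (his message writes the block size as "BUFSZ"; BLCKSZ = 8192 here).
BLCKSZ = 8192

def auto_work_mem_pages(shared_buffers_pages, max_connections):
    return shared_buffers_pages // 4 // max_connections + (768 * 1024) // BLCKSZ

# Default: 128MB shared_buffers = 16384 pages, 100 connections.
pages = auto_work_mem_pages(16384, 100)
print(pages, pages * BLCKSZ // 1024)       # 136 pages -> 1088 kB, close to 1MB

# 10x shared_buffers:
pages10 = auto_work_mem_pages(163840, 100)
print(pages10, pages10 * BLCKSZ // 1024)   # 505 pages -> 4040 kB
```

The constant 96-page offset (768kB) keeps the default from collapsing to a few hundred kB at the stock shared_buffers setting.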
On Wed, Oct 9, 2013 at 06:20:13PM +0200, Pavel Stehule wrote: > On Wed, Oct 9, 2013 at 05:01:24PM +0200, Pavel Stehule wrote: > > FYI, this auto-tuning is not for us, who understand the parameters > and > > how they interact, but for the 90% of our users who would benefit > from > > better defaults. It is true that there might now be cases where you > > would need to _reduce_ work_mem from its default, but I think the new > > computed default will be better for most users. > > > > > > > > then we should to use as base a how much dedicated RAM is for PG - not > shared > > buffers. > > Yes, that was Josh Berkus's suggestion, and we can switch to that, > though it requires a new GUC parameter, and then shared_buffers gets > tuned on that. > > I went with shared_buffers because unlike the others, it is a fixed > allocation quantity, while the other are much more variable and harder > to set. I figured we could keep our 25% estimate of shared_buffers and > everything else would fall in line. > > > I understand, but your proposal change a logic to opposite direction. Maybe > better is wait to new GUC parameter, and then implement this feature, so be > logical and simply understandable. OK, I can easily do that. What do others think? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
* Pavel Stehule (pavel.stehule@gmail.com) wrote: > 2013/10/9 Bruce Momjian <bruce@momjian.us> > > I went with shared_buffers because unlike the others, it is a fixed > > allocation quantity, while the other are much more variable and harder > > to set. I figured we could keep our 25% estimate of shared_buffers and > > everything else would fall in line. > > > > I understand, but your proposal change a logic to opposite direction. Maybe > better is wait to new GUC parameter, and then implement this feature, so be > logical and simply understandable. I disagree- having a better default than what we have now is going to almost certainly be a huge improvement in the vast majority of cases. How we arrive at the default isn't particularly relevant as long as we document it. Users who end up using the default don't do so because they read the docs and said "oh, yeah, the way they calculated the default makes a lot of sense", then end up using it because they never open the config file, at all. In other words, I think the set of people who would appreciate having the default calculated in a good way has no intersection with the set of people who *use* the default values, which is the audience that the default values are for. Thanks, Stephen
* Bruce Momjian (bruce@momjian.us) wrote: > I think we should try to hit the existing defaults, which would mean we > would use this computation: For my 2c, I was hoping this would improve things for our users by raising the tiny 1M default work_mem, so I don't agree that we should simply be coming up with an algorithm to hit the same numbers we already have today. > (shared_buffers / 4) / max_connections + 768k / BUFSZ > > This would give us for a default 128MB shared buffers and 100 > max_connections: > > (16384 / 4) / 100 + (768 * 1024) / 8192 > > which gives us 136, and that is 136 * 8192 or 1088k, close to 1MB. > > For 10x shared buffers, 163840, it gives a work_mem of 4040k, rather > than the 10M I was computing in the original patch. > > How is that? So this would only help if people are already going in and modifying shared_buffers, but not setting work_mem? I'd rather see better defaults for users that don't touch anything, such as 4MB or even larger. Thanks, Stephen
On 10/09/2013 09:30 AM, Stephen Frost wrote:
> > > I went with shared_buffers because unlike the others, it is a fixed
> > > allocation quantity, while the other are much more variable and harder
> > > to set.  I figured we could keep our 25% estimate of shared_buffers and
> > > everything else would fall in line.
> >
> > I understand, but your proposal change a logic to opposite direction. Maybe
> > better is wait to new GUC parameter, and then implement this feature, so be
> > logical and simply understandable.
>
> I disagree- having a better default than what we have now is going to
> almost certainly be a huge improvement in the vast majority of cases.
> How we arrive at the default isn't particularly relevant as long as we
> document it.  Users who end up using the default don't do so because
> they read the docs and said "oh, yeah, the way they calculated the
> default makes a lot of sense", then end up using it because they never
> open the config file, at all.

FWIW, I've been using the following calculations as "starting points"
for work_mem with both clients and students.  In 80-90% of cases, the
user never adjusts the thresholds again, so I'd say that passes the test
for a "good enough" setting.  The main goal is (a) not to put a default
low ceiling on work_mem for people who have lots of RAM, and (b) to
lower the limit for users who have way too many connections on a
low-RAM machine.

	# Most web applications should use the formula below, because their
	# queries often require no work_mem.
	# work_mem = ( AvRAM / max_connections ) ROUND DOWN to 2^x
	# work_mem = 4MB   # for 2GB server with 300 connections
	# Solaris: cut the above in half.

	# Formula for most BI/DW applications, or others running many complex
	# queries:
	# work_mem = ( AvRAM / ( 2 * max_connections ) ) ROUND DOWN to 2^x
	# work_mem = 128MB   # DW server with 32GB RAM and 40 connections

AvRAM is "available ram", which for purposes of this approach would be
4X shared_buffers.

So the final formula would be:

	shared_buffers * 4 / max_connections = work_mem

*however*, there are a couple of problems with autotuning the above:

1) it's strongly workload-dependant; we need to know if the user is
doing DW or OLTP.

2) few users adjust their max_connections downwards, even when it's
warranted.

3) we also need to know if the user is on a platform like Solaris or
FreeBSD which doesn't overcommit RAM allocations per-backend.

BTW, in extensive testing of DW workloads, I've never seen an individual
backend allocate more than 3X work_mem total.  So if we want a
completely generic limit, I would say:

	1MB << (shared_buffers * 2 / max_connections) << 256MB

That is: divide double shared_buffers by max_connections.  If that's
over 1MB, raise it, but not further than 256MB.

Overall, our real answer to autotuning work_mem is to have work_mem
admissions control, per Kevin's proposal a couple years ago.

maintenance_work_mem is easier, because we only really care about the
number of autovacuum daemons, which is usually 3.  So:

	8MB << (shared_buffers / autovacuum_workers) << 256MB

... would do it.  Note that I'd expect to adjust the upper limits of
these ranges each year, as larger and larger RAM becomes commonplace and
as we work out PG's issues with using large RAM blocks.

I'm not sure that temp_buffers can be autotuned at all.  We'd have to
make assumptions about how many temp tables a particular application
uses, which is going to be either "a lot" or "none at all".  However, at
a stab:

	1MB << (shared_buffers * 4 / max_connections) << 512MB

QUESTION: at one time (7.2?), we allocated work_mem purely by doubling
RAM requests, which meant that setting work_mem to any non-binary value
meant you actually got the next lowest binary value.  Is that no longer
true?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
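Josh's generic limits can be sketched as simple clamps, combined with the "ROUND DOWN to 2^x" rule from his starting-point formulas. The function names and sample inputs below are illustrative assumptions, not code from any patch:

```python
# Sketch of Josh's proposed generic limits:
#   work_mem:             1MB << (shared_buffers * 2 / max_connections) << 256MB
#   maintenance_work_mem: 8MB << (shared_buffers / autovacuum_workers)  << 256MB
# plus his "ROUND DOWN to 2^x" rule.  All sizes are in kB.

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def round_down_pow2(kb):
    # Largest power of two <= kb.
    return 1 << (kb.bit_length() - 1)

def auto_work_mem_kb(shared_buffers_kb, max_connections):
    return clamp(shared_buffers_kb * 2 // max_connections, 1024, 256 * 1024)

def auto_maintenance_kb(shared_buffers_kb, autovacuum_workers=3):
    return clamp(shared_buffers_kb // autovacuum_workers, 8 * 1024, 256 * 1024)

print(auto_work_mem_kb(128 * 1024, 100))                    # 2621 kB for stock defaults
print(round_down_pow2(auto_work_mem_kb(128 * 1024, 100)))   # 2048 kB (2MB)
print(auto_maintenance_kb(128 * 1024))                      # 43690 kB (~43MB)
```

With the stock 128MB shared_buffers and 100 connections, the work_mem clamp comes out around 2MB, roughly double the current default, while maintenance_work_mem jumps well past the old 16MB.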
On Wed, Oct 09, 2013 at 12:30:22PM -0400, Stephen Frost wrote: > * Pavel Stehule (pavel.stehule@gmail.com) wrote: > > 2013/10/9 Bruce Momjian <bruce@momjian.us> > > > I went with shared_buffers because unlike the others, it is a fixed > > > allocation quantity, while the other are much more variable and harder > > > to set. I figured we could keep our 25% estimate of shared_buffers and > > > everything else would fall in line. > > > > > > > I understand, but your proposal change a logic to opposite direction. Maybe > > better is wait to new GUC parameter, and then implement this feature, so be > > logical and simply understandable. > > I disagree- having a better default than what we have now is going to > almost certainly be a huge improvement in the vast majority of cases. > How we arrive at the default isn't particularly relevant as long as we > document it. Users who end up using the default don't do so because > they read the docs and said "oh, yeah, the way they calculated the > default makes a lot of sense", then end up using it because they never > open the config file, at all. > > In other words, I think the set of people who would appreciate having > the default calculated in a good way has no intersection with the set of > people who *use* the default values, which is the audience that the > default values are for. +1 for setting defaults which assume an at least vaguely modern piece of hardware. By and large, people are not installing PostgreSQL for the very first time on a server the newest component of which is ten years old. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Oct 9, 2013 at 12:25:49PM -0400, Bruce Momjian wrote: > > I'm not saying don't do it, but I think we need to be quite > > conservative about it. A reasonable default might be (shared_buffers > > / (n * max_connections)) FSVO n, but I'm not sure what n should be. > > Instinct says something like 4, but I have no data to back that up. > > I am fine with '4' --- worked as an effective_cache_size multipler. ;-) > I think we should try to hit the existing defaults, which would mean we > would use this computation: > > (shared_buffers / 4) / max_connections + 768k / BUFSZ > > This would give us for a default 128MB shared buffers and 100 > max_connections: > > (16384 / 4) / 100 + (768 * 1024) / 8192 > > which gives us 136, and that is 136 * 8192 or 1088k, close to 1MB. > > For 10x shared buffers, 163840, it gives a work_mem of 4040k, rather > than the 10M I was computing in the original patch. > > How is that? In summary, that would be 615MB for shared_buffers of 2GB, assuming one work_mem per session, and assuming you are running the maximum number of sessions, which you would not normally do. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
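As a cross-check of the 615MB summary above: with strict integer page arithmetic the same inputs come out near 600MB, the same ballpark (the 615MB figure apparently carried fractional pages through the multiplication):

```python
# Total work_mem footprint under the proposed formula, assuming 2GB
# shared_buffers, 100 connections, and one full work_mem per session.
BLCKSZ = 8192

shared_buffers_pages = 2 * 1024 * 1024 * 1024 // BLCKSZ   # 262144 pages
per_session_pages = shared_buffers_pages // 4 // 100 + (768 * 1024) // BLCKSZ

total_kb = per_session_pages * BLCKSZ // 1024 * 100
print(per_session_pages)   # 751 pages, about 6MB per session
print(total_kb)            # 600800 kB, roughly the 615MB quoted above
```

Either way, the point stands: even with every session maxed out, the total stays well under shared_buffers times two.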
On 10/09/2013 07:58 AM, Bruce Momjian wrote:
> > But it still is an independent parameter.  I am just changing the default.
> >
> > > maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4
> >
> > That is kind of hard to do because we would have to figure out if the
> > old maintenance_work_mem was set from a default computation or by the
> > user.
>
> FYI, this auto-tuning is not for us, who understand the parameters and
> how they interact, but for the 90% of our users who would benefit from
> better defaults.  It is true that there might now be cases where you
> would need to _reduce_ work_mem from its default, but I think the new
> computed default will be better for most users.

Just to step in here as a consultant: Bruce is right on here.
Autotuning has nothing to do with us; it has to do with Rails developers
who deploy PostgreSQL and know nothing of it except what ActiveRecord
tells them (I am not being rude here).

We could argue all day about what the best equation is for this; the key
is to pick something reasonable, not perfect.

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc

For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On Wed, Oct 9, 2013 at 12:41:53PM -0400, Stephen Frost wrote:
> * Bruce Momjian (bruce@momjian.us) wrote:
> > I think we should try to hit the existing defaults, which would mean we
> > would use this computation:
>
> For my 2c, I was hoping this would improve things for our users by
> raising the tiny 1M default work_mem, so I don't agree that we should
> simply be coming up with an algorithm to hit the same numbers we already
> have today.
>
> > 	(shared_buffers / 4) / max_connections + 768k / BUFSZ
> >
> > This would give us for a default 128MB shared buffers and 100
> > max_connections:
> >
> > 	(16384 / 4) / 100 + (768 * 1024) / 8192
> >
> > which gives us 136, and that is 136 * 8192 or 1088k, close to 1MB.
> >
> > For 10x shared buffers, 163840, it gives a work_mem of 4040k, rather
> > than the 10M I was computing in the original patch.
> >
> > How is that?
>
> So this would only help if people are already going in and modifying
> shared_buffers, but not setting work_mem?  I'd rather see better
> defaults for users that don't touch anything, such as 4MB or even
> larger.

OK, but changing the default if shared_buffers is _not_ changed is a
separate discussion.

We can have the discussion here or in another thread.  I am thinking the
right fix is to allocate larger shared_buffers, especially now that we
don't require a larger System V shared memory segment.  Basically, for
128MB of shared buffers, I figured the calculation was fine, and when we
increase the default shared_buffers, we will then get a better default,
which is why I am quoting the 2GB shared_buffers defaults in my emails.

--
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com
+ Everyone has their own god. +
On 10/09/2013 10:07 AM, Bruce Momjian wrote:
> We can have the discussion here or in another thread.  I am thinking the
> right fix is to allocate larger shared_buffers, especially now that we
> don't require a larger System V shared memory segment.  Basically, for
> 128MB of shared buffers, I figured the calculation was fine, and when we
> increase the default shared_buffers, we will then get a better default,
> which is why I am quoting the 2GB shared_buffers defaults in my emails.

Also, it's *worlds* easier to tell users:

"set shared_buffers to 1/4 of your RAM, but not more than 8GB."

than to tell them:

"set shared_buffers to X, and work_mem to Y, and maintenance_work_mem
to Z ..."

That is, if there's one and only one setting users need to change, they
are more likely to do it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Wed, Oct 9, 2013 at 10:30 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Josh Berkus suggested here that work_mem and maintenance_work_mem could
> be auto-tuned like effective_cache_size:
>
> http://www.postgresql.org/message-id/50ECCF93.3060101@agliodbs.com

I think that this is unlikely to work out well.  effective_cache_size
is a relatively unimportant parameter, and the main thing that is
important is not to set it egregiously too low.  The formula we've
committed is probably inaccurate in a large number of cases, but it
doesn't really matter, because it doesn't do that much in the first
place.

The same cannot be said for work_mem.  Setting it too low cripples
performance; setting it too high risks bringing the whole system down.
Putting an auto-tuning formula in place that depends on the values of
multiple other GUCs is just asking for trouble.  Just to give a few
examples, suppose that a user increases shared_buffers.  Magically,
work_mem also increases, and everything works great until a load spike
causes the system to start swapping, effectively dead in the water.  Or
suppose the user increases max_connections; all of their query plans
change, probably getting worse.  The value of the auto-tuning has got
to be weighed against the risk of unintended consequences and user
confusion, which IMHO is pretty high in this case.

And quite frankly I don't think I really believe the auto-tuning
formula has much chance of being right in the first place.  It's
generally true that you're going to need to increase work_mem if you
have more memory and decrease work_mem if you have more connections,
but it also depends on a lot of other things, like the complexity of
the queries being run, whether all of the connection slots are actually
routinely used, and whether you've really set shared_buffers to 25% of
your system's total memory, which many people do not, especially on
Windows.
I think we're just going to create the false impression that we know what the optimal value is when, in reality, that's far from true. I think what is really needed is not so much to auto-tune work_mem as to provide a more sensible default. Why not just change the default to 4MB and be done with it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
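Robert's swap scenario can be put in rough numbers. The sketch below is illustrative only: the assumption that a busy backend holds a few work_mem-sized sort/hash allocations at once is a hypothetical multiplier (echoing Josh's "3x work_mem per session" observation later in the thread), not anything taken from the patch.

```python
def worst_case_mb(shared_buffers_mb, max_connections, work_mem_mb,
                  allocations_per_backend=3):
    """Pessimistic memory footprint if every backend runs a query using
    several work_mem-sized sort/hash allocations simultaneously."""
    return shared_buffers_mb + max_connections * allocations_per_backend * work_mem_mb

# Under the proposed auto-tuning, raising shared_buffers also raises
# work_mem, so the worst case grows faster than shared_buffers alone:
print(worst_case_mb(128, 100, 1))    # 428 (MB)
print(worst_case_mb(2048, 100, 6))   # 3848 (MB)
```

The point of the arithmetic is that a 16x increase in shared_buffers can drag the non-shared worst case up by a similar factor, which is exactly the load-spike hazard described above.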
On Wed, Oct 9, 2013 at 01:34:21PM -0400, Robert Haas wrote: > And quite frankly I don't think I really believe the auto-tuning > formula has much chance of being right in the first place. It's > generally true that you're going to need to increase work_mem if you > have more memory and decrease work_mem if you have more > connections, but it also depends on a lot of other things, like the > complexity of the queries being run, whether all of the connection > slots are actually routinely used, and whether you've really set > shared_buffers to 25% of your system's total memory, which many people > do not, especially on Windows. I think we're just going to create the > false impression that we know what the optimal value is when, in > reality, that's far from true. I disagree. There is nothing preventing users from setting their own values, but I think auto-tuning will make people who don't change values more likely to end up closer to optimal values. We can't auto-tune to a perfect value, but we can auto-tune closer to a perfect value than a fixed default. Yes, auto-tuned values are going to be worse for some users, but I believe they will be better for most users. Having really bad defaults so everyone knows they are bad isn't user-friendly, because the only people who know they are really bad are the people who are tuning them already. Again, we need to think of the typical user, not us. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 1:44 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Oct 9, 2013 at 01:34:21PM -0400, Robert Haas wrote: >> And quite frankly I don't think I really believe the auto-tuning >> formula has much chance of being right in the first place. It's >> generally true that you're going to need to increase work_mem if you >> have more memory and decrease work_mem if you have more >> connections, but it also depends on a lot of other things, like the >> complexity of the queries being run, whether all of the connection >> slots are actually routinely used, and whether you've really set >> shared_buffers to 25% of your system's total memory, which many people >> do not, especially on Windows. I think we're just going to create the >> false impression that we know what the optimal value is when, in >> reality, that's far from true. > > I disagree. There is nothing preventing users from setting their own > values, but I think auto-tuning will make people who don't change > values more likely to end up closer to optimal values. We can't > auto-tune to a perfect value, but we can auto-tune closer to a perfect > value than a fixed default. Yes, auto-tuned values are going to be > worse for some users, but I believe they will be better for most users. > > Having really bad defaults so everyone knows they are bad isn't > user-friendly, because the only people who know they are really bad are > the people who are tuning them already. Again, we need to think of the > typical user, not us. I think a typical user will be happier if we simply raise the default rather than stick in an auto-tuning formula that's largely wishful thinking. You're welcome to disagree, but you neither quoted nor responded to my points about the sorts of scenarios in which that might cause surprising and hard-to-debug results. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 9, 2013 at 01:49:23PM -0400, Robert Haas wrote: > > Having really bad defaults so everyone knows they are bad really isn't > > user-friendly because the only people who know they are really bad are > > the people who are tuning them already. Again, we need to think of the > > typical user, not us. > > I think a typical user will be happier if we simply raise the default > rather than stick in an auto-tuning formula that's largely wishful > thinking. You're welcome to disagree, but you neither quoted nor > responded to my points about the sorts of scenarios in which that > might cause surprising and hard-to-debug results. Well, pointing out that it will be negative for some users (which I agree with) doesn't refute that it will be better for most users. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 2:28 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Oct 9, 2013 at 01:49:23PM -0400, Robert Haas wrote: >> > Having really bad defaults so everyone knows they are bad really isn't >> > user-friendly because the only people who know they are really bad are >> > the people who are tuning them already. Again, we need to think of the >> > typical user, not us. >> >> I think a typical user will be happier if we simply raise the default >> rather than stick in an auto-tuning formula that's largely wishful >> thinking. You're welcome to disagree, but you neither quoted nor >> responded to my points about the sorts of scenarios in which that >> might cause surprising and hard-to-debug results. > > Well, pointing out that it will be negative for some users (which I > agree with) doesn't refute that it will be better for most users. That is, of course, true. But I don't think you've made any argument that the pros exceed the cons, or that the formula will in general be accurate. It's massively simpler than what Josh says he uses, for example, and he's not making the completely silly assumption that available RAM is 4 * shared_buffers. An auto-tuning formula that's completely inaccurate probably won't be better for most users. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2013-10-09 10:02:12 -0700, Joshua D. Drake wrote: > > On 10/09/2013 07:58 AM, Bruce Momjian wrote: > > >>But it still is an independent parameter. I am just changing the default. > >> > >>>maintenance_work_mem can depend on work_mem ~ work_mem * 1 * max_connection / 4 > >> > >>That is kind of hard to do because we would have to figure out if the > >>old maintenance_work_mem was set from a default computation or by the > >>user. > > > >FYI, this auto-tuning is not for us, who understand the parameters and > >how they interact, but for the 90% of our users who would benefit from > >better defaults. It is true that there might now be cases where you > >would need to _reduce_ work_mem from its default, but I think the new > >computed default will be better for most users. > > > > Just to step in here as a consultant. Bruce is right on here. Autotuning has > nothing to do with us, it has to do with Rails developers who deploy > PostgreSQL and know nothing of it except what ActiveRecord tells them (I am > not being rude here). But rails environments aren't exactly a good case for this. They often have a high number of connections, frequently even pooled. They also mostly don't have that many analytics queries where a high work_mem benefits them much. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Oct 9, 2013 at 02:34:19PM -0400, Robert Haas wrote: > On Wed, Oct 9, 2013 at 2:28 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Wed, Oct 9, 2013 at 01:49:23PM -0400, Robert Haas wrote: > >> > Having really bad defaults so everyone knows they are bad really isn't > >> > user-friendly because the only people who know they are really bad are > >> > the people who are tuning them already. Again, we need to think of the > >> > typical user, not us. > >> > >> I think a typical user will be happier if we simply raise the default > >> rather than stick in an auto-tuning formula that's largely wishful > >> thinking. You're welcome to disagree, but you neither quoted nor > >> responded to my points about the sorts of scenarios in which that > >> might cause surprising and hard-to-debug results. > > > > Well, pointing out that it will be negative for some users (which I > > agree with) doesn't refute that it will be better for most users. > > That is, of course, true. But I don't think you've made any argument > that the pros exceed the cons, or that the formula will in general be > accurate. It's massively simpler than what Josh says he uses, for > example, and he's not making the completely silly assumption that > available RAM is 4 * shared_buffers. An auto-tuning formula that's > completely inaccurate probably won't be better for most users. I disagree. I think we can get a formula that is certainly better than a fixed value. I think the examples I have shown do have better value than a default fixed value. I am open to whatever formula people think is best, but I can't see how a fixed value is a win in general. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 7:30 AM, Bruce Momjian <bruce@momjian.us> wrote: > Josh Berkus suggested here that work_mem and maintenance_work_mem could > be auto-tuned like effective_cache_size: Have you ever thought that the role of maintenance_work_mem was a bit muddled? It recently occurred to me that it might be a good idea to have a separate setting that is used to bound the amount of memory used by autovacuum (and possibly VACUUM generally) in preference to the more generic maintenance_work_mem setting. In the docs, maintenance_work_mem has this caveat: "Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high." If Heroku could increase maintenance_work_mem without having it affect the amount of memory used by autovacuum workers, I'm fairly confident that our setting would be higher. Sure, you can just increase it as you need to, but you have to know about it in the first place, which is asking too much of many people tasked with semi-routine maintenance tasks like creating indexes. -- Peter Geoghegan
On 10/09/2013 01:37 PM, Peter Geoghegan wrote: > If Heroku could increase maintenance_work_mem without having it affect > the amount of memory used by autovacuum workers, I'm fairly confident > that our setting would be higher. Sure, you can just increase it as > you need to, but you have to know about it in the first place, which > is asking too much of many people tasked with semi-routine maintenance > tasks like creating indexes. Personally, I never got why we used maint_work_mem instead of work_mem for bulk-loading indexes. What was the reason there? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Oct 9, 2013 at 09:52:03AM -0700, Josh Berkus wrote: > On 10/09/2013 09:30 AM, Stephen Frost wrote: > >>> I went with shared_buffers because unlike the others, it is a fixed > >>> > > allocation quantity, while the others are much more variable and harder > >>> > > to set. I figured we could keep our 25% estimate of shared_buffers and > >>> > > everything else would fall in line. > >>> > > > >> > > >> > I understand, but your proposal changes the logic in the opposite direction. Maybe > >> > it is better to wait for the new GUC parameter, and then implement this feature, so it is > >> > logical and simply understandable. > > I disagree- having a better default than what we have now is going to > > almost certainly be a huge improvement in the vast majority of cases. > > How we arrive at the default isn't particularly relevant as long as we > > document it. Users who end up using the default don't do so because > > they read the docs and said "oh, yeah, the way they calculated the > > default makes a lot of sense", then end up using it because they never > > open the config file, at all. > > FWIW, I've been using the following calculations as "starting points" > for work_mem with both clients and students. In 80-90% of cases, the > user never adjusts the thresholds again, so I'd say that passes the test > for a "good enough" setting. OK, I have developed the attached patch based on feedback. I took into account Andres's concern that pooling might cause use of more work_mem than you would expect in a typical session, and Robert's legitimate concern about a destabilizing default for work_mem. I therefore went with the shared_buffers/4 idea. Josh had some interesting calculations for work_mem, but I didn't think the max value would work well as it would confuse users and not be properly maintained by us as hardware grew. I also think changing those defaults between major releases would be perhaps destabilizing.
Josh's observation that he rarely sees more than 3x work_mem in a session helps put an upper limit on memory usage. I did like Josh's idea about using autovacuum_max_workers for maintenance_work_mem, though I used the shared_buffers/4 calculation. Here are the defaults for two configurations; first, for the 128MB default shared_buffers: test=> SHOW shared_buffers; shared_buffers ---------------- 128MB (1 row) test=> SHOW work_mem; work_mem ---------- 1095kB (1 row) test=> SHOW maintenance_work_mem; maintenance_work_mem ---------------------- 10922kB (1 row) and for shared_buffers of 2GB: test=> show shared_buffers; shared_buffers ---------------- 2GB (1 row) test=> SHOW work_mem; work_mem ---------- 6010kB (1 row) test=> SHOW maintenance_work_mem ; maintenance_work_mem ---------------------- 174762kB (1 row) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
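The patch itself is attached rather than quoted, but the maintenance_work_mem values above are consistent with shared_buffers/4 divided among the three default autovacuum workers. A sketch (the formula is inferred from the quoted numbers, not taken from the patch text):

```python
def maintenance_work_mem_kb(shared_buffers_kb, autovacuum_max_workers=3):
    # Inferred reading of the patch: shared_buffers/4, split across
    # the autovacuum workers so concurrent workers stay bounded.
    return shared_buffers_kb // (4 * autovacuum_max_workers)

print(maintenance_work_mem_kb(128 * 1024))       # 10922, the 128MB case above
print(maintenance_work_mem_kb(2 * 1024 * 1024))  # 174762, the 2GB case above
```

Both outputs match the psql output quoted in the email exactly, which is the basis for the inference.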
On Wed, Oct 9, 2013 at 02:11:47PM -0700, Josh Berkus wrote: > On 10/09/2013 01:37 PM, Peter Geoghegan wrote: > > If Heroku could increase maintenace_work_mem without having it affect > > the amount of memory used by autovacuum workers, I'm fairly confident > > that our setting would be higher. Sure, you can just increase it as > > you need to, but you have to know about it in the first place, which > > is asking too much of many people tasked with semi-routine maintenance > > tasks like creating indexes. > > Personally, I never got why we used maint_work_mem instead of work_mem > for bulk-loading indexes. What was the reason there? Because 'maintenance' operations were rarer, so we figured we could use more memory in those cases. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 2:15 PM, Bruce Momjian <bruce@momjian.us> wrote: > I did like Josh's idea about using autovacuum_max_workers for > maintenance_work_mem, though I used the shared_buffers/4 calculation. I don't like that idea myself, because I wouldn't like to link maintenance_work_mem to autovacuum_max_workers. As you yourself said, maintenance_work_mem exists because maintenance operations are generally much less common than operations that service queries. Couldn't you make the case that autovacuum also services queries? Certainly, autovacuum can be expected to use multiple large allocations of memory, once per worker. From the DBA's perspective, this could be happening at any time, to any extent, much like with work_mem allocations (though we at least bound these special maintenance_work_mem allocations to at most autovacuum_max_workers). So I really think the case is strong for a vacuum_work_mem (with a default of -1 that means 'use maintenance_work_mem'). Even for someone who is very experienced, it may not occur to them to increase maintenance_work_mem when they go to create an index or something. I know that I deal with servers that have hugely variable main memory sizes, and it might not even be immediately obvious what to increase maintenance_work_mem to. Even though the issues may be understood well in a minority of cases, the DBA still has to have the presence of mind to specially increase maintenance_work_mem in a psql session, prior to creating the index. I really mean "presence of mind", because creating an index can be an emergency operation - I had to deal with a customer issue where creating an index relieved a sudden serious production performance issue just last week. > Here are the defaults for two configurations; first, for the 128MB > default shared_buffers: I am certainly supportive of the idea of improving our defaults here. The bar is so incredibly low that anything is likely to be a big improvement.
What you've suggested here looks not unreasonable to me. Have you thought about clamping the value too? I'm thinking of very small shared_buffers sizings. After all, 128MB isn't the default in the same way 1MB is presently the default work_mem setting. It is certainly true that shared_buffers size is a poor proxy for an appropriate work_mem size, but does that really matter? -- Peter Geoghegan
On Wed, Oct 9, 2013 at 03:04:24PM -0700, Peter Geoghegan wrote: > On Wed, Oct 9, 2013 at 2:15 PM, Bruce Momjian <bruce@momjian.us> wrote: > > I did like Josh's idea about using autovacuum_max_workers for > > maintenance_work_mem, though I used the shared_buffers/4 calculation. > > I don't like that idea myself, because I wouldn't like to link > maintenance_work_mem to autovacuum_max_workers. > > As you yourself said, maintenance_work_mem exists because maintenance > operations are generally much less common than operations that service > queries. > > Couldn't you make the case that autovacuum also services queries? > Certainly, autovacuum can be expected to use multiple large > allocations of memory, once per worker. From the DBA's perspective, > this could be happening at any time, to any extent, much like with > work_mem allocations (though we at least bound these special > maintenance_work_mem allocations to at most autovacuum_max_workers). > So I really think the case is strong for a vacuum_work_mem (with a > default of -1 that means 'use maintenance_work_mem'). Splitting out vacuum_work_mem from maintenance_work_mem is a separate issue. I assume they were combined because the memory used for vacuum index scans is similar to creating an index. I am not sure if having two settings makes something more likely to be set --- I would think the opposite. > > Here are the defaults for two configurations; first, for the 128MB > > default shared_buffers: > > I am certainly supportive of the idea of improving our defaults here. > The bar is so incredibly low that anything is likely to be a big > improvement. What you've suggested here looks not unreasonable to me. > Have you thought about clamping the value too? I'm thinking of very > small shared_buffers sizings. After all, 128MB isn't the default in > the same way 1MB is presently the default work_mem setting.
> > It is certainly true that shared_buffers size is a poor proxy for an > appropriate work_mem size, but does that really matter? Right, the bar is low, so almost anything is an improvement. I figure I will just keep tweaking the algorithm until no one complains. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 3:31 PM, Bruce Momjian <bruce@momjian.us> wrote: > Splitting out vacuum_work_mem from maintenance_work_mem is a separate > issue. I assume they were combined because the memory used for vacuum > index scans is similar to creating an index. Is it similar? Doesn't maintenance_work_mem just bound the size of the array of tids to kill there? So you'd expect it to be just a fraction of the amount of memory used by initial index creation. > I am not sure if having > two settings makes something more likely to be set --- I would think the > opposite. Well, if a person does not use vacuum_work_mem, then the cost to that person is low. If they do, the benefits could be immense. At the Heroku office, I've had people wonder why creating an index took what seemed like way too long. I told them to increase maintenance_work_mem, and then the index creation was almost instantaneous. Now, you can attribute some of that to the I/O of temp files on EC2's ephemeral storage, and you'd probably have a point, but that certainly isn't the whole story there. -- Peter Geoghegan
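Peter's point about the tid array can be quantified: VACUUM's dead-tuple array of this era stores one ItemPointerData (6 bytes: a 4-byte block number plus a 2-byte offset) per dead tuple, so the memory it needs is a small, predictable multiple of the dead-tuple count, quite unlike the large sort space index creation wants.

```python
ITEM_POINTER_BYTES = 6  # sizeof(ItemPointerData): 4-byte block + 2-byte offset

def tids_tracked(maintenance_work_mem_kb):
    """How many dead-tuple TIDs a single VACUUM pass can remember before
    it must pause and scan all of the table's indexes."""
    return maintenance_work_mem_kb * 1024 // ITEM_POINTER_BYTES

print(tids_tracked(16 * 1024))  # old 16MB default: ~2.8 million TIDs
```

This is why a maintenance_work_mem sized for fast index builds can be wildly oversized for vacuum, which is the asymmetry motivating a separate knob.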
On Wed, Oct 9, 2013 at 03:57:14PM -0700, Peter Geoghegan wrote: > On Wed, Oct 9, 2013 at 3:31 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Splitting out vacuum_work_mem from maintenance_work_mem is a separate > > issue. I assume they were combined because the memory used for vacuum > > index scans is similar to creating an index. > > Is it similar? Doesn't maintenance_work_mem just bound the size of the > array of tids to kill there? So you'd expect it to be just a fraction > of the amount of memory used by initial index creation. Well, the point is that it is how much memory you can expect a maintenance operation to use, not what it is being used for. You are right they are used differently, but they are both index-related. > > I am not sure if having > > two settings makes something more likely to be set --- I would think the > > opposite. > > Well, if a person does not use vacuum_work_mem, then the cost to that > person is low. If they do, the benefits could be immense. At the > Heroku office, I've had people wonder why creating an index took what > seemed like way too long. I told them to increase > maintenance_work_mem, and then the index creation was almost > instantaneous. Now, you can attribute some of that to the I/O of temp > files on EC2's ephemeral storage, and you'd probably have a point, but > that certainly isn't the whole story there. I am unclear what you are suggesting here. Are you saying you want a separate vacuum_work_mem and maintenance_work_mem so they can have different defaults? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 4:40 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Well, if a person does not use vacuum_work_mem, then the cost to that >> person is low. If they do, the benefits could be immense. At the >> Heroku office, I've had people wonder why creating an index took what >> seemed like way too long. I told them to increase >> maintenance_work_mem, and then the index creation was almost >> instantaneous. Now, you can attribute some of that to the I/O of temp >> files on EC2's ephemeral storage, and you'd probably have a point, but >> that certainly isn't the whole story there. > > I am unclear what you are suggesting here. Are you saying you want a > separate vacuum_work_mem and maintenance_work_mem so they can have > different defaults? Well, the Postgres defaults won't really change, because the default vacuum_work_mem will be -1, which will have vacuum defer to maintenance_work_mem. Under this scheme, vacuum only *prefers* to get bound working memory size from vacuum_work_mem. If you don't like vacuum_work_mem, you can just ignore it. This allows someone like me (or an author of a tool like pgtune, even) to set maintenance_work_mem appreciably higher, because I know that over-allocation will only be a problem when a less well informed human writes a utility command and waits for it to finish (that might not be true in the broadest possible case, but it's pretty close to true). That's a very important distinction to my mind. It's useful to have very large amounts of memory for index creation; it is generally much less useful to have such large allocations for vacuum, and if autovacuum ever does use a lot more memory than is generally expected (concurrent autovacuum worker activity is probably a factor here), that could be totally surprising, mysterious or otherwise inopportune. Obviously not everyone can afford to be an expert. 
It's relatively rare for a human to do a manual VACUUM from psql, but there might be some POLA issues around this if they set maintenance_work_mem high for that. I think they're resolvable and well worth it, though. Quite apart from the general scenario where there is a relatively small number of well informed people that anticipate under-sizing maintenance_work_mem during semi-routine index creation will be a problem, there is no convenient way to give tools like pg_restore a custom maintenance_work_mem value. And, even well-informed people can be forgetful! -- Peter Geoghegan
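The -1 fallback Peter describes is simple to state precisely. A sketch of the proposed semantics (PostgreSQL did later grow a GUC along these lines, autovacuum_work_mem, but the name vacuum_work_mem here is just this thread's proposal):

```python
def effective_vacuum_work_mem(maintenance_work_mem, vacuum_work_mem=-1):
    """-1, the proposed default, defers to maintenance_work_mem,
    so existing configurations behave exactly as they do today."""
    if vacuum_work_mem == -1:
        return maintenance_work_mem
    return vacuum_work_mem

print(effective_vacuum_work_mem(16384))        # 16384: unset, falls back
print(effective_vacuum_work_mem(16384, 1024))  # 1024: explicit setting wins
```

The design point is backward compatibility: a DBA who ignores the new knob gets today's behaviour, while one who sets it can safely raise maintenance_work_mem for index builds without inflating autovacuum's footprint.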
On Wed, Oct 9, 2013 at 4:10 PM, Bruce Momjian <bruce@momjian.us> wrote: > I disagree. I think we can get a formula that is certainly better than > a fixed value. I think the examples I have shown do have better value > than a default fixed value. I am open to whatever formula people think > is best, but I can't see how a fixed value is a win in general. To really do auto-tuning correctly, we need to add a GUC, or some platform-dependent code, or both, for the amount of memory on the machine, which is not and should not be assumed to have anything to do with shared_buffers, which is often set to very small values like 256MB on Windows, and even on Linux, may not be more than 2GB even on a very large machine. With that, we could set a much better value for effective_cache_size, and it would help here, too. I would like to really encourage careful reflection before we start making a lot of changes in this area. If we're going to make a change here, let's take the time to try to do something good, rather than slamming something through without real consideration. I still want to know why this is better than setting work_mem to 4MB and calling it good. I accept that the current default is too low; I do not accept that the correct value has anything to do with the size of shared_buffers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 9, 2013 at 08:55:33PM -0400, Robert Haas wrote: > On Wed, Oct 9, 2013 at 4:10 PM, Bruce Momjian <bruce@momjian.us> wrote: > > I disagree. I think we can get a formula that is certainly better than > > a fixed value. I think the examples I have shown do have better value > > than a default fixed value. I am open to whatever formula people think > > is best, but I can't see how a fixed value is a win in general. > > To really do auto-tuning correctly, we need to add a GUC, or some > platform-dependent code, or both, for the amount of memory on the > machine, which is not and should not be assumed to have anything to do > with shared_buffers, which is often set to very small values like > 256MB on Windows, and even on Linux, may not be more than 2GB even on > a very large machine. With that, we could set a much better value for > effective_cache_size, and it would help here, too. If you are setting shared_buffers low, you probably want the others low too, or can change them. > I would like to really encourage careful reflection before we start > making a lot of changes in this area. If we're going to make a change > here, let's take the time to try to do something good, rather than > slamming something through without real consideration. I still want We get into the "it isn't perfect so let's do nothing" mindset, which is where we have been for years. I want to get out of that. I am not sure how much consideration you want, but I am willing to keep improving it. No value is going to be perfect, even for users who know their workload. > to know why this is better than setting work_mem to 4MB and calling it > good. I accept that the current default is too low; I do not accept For servers that are not dedicated, a fixed value can easily be too large, and for a larger server, the value can easily be too small. Not sure how you can argue that a fixed value could be better. > that the correct value has anything to do with the size of > shared_buffers.
Well, an open item is to add an available_memory GUC and base everything on that, including shared_buffers. That would allow Windows-specific adjustments for the default. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
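The open item Bruce mentions would key everything off a single memory GUC. The names and ratios below are hypothetical illustrations of that idea, not a committed proposal; the per-parameter divisors just echo the ratios discussed elsewhere in the thread.

```python
def derive_defaults_kb(available_memory_kb, max_connections=100,
                       autovacuum_max_workers=3):
    """Sketch: derive all memory GUC defaults from one available_memory
    setting (hypothetical name), instead of chaining off shared_buffers."""
    shared_buffers = available_memory_kb // 4            # the usual 25% rule
    work_mem = shared_buffers // (4 * max_connections)   # per-backend slice
    maintenance_work_mem = shared_buffers // (4 * autovacuum_max_workers)
    return shared_buffers, work_mem, maintenance_work_mem

# 8GB of available memory (all values in kB):
print(derive_defaults_kb(8 * 1024 * 1024))  # (2097152, 5242, 174762)
```

The appeal over the shared_buffers-based patch is that available_memory is a real, platform-specific fact (allowing Windows-specific defaults), rather than an inference from another tunable.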
On Wed, Oct 9, 2013 at 9:11 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Oct 9, 2013 at 08:55:33PM -0400, Robert Haas wrote: >> On Wed, Oct 9, 2013 at 4:10 PM, Bruce Momjian <bruce@momjian.us> wrote: >> > I disagree. I think we can get a formula that is certainly better than >> > a fixed value. I think the examples I have shown do have better value >> > than a default fixed value. I am open to whatever formula people think >> > is best, but I can't see how a fixed value is a win in general. >> >> To really do auto-tuning correctly, we need to add a GUC, or some >> platform-dependent code, or both, for the amount of memory on the >> machine, which is not and should not be assumed to have anything to do >> with shared_buffers, which is often set to very small values like >> 256MB on Windows, and even on Linux, may not be more than 2GB even on >> a very large machine. With that, we could set a much better value for >> effective_cache_size, and it would help here, too. > > If you are setting shared_buffers low, you probably want the others low > too, I don't think that's true. People set shared_buffers low because when they set it high, they get write I/O storms that cripple their system at checkpoint time, or because they need to minimize double-buffering. > or can change them. That is obviously true, but it's true now, too. >> to know why this is better than setting work_mem to 4MB and calling it >> good. I accept that the current default is too low; I do not accept > > For servers that are not dedicated, a fixed value can easily be too > large, and for a larger server, the value can easily be too small. Not > sure how you can argue that a fixed value could be better. But your auto-tuned value can easily be too low or too high, too. Consider someone with a system that has 64GB of RAM. EnterpriseDB has had customers who have found that with, say, a 40GB database, it's best to set shared_buffers to 40GB so that the database remains fully cached.
Your latest formula will auto-tune work_mem to roughly 100MB. On the other hand, if the same customer has a 400GB database, which can't be fully cached no matter what, a much lower setting for shared_buffers, like maybe 8GB, is apt to perform better. Your formula will auto-tune work_mem to roughly 20MB. In other words, when there's only 24GB of memory available for everything-except-shared-buffers, your formula sets work_mem five times higher than when there's 48GB of memory available for everything-except-shared-buffers. That surely can't be right. >> that the correct value has anything to do with the size of >> shared_buffers. > > Well, an open item is to add an available_memory GUC and base everything > on that, including shared_buffers. That would allow Windows-specific > adjustments for the default. That seems considerably more principled than this patch. On a more pedestrian note, when I try this patch with shared_buffers = 8GB, the postmaster won't start. It dies with: FATAL: -20203 is outside the valid range for parameter "work_mem" (-1 .. 2147483647) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
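Robert's figures can be reproduced with a close approximation of the patch's behaviour. The divisor below (shared_buffers/4 spread across max_connections) is an approximation inferred from his round numbers; the patch's exact arithmetic differs slightly, as the 21739kB value for the fixed 8GB case later in the thread shows.

```python
def approx_work_mem_kb(shared_buffers_kb, max_connections=100):
    # Approximation of the auto-tuning formula under discussion:
    # shared_buffers/4, spread across max_connections backends.
    return shared_buffers_kb // (4 * max_connections)

print(approx_work_mem_kb(40 * 1024 * 1024))  # 104857kB, i.e. roughly 100MB
print(approx_work_mem_kb(8 * 1024 * 1024))   # 20971kB, i.e. roughly 20MB
```

This makes the inversion concrete: the 40GB-shared_buffers box (24GB left over) gets a work_mem about five times larger than the 8GB-shared_buffers box (56GB left over), the opposite of what the leftover memory would suggest.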
* Bruce Momjian (bruce@momjian.us) wrote: > For servers that are not dedicated, a fixed value can easily be too > large, and for a larger server, the value can easily be too small. Not > sure how you can argue that a fixed value could be better. There is definitely something to be said for simplicity and just up'ing the default would have a more dramatic impact with a setting like work_mem than it would with shared_buffers, imv. With work_mem, you'll actually get better plans that can be much more efficient even with larger amounts of data. With shared_buffers, you're generally just going to be saving a bit of time by avoiding the system call to pull the blocks back from the Linux FS cache. This is why I'm much more interested in an actual *change* to what our users who don't configure things will get rather than an approach that comes up with a complicated way to arrive at the same answer. > Well, an open item is to add an available_memory GUC and base everything > on that, including shared_buffers. That would allow Windows-specific > adjustments for the default. I also think that's an interesting idea, but Robert has a good point, knowing the size of the DB itself is another consideration (or perhaps the size of the "working set") and those numbers aren't static and may not be easy to figure out. Thanks, Stephen
On Wed, Oct 9, 2013 at 7:11 PM, Stephen Frost <sfrost@snowman.net> wrote: > There is definitely something to be said for simplicity and just up'ing > the default would have a more dramatic impact with a setting like > work_mem than it would with shared_buffers, imv. Simplicity for us or for our users? Yes, shared_buffers is a decidedly bad proxy for appropriate work_mem sizing. But we ought to do something. The problem with setting work_mem to 4MB is that it's still too low for the vast majority of users. And too high for a very small number. I wonder if we should just ship something like pgtune (in /bin, not in /contrib) that can be optionally used at initdb time. Making something like wal_buffers self-tuning is really compelling, but work_mem is quite different. I hear a lot of complaints about "the first 15 minutes experience" of Postgres. It's easy to scoff at this kind of thing, but I think we could do a lot better there, and at no real cost - the major blocker to doing something like that has been fixed (of course, I refer to the SysV shared memory limits). Is the person on a very small box where our current very conservative defaults are appropriate? Why not ask a few high-level questions like that to get inexperienced users started? The tool could even have a parameter that allows a packager to pass total system memory without bothering the user with that, and without bothering us with having to figure out a way to make that work correctly and portably. -- Peter Geoghegan
On Wed, Oct 9, 2013 at 09:34:16PM -0400, Robert Haas wrote: > On a more pedestrian note, when I try this patch with shared_buffers = > 8GB, the postmaster won't start. It dies with: > > FATAL: -20203 is outside the valid range for parameter "work_mem" (-1 > .. 2147483647) Fixed with the attached patch: test=> SHOW shared_buffers; shared_buffers ---------------- 8GB (1 row) test=> SHOW work_mem; work_mem ---------- 21739kB (1 row) test=> SHOW maintenance_work_mem; maintenance_work_mem ---------------------- 699050kB (1 row) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 07:33:46PM -0700, Peter Geoghegan wrote: > I hear a lot of complaints about "the first 15 minutes experience" of > Postgres. It's easy to scoff at this kind of thing, but I think we > could do a lot better there, and at no real cost - the major blocker > to doing something like that has been fixed (of course, I refer to the > SysV shared memory limits). Is the person on a very small box where > our current very conservative defaults are appropriate? Why not ask a > few high-level questions like that to get inexperienced users started? > The tool could even have a parameter that allows a packager to pass > total system memory without bothering the user with that, and without > bothering us with having to figure out a way to make that work > correctly and portably. I think the simplest solution would be to have a parameter to initdb which specifies how much memory you want to use, and set a new variable available_mem from that, and have things auto-tune based on that value in the backend. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 09:34:16PM -0400, Robert Haas wrote: > On Wed, Oct 9, 2013 at 9:11 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Wed, Oct 9, 2013 at 08:55:33PM -0400, Robert Haas wrote: > >> On Wed, Oct 9, 2013 at 4:10 PM, Bruce Momjian <bruce@momjian.us> wrote: > >> > I disagree. I think we can get a formula that is certainly better than > >> > a fixed value. I think the examples I have shown do have better value > >> > than a default fixed value. I am open to whatever formula people think > >> > is best, but I can't see how a fixed value is a win in general. > >> > >> To really do auto-tuning correctly, we need to add a GUC, or some > >> platform-dependent code, or both, for the amount of memory on the > >> machine, which is not and should not be assumed to have anything to do > >> with shared_buffers, which is often set to very small values like > >> 256MB on Windows, and even on Linux, may not be more than 2GB even on > >> a very large machine. With that, we could set a much better value for > >> effective_cache_size, and it would help here, too. > > > > If you are setting shared_buffers low, you probably want the others low > > too, > > I don't think that's true. People set shared_buffers low because when > they set it high, they get write I/O storms that cripple their system > at checkpoint time, or because they need to minimize double-buffering. If people are doing such changes, they are obviously capable of knowing their workload and setting these things to non-default values. > > or can change them. > > That is obviously true, but it's true now, too. And that comment is helpful how? > >> to know why this is better than setting work_mem to 4MB and calling it > >> good. I accept that the current default is too low; I do not accept > > > > For servers that are not dedicated, a fixed value can easily be too > > large, and for a larger server, the value can easily be too small. Not > > sure how you can argue that a fixed value could be better. 
> > But your auto-tuned value can easily be too low or too high, too. My option is better, not perfect --- I don't know how many times I can say something again and again. Fortunately there are enough people who understand that on the lists. > Consider someone with a system that has 64GB of RAM. EnterpriseDB > has had customers who have found that with, say, a 40GB database, it's > best to set shared_buffers to 40GB so that the database remains fully > cached. Your latest formula will auto-tune work_mem to roughly 100MB. > On the other hand, if the same customer has a 400GB database, which > can't be fully cached no matter what, a much lower setting for > shared_buffers, like maybe 8GB, is apt to perform better. Your > formula will auto-tune shared_buffers to roughly 20MB. You mean work_mem? > In other words, when there's only 24GB of memory available for > everything-except-shared-buffers, your formula sets work_mem five > times higher than when there's 48GB of memory available for > everything-except-shared-buffers. That surely can't be right. Again, IT ISN'T PERFECT, AND NOTHING WILL BE PERFECT, EVEN HAND TUNING. This is about improvement for a typical workload. > >> that the correct value has anything to do with the size of > >> shared_buffers. > > > > Well, an open item is to add an available_memory GUC and base everything > > on that, including shared_buffers. That would allow Windows-specific > > adjustments for the default. > > That seems considerably more principled than this patch. That was Josh Berkus's idea. I am fine writing 20x more lines of code to improve this, but I am determined this will be improved. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 8:02 PM, Bruce Momjian <bruce@momjian.us> wrote: > I think the simplest solution would be to have a parameter to initdb > which specifies how much memory you want to use, and set a new variable > available_mem from that, and have things auto-tune based on that value > in the backend. Have you tried pgtune? http://pgfoundry.org/projects/pgtune/ It's an extremely simple program - about 500 lines of python. It asks a few simple questions, and does as good a job of configuring Postgres as most installations will ever need. Importantly, it asks the user to characterize their workload as one of the following: DW, OLTP, Web, Mixed and Desktop. Why can't we just do something along those lines? I know that I constantly find myself rattling off folk wisdom about how to set the most important GUCs. I'm not alone here [1]. Isn't it about time we just automated this? [1] http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html -- Peter Geoghegan
On Wed, Oct 9, 2013 at 8:13 PM, Bruce Momjian <bruce@momjian.us> wrote: > My option is better, not perfect --- I don't know how many times I can > say something again and again. Fortunately there are enough people who > understand that on the lists. +1 from me on the sentiment: the perfect cannot be allowed to be the enemy of the good. -- Peter Geoghegan
On Wed, Oct 9, 2013 at 08:15:44PM -0700, Peter Geoghegan wrote: > On Wed, Oct 9, 2013 at 8:02 PM, Bruce Momjian <bruce@momjian.us> wrote: > > I think the simplest solution would be to have a parameter to initdb > > which specifies how much memory you want to use, and set a new variable > > available_mem from that, and have things auto-tune based on that value > > in the backend. > > Have you tried pgtune? > > http://pgfoundry.org/projects/pgtune/ > > It's an extremely simple program - about 500 lines of python. It asks > a few simple questions, and does as good a job of configuring Postgres > as most installations will ever need. Importantly, it asks the user to > characterize their workload as one of the following: DW, OLTP, Web, > Mixed and Desktop. Why can't we just do something along those lines? > > I know that I constantly find myself rattling off folk wisdom about > how to set the most important GUCs. I'm not alone here [1]. Isn't it > about time we just automated this? I am not sure that having that external to the backend really makes sense because I am concerned people will not use it. We can certainly add it to change our defaults, of course. Also consider many installs are automated. The bottom line is that our defaults need improvement. A tool would be nice in addition to that. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 9, 2013 at 8:20 PM, Bruce Momjian <bruce@momjian.us> wrote: > I am not sure that having that external to the backend really makes > sense because I am concerned people will not use it. We can certainly > add it to change our defaults, of course. Also consider many installs > are automated. Sure. I was imagining that we'd want to write the tool with the idea in mind that it was usually run immediately after initdb. We'd reach out to packagers to have them push it into the hands of users where that's practical. If you think that sounds odd, consider that on at least one popular Linux distro, installing MySQL will show a ncurses interface where the mysql password is set. We wouldn't need anything as fancy as that. -- Peter Geoghegan
On Thu, Oct 10, 2013 at 2:04 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 9, 2013 at 4:40 PM, Bruce Momjian <bruce@momjian.us> wrote: >>> Well, if a person does not use vacuum_work_mem, then the cost to that >>> person is low. If they do, the benefits could be immense. At the >>> Heroku office, I've had people wonder why creating an index took what >>> seemed like way too long. I told them to increase >>> maintenance_work_mem, and then the index creation was almost >>> instantaneous. Now, you can attribute some of that to the I/O of temp >>> files on EC2's ephemeral storage, and you'd probably have a point, but >>> that certainly isn't the whole story there. >> >> I am unclear what you are suggesting here. Are you saying you want a >> separate vacuum_work_mem and maintenance_work_mem so they can have >> different defaults? > > Well, the Postgres defaults won't really change, because the default > vacuum_work_mem will be -1, which will have vacuum defer to > maintenance_work_mem. Under this scheme, vacuum only *prefers* to get > bound working memory size from vacuum_work_mem. If you don't like > vacuum_work_mem, you can just ignore it. > > This allows someone like me (or an author of a tool like pgtune, even) > to set maintenance_work_mem appreciably higher, because I know that > over-allocation will only be a problem when a less well informed human > writes a utility command and waits for it to finish (that might not be > true in the broadest possible case, but it's pretty close to true). > That's a very important distinction to my mind. It's useful to have > very large amounts of memory for index creation; it is generally much > less useful to have such large allocations for vacuum, and if > autovacuum ever does use a lot more memory than is generally expected > (concurrent autovacuum worker activity is probably a factor here), > that could be totally surprising, mysterious or otherwise inopportune. > Obviously not everyone can afford to be an expert. 
> > It's relatively rare for a human to do a manual VACUUM from psql, but > there might be some POLA issues around this if they set > maintenance_work_mem high for that. I think they're resolvable and > well worth it, though. Quite apart from the general scenario where > there is a relatively small number of well informed people that > anticipate under-sizing maintenance_work_mem during semi-routine index > creation will be a problem, there is no convenient way to give tools > like pg_restore a custom maintenance_work_mem value. And, even > well-informed people can be forgetful! While unrelated to the main topic of this thread, I think this is very important as well. I often have to advise people to remember to cap their maintenance_work_mem because of autovacuum, and to remember to re-tune maintenance_work_mem when they change the number of autovacuum workers. I would, however, vote for an autovacuum_work_mem rather than a vacuum_work_mem, analogous to the autovacuum_vacuum_cost_* parameters that override the "foreground" parameters. (Though you can give a custom one to pg_restore, can't you - just issue a SET command in the session, it won't affect autovac or anybody else) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Thu, Oct 10, 2013 at 5:02 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Oct 9, 2013 at 07:33:46PM -0700, Peter Geoghegan wrote: >> I hear a lot of complaints about "the first 15 minutes experience" of >> Postgres. It's easy to scoff at this kind of thing, but I think we >> could do a lot better there, and at no real cost - the major blocker >> to doing something like that has been fixed (of course, I refer to the >> SysV shared memory limits). Is the person on a very small box where >> our current very conservative defaults are appropriate? Why not ask a >> few high-level questions like that to get inexperienced users started? >> The tool could even have a parameter that allows a packager to pass >> total system memory without bothering the user with that, and without >> bothering us with having to figure out a way to make that work >> correctly and portably. > > I think the simplest solution would be to have a parameter to initdb > which specifies how much memory you want to use, and set a new variable > available_mem from that, and have things auto-tune based on that value > in the backend. I think it would be even simpler, and more reliable, to start with the parameter to initdb - I like that. But instead of having it set a new variable based on that and then autotune off that, just have *initdb* do these calculations you're suggesting, and write new defaults to the files (preferably with a comment). That way if the user *later* comes in and say changes shared_buffers, we don't dynamically resize the work_mem into a value that might cause his machine to die from swapping which would definitely violate the principle of least surprise.. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
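Magnus's write-once-at-initdb idea can be sketched as follows. The 25% share and the /16 divisor are only the rules of thumb discussed in this thread, and `conf_lines` is a hypothetical helper, not anything that exists in initdb:

```python
# Sketch of the initdb-time approach Magnus describes: compute the
# defaults once from a user-supplied memory figure and emit literal
# postgresql.conf lines with a comment, so a later manual change to
# shared_buffers cannot silently resize work_mem at server start.
def conf_lines(system_mem_mb, max_connections=100):
    shared_buffers_mb = system_mem_mb // 4  # 25% rule of thumb
    work_mem_kb = (system_mem_mb - shared_buffers_mb) * 1024 // 16 // max_connections
    return [
        f"shared_buffers = {shared_buffers_mb}MB  # 25% of {system_mem_mb}MB, written by initdb",
        f"work_mem = {work_mem_kb}kB  # derived once at initdb time; safe to edit",
    ]

for line in conf_lines(8192):
    print(line)
```

Because the values are baked in as literals, editing shared_buffers afterwards changes only shared_buffers, which is exactly the least-surprise behavior Magnus is after.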
On Thu, Oct 10, 2013 at 5:35 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 9, 2013 at 8:20 PM, Bruce Momjian <bruce@momjian.us> wrote: >> I am not sure that having that external to the backend really makes >> sense because I am concerned people will not use it. We can certainly >> add it to change our defaults, of course. Also consider many installs >> are automated. > > Sure. > > I was imagining that we'd want to write the tool with the idea in mind > that it was usually run immediately after initdb. We'd reach out to > packagers to have them push it into the hands of users where that's > practical. > > If you think that sounds odd, consider that on at least one popular > Linux distro, installing MySQL will show a ncurses interface where the > mysql password is set. We wouldn't need anything as fancy as that. That's a packaging feature though, and not a MySQL feature. And of course, on other platforms, popping up something like that is explicitly *forbidden* by the packaging standards. But it shows an important distinction - we really only need to provide an *infrastructure* that can be used on all platforms. The platform specific interfaces to it can go in the packaging. (And yes, in a lot of cases "we" are also the packagers, but the point being that this code is already 100% platform specific, making the problem easier) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Oct 9, 2013 at 11:35 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 9, 2013 at 8:20 PM, Bruce Momjian <bruce@momjian.us> wrote: >> I am not sure that having that external to the backend really makes >> sense because I am concerned people will not use it. We can certainly >> add it to change our defaults, of course. Also consider many installs >> are automated. > > Sure. > > I was imagining that we'd want to write the tool with the idea in mind > that it was usually run immediately after initdb. We'd reach out to > packagers to have them push it into the hands of users where that's > practical. > > If you think that sounds odd, consider that on at least one popular > Linux distro, installing MySQL will show a ncurses interface where the > mysql password is set. We wouldn't need anything as fancy as that. I actually had the thought that it might be something we'd integrate *into* initdb. So you'd do initdb --system-memory 8GB or something like that and it would do the rest. That'd be slick, at least IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 10, 2013 at 1:23 AM, Magnus Hagander <magnus@hagander.net> wrote: > I think it would be even simpler, and more reliable, to start with the > parameter to initdb - I like that. But instead of having it set a new > variable based on that and then autotune off that, just have *initdb* > do these calculations you're suggesting, and write new defaults to the > files (preferably with a comment). > > That way if the user *later* comes in and say changes shared_buffers, > we don't dynamically resize the work_mem into a value that might cause > his machine to die from swapping which would definitely violate the > principle of least surprise.. +1 for all of that. I completely agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Peter Geoghegan (pg@heroku.com) wrote: > On Wed, Oct 9, 2013 at 7:11 PM, Stephen Frost <sfrost@snowman.net> wrote: > > There is definitely something to be said for simplicity and just up'ing > > the default would have a more dramatic impact with a setting like > > work_mem than it would with shared_buffers, imv. > > Simplicity for us or for our users? My thinking was 'both', really. > I wonder if we should just ship something like pgtune (in /bin, not in > /contrib) that can be optionally used at initdb time. Making something > like wal_buffers self-tuning is really compelling, but work_mem is > quite different. I'm coming around to agree with this also- doing this at initdb time really makes more sense than during server start-up based on some (mostly) unrelated value. > I hear a lot of complaints about "the first 15 minutes experience" of > Postgres. It's easy to scoff at this kind of thing, but I think we > could do a lot better there, and at no real cost - the major blocker > to doing something like that has been fixed (of course, I refer to the > SysV shared memory limits). Is the person on a very small box where > our current very conservative defaults are appropriate? Why not ask a > few high-level questions like that to get inexperienced users started? There are certainly challenges here wrt asking questions during install, as was mentioned elsewhere, but I agree that we could do better. > The tool could even have a parameter that allows a packager to pass > total system memory without bothering the user with that, and without > bothering us with having to figure out a way to make that work > correctly and portably. Agreed. Thanks, Stephen
* Magnus Hagander (magnus@hagander.net) wrote: > I think it would be even simpler, and more reliable, to start with the > parameter to initdb - I like that. But instead of having it set a new > variable based on that and then autotune off that, just have *initdb* > do these calculations you're suggesting, and write new defaults to the > files (preferably with a comment). Agreed; I especially like having the comment included. > That way if the user *later* comes in and say changes shared_buffers, > we don't dynamically resize the work_mem into a value that might cause > his machine to die from swapping which would definitely violate the > principle of least surprise.. +1 Thanks, Stephen
From: "Bruce Momjian" <bruce@momjian.us> > I will work on auto-tuning temp_buffers next. Any other suggestions? > wal_buffers is already auto-tuned. Great work. I'm looking forward to becoming able to fully utilize system resources right after initdb. Although this is not directly related to memory, could you set max_prepared_transactions = max_connections at initdb time? People must feel frustrated when they can't run applications on a Java or .NET application server and notice that they have to set max_prepared_transactions and restart PostgreSQL. This is far from friendly. Regards MauMau
Robert Haas <robertmhaas@gmail.com> wrote: > I actually had the thought that it might be something we'd integrate > *into* initdb. So you'd do initdb --system-memory 8GB or something > like that and it would do the rest. That'd be slick, at least IMHO. How would you handle the case that the machine (whether physical or a VM) later gets more RAM? That's certainly not unheard of with physical servers, and with VMs I'm not sure that the database server would necessarily go through a stop/start cycle for it. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
From: "Robert Haas" <robertmhaas@gmail.com> > On Thu, Oct 10, 2013 at 1:23 AM, Magnus Hagander <magnus@hagander.net> > wrote: >> I think it would be even simpler, and more reliable, to start with the >> parameter to initdb - I like that. But instead of having it set a new >> variable based on that and then autotune off that, just have *initdb* >> do these calculations you're suggesting, and write new defaults to the >> files (preferably with a comment). >> >> That way if the user *later* comes in and say changes shared_buffers, >> we don't dynamically resize the work_mem into a value that might cause >> his machine to die from swapping which would definitely violate the >> principle of least surprise.. > > +1 for all of that. I completely agree. I vote for this idea completely, too. It's nice to be able to specify usable RAM with something like "initdb --system-memory 8GB", because it provides flexibility for memory allocation --- use the whole machine for one PostgreSQL instance, or run multiple instances on one machine with 50% of RAM for instance-A and 25% of RAM for instance B and C, etc. But what is the default value of --system-memory? I would like it to be the whole RAM. I hope something like pgtune will be incorporated into the core, absorbing the ideas in: - pgtune - https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server - the book "PostgreSQL 9.0 High Performance" by Greg Smith Then initdb calls the tool. Of course, DBAs can use the tool later. Like pgtune, the tool would be nice if it and initdb can accept "--system-type" or "--workload" with arguments {OLTP | DW | mixed}. Regards MauMau
On Thu, Oct 10, 2013 at 11:01:52PM +0900, MauMau wrote: > From: "Bruce Momjian" <bruce@momjian.us> > >I will work on auto-tuning temp_buffers next. Any other suggestions? > >wal_buffers is already auto-tuned. > > Great work. I'm looking forward to becoming able to fully utilize > system resources right after initdb. > > Although this is not directly related to memory, could you set > max_prepared_transactions = max_connections at initdb time? People > must feel frustrated when they can't run applications on a Java or > .NET application server and notice that they have to set > max_prepared_transactions and restart PostgreSQL. This is far from > friendly. I think the problem is that many users don't need prepared transactions and therefore don't want the overhead. Is that still accurate? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
"MauMau" <maumau307@gmail.com> writes: > Although this is not directly related to memory, could you set > max_prepared_transactions = max_connections at initdb time? People must You really need to have a transaction manager around when issuing prepared transaction as failing to commit/rollback them will prevent VACUUM and quickly lead you to an interesting situation. It used to have a default of 5 and is now properly defaulting to 0. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Oct 10, 2013 at 07:24:26AM -0700, Kevin Grittner wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > > > I actually had the thought that it might be something we'd integrate > > *into* initdb. So you'd do initdb --system-memory 8GB or something > > like that and it would do the rest. That'd be slick, at least IMHO. > > How would you handle the case that the machine (whether physical or > a VM) later gets more RAM? That's certainly not unheard of with > physical servers, and with VMs I'm not sure that the database > server would necessarily go through a stop/start cycle for it. Yes, going from a non-dedicated to a dedicated database server, adding RAM, or moving the cluster to another server could all require an initdb to change auto-tuned values. This is why I think we will need to auto-tune in the backend, rather than via initdb. I do think an available_mem parameter for initdb would help though, to be set in postgresql.conf. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
* Bruce Momjian (bruce@momjian.us) wrote: > On Thu, Oct 10, 2013 at 07:24:26AM -0700, Kevin Grittner wrote: > > Robert Haas <robertmhaas@gmail.com> wrote: > > > I actually had the thought that it might be something we'd integrate > > > *into* initdb. So you'd do initdb --system-memory 8GB or something > > > like that and it would do the rest. That'd be slick, at least IMHO. > > > > How would you handle the case that the machine (whether physical or > > a VM) later gets more RAM? That's certainly not unheard of with > > physical servers, and with VMs I'm not sure that the database > > server would necessarily go through a stop/start cycle for it. > > Yes, going from a non-dedicated to a dedicated database server, adding > RAM, or moving the cluster to another server could all require an initdb > to change auto-tuned values. This is why I think we will need to > auto-tune in the backend, rather than via initdb. I do think an > available_mem parameter for initdb would help though, to be set in > postgresql.conf. For this case, I think the suggestion made by MauMau would be better- tell the user (in the postgresql.conf comments) a command they can run with different memory settings to see what the auto-tuning would do. Perhaps even have a way to enable use of those new variables, but I don't really care for the idea of making a GUC that isn't anything except a control for defaults of *other* GUCs. Thanks, Stephen
On Wed, Oct 9, 2013 at 09:34:16PM -0400, Robert Haas wrote: > But your auto-tuned value can easily be too low or too high, too. > Consider someone with a system that has 64GB of RAM. EnterpriseDB > has had customers who have found that with, say, a 40GB database, it's > best to set shared_buffers to 40GB so that the database remains fully > cached. Your latest formula will auto-tune work_mem to roughly 100MB. > On the other hand, if the same customer has a 400GB database, which > can't be fully cached no matter what, a much lower setting for > shared_buffers, like maybe 8GB, is apt to perform better. Your > formula will auto-tune shared_buffers to roughly 20MB. > > In other words, when there's only 24GB of memory available for > everything-except-shared-buffers, your formula sets work_mem five > times higher than when there's 48GB of memory available for > everything-except-shared-buffers. That surely can't be right. Let me walk through the idea of adding an available_mem setting, that Josh suggested, and which I think addresses Robert's concern about larger shared_buffers and Windows servers. The idea is that initdb would allow you to specify an available_mem parameter, which would set a corresponding value in postgresql.conf. This could be later changed by the user. (See my other email about why we shouldn't do the tuning in initdb.) shared_buffers would auto-tune to 25% of that, except on Windows, and perhaps capped at 8GB. Here is another case where not tuning directly on shared_buffers is a win. All other calculations would be based on available_mem - shared_buffers, so if shared_buffers is manually or auto-tuned high or low, other tuning would still be accurate. work_mem would tune to (available_mem - shared_buffers) / 16 / max_connections, so even if you used all max_connections, and 3x of work_mem in each, you would still only match the size of shared_buffers. maintenance_work_mem would key on autovacuum_max_workers. 
effective_cache_size would be available_mem minus all of the values above. Now, how to handle changes? available_mem could only be changed by a server restart, because shared_buffers is based on it, and the rest of the parameters are based on available_mem - shared_buffers. Though users can change work_mem in postgresql.conf and per-session, auto-tuning would not be affected by these changes. Calculating only with available_mem - shared_buffers would give stability and predictability to the auto-tuning system. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
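The scheme above can be sketched numerically. The 25% share, the 8GB cap, and the /16/max_connections divisor come from the proposal itself; the function name and the treatment of effective_cache_size as simply the remainder are assumptions for illustration:

```python
# Sketch of Bruce's available_mem proposal: every derived setting keys off
# available_mem - shared_buffers, so a manual override of shared_buffers
# still leaves the rest of the tuning coherent. maintenance_work_mem is
# omitted because the mail only says it would key on
# autovacuum_max_workers, without giving a divisor.
GB = 1024 * 1024  # all sizes in kB

def autotune(available_mem_kb, max_connections=100):
    shared_buffers = min(available_mem_kb // 4, 8 * GB)  # 25%, capped at 8GB
    rest = available_mem_kb - shared_buffers
    return {
        "shared_buffers": shared_buffers,
        "work_mem": rest // 16 // max_connections,
        "effective_cache_size": rest,  # assumed: the remainder
    }

print(autotune(64 * GB))
```

On a 64GB available_mem, shared_buffers hits the 8GB cap and work_mem lands around 36MB, and notably the result no longer depends on how large shared_buffers could have been, which is the stability Bruce is arguing for.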
On Thu, Oct 10, 2013 at 11:18:46AM -0400, Stephen Frost wrote: > * Bruce Momjian (bruce@momjian.us) wrote: > > On Thu, Oct 10, 2013 at 07:24:26AM -0700, Kevin Grittner wrote: > > > Robert Haas <robertmhaas@gmail.com> wrote: > > > > I actually had the thought that it might be something we'd integrate > > > > *into* initdb. So you'd do initdb --system-memory 8GB or something > > > > like that and it would do the rest. That'd be slick, at least IMHO. > > > > > > How would you handle the case that the machine (whether physical or > > > a VM) later gets more RAM? That's certainly not unheard of with > > > physical servers, and with VMs I'm not sure that the database > > > server would necessarily go through a stop/start cycle for it. > > > > Yes, going from a non-dedicated to a dedicated database server, adding > > RAM, or moving the cluster to another server could all require an initdb > > to change auto-tuned values. This is why I think we will need to > > auto-tune in the backend, rather than via initdb. I do think an > > available_mem parameter for initdb would help though, to be set in > > postgresql.conf. > > For this case, I think the suggestion made by MauMau would be better- > tell the user (in the postgresql.conf comments) a command they can run > with different memory settings to see what the auto-tuning would do. > Perhaps even have a way to enable use of those new variables, but I > don't really care for the idea of making a GUC that isn't anything > except a control for defaults of *other* GUCs. Well, you then have two places you are doing the tuning --- one in initdb, and another in the tool, and you can have cases where they are not consistent. You could have a mode where initdb re-writes postgresql.conf, but that has all sorts of oddities about changing a config file. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
* Bruce Momjian (bruce@momjian.us) wrote: > On Thu, Oct 10, 2013 at 11:18:46AM -0400, Stephen Frost wrote: > > For this case, I think the suggestion made by MauMau would be better- > > tell the user (in the postgresql.conf comments) a command they can run > > with different memory settings to see what the auto-tuning would do. > > Perhaps even have a way to enable use of those new variables, but I > > don't really care for the idea of making a GUC that isn't anything > > except a control for defaults of *other* GUCs. > > Well, you then have two places you are doing the tuning --- one in > initdb, and another in the tool, and you can have cases where they are > not consistent. You could have a mode where initdb re-writes > postgresql.conf, but that has all sorts of oddities about changing a > config file. Not necessarily.. Have initdb call the tool to get the values to use when first writing out the config file (or make the logic into a library that initdb and the tool both use, or just both #include the same .h; it's not like it's going to be terribly complicated), and then the tool would be responsible for later changes to the postgresql.conf file, or we just tell the user how to make the changes recommended by the tool. I would *not* have initdb doing that, that's not its job. If the user is expected to be modifying the postgresql.conf file in this scenario anyway, I hardly see that having one-parameter-to-rule-them-all is actually better than having 3 or 4. If we're trying to get away from the user modifying postgresql.conf, then we're going to need a tool to do that (or use ALTER SYSTEM WHATEVER). For my part, I'm really much more interested in the "first 15 minutes", as was mentioned elsewhere, than how to help users who have been using PG for a year and then discover they need to tune it a bit. Thanks, Stephen
On Thu, Oct 10, 2013 at 11:45:41AM -0400, Stephen Frost wrote: > * Bruce Momjian (bruce@momjian.us) wrote: > > On Thu, Oct 10, 2013 at 11:18:46AM -0400, Stephen Frost wrote: > > > For this case, I think the suggestion made by MauMau would be better- > > > tell the user (in the postgresql.conf comments) a command they can run > > > with different memory settings to see what the auto-tuning would do. > > > Perhaps even have a way to enable use of those new variables, but I > > > don't really care for the idea of making a GUC that isn't anything > > > except a control for defaults of *other* GUCs. > > > > Well, you then have two places you are doing the tuning --- one in > > initdb, and another in the tool, and you can have cases where they are > > not consistent. You could have a mode where initdb re-writes > > postgresql.conf, but that has all sorts of oddities about changing a > > config file. > > Not necessairly.. Have initdb call the tool to get the values to use > when first writing out the config file (or make the logic into a library > that initdb and the tool both use, or just both #include the same .h; > it's not like it's going to be terribly complicated), and then the tool > would be responsible for later changes to the postgresql.conf file, or > we just tell the user how to make the changes recommended by the tool. > I would *not* have initdb doing that, that's not its job. > > If the user is expected to be modifying the postgresql.conf file in this > scenario anyway, I hardly see that having one-parameter-to-rule-them-all > is actually better than having 3 or 4. If we're trying to get away from > the user modifying postgresql.conf, then we're going to need a tool to > do that (or use ALTER SYSTEM WHATEVER). For my part, I'm really much > more interested in the "first 15 minutes", as was mentioned elsewhere, > than how to help users who have been using PG for a year and then > discover they need to tune it a bit. 
Well, I like the idea of initdb calling the tool, though the tool then would need to be in C probably as we can't require python for initdb. The tool would not address Robert's issue of someone increasing shared_buffers on their own. In fact, the other constants would still be hard-coded in from initdb, which isn't good. I think the big win for a tool would be to query the user about how they are going to be using Postgres, and that can then spit out values the user can add to postgresql.conf, or to a config file that is included at the end of postgresql.conf. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
* Bruce Momjian (bruce@momjian.us) wrote: > Well, I like the idea of initdb calling the tool, though the tool then > would need to be in C probably as we can't require python for initdb. > The tool would not address Robert's issue of someone increasing > shared_buffers on their own. I'm really not impressed with this argument. Either the user is going to go and modify the config file, in which case I would hope that they'd at least glance around at what they should change, or they're going to move off PG because it's not performing well enough for them- which is really what I'm trying to avoid happening during the first 15m. > In fact, the other constants would still > be hard-coded in from initdb, which isn't good. Actually, it *is* good, as Magnus pointed out. Changing a completely unrelated parameter shouldn't make all of your plans suddenly change. This is mollified, but only a bit, if you have a GUC that's explicitly "this changes other GUCs", but I'd much rather have a tool that can do a better job to begin with and which helps the user understand what parameters are available to change and why there's more than one. > I think the big win for a tool would be to query the user about how they > are going to be using Postgres, and that can then spit out values the > user can add to postgresql.conf, or to a config file that is included at > the end of postgresql.conf. Agreed. Thanks, Stephen
On Thu, Oct 10, 2013 at 12:00:54PM -0400, Stephen Frost wrote: > * Bruce Momjian (bruce@momjian.us) wrote: > > Well, I like the idea of initdb calling the tool, though the tool then > > would need to be in C probably as we can't require python for initdb. > > The tool would not address Robert's issue of someone increasing > > shared_buffers on their own. > > I'm really not impressed with this argument. Either the user is going > to go and modify the config file, in which case I would hope that they'd > at least glance around at what they should change, or they're going to > move off PG because it's not performing well enough for them- which is > really what I'm trying to avoid happening during the first 15m. Well, they aren't going around and looking at other parameters now or we would not feel a need to auto-tune many of our defaults. How do we handle the Python dependency, or is this all to be done in some other language? I certainly am not ready to take on that job. One nice thing about a tool is that you can see your auto-tuned defaults right away, while doing this in the backend, you have to start the server to see the defaults. I am not even sure how I could allow users to preview their defaults for different available_mem settings. > > In fact, the other constants would still > > be hard-coded in from initdb, which isn't good. > > Actually, it *is* good, as Magnus pointed out. Changing a completely > unrelated parameter shouldn't make all of your plans suddenly change. > This is mollified, but only a bit, if you have a GUC that's explicitly > "this changes other GUCs", but I'd much rather have a tool that can do a > better job to begin with and which helps the user understand what > parameters are available to change and why there's more than one. Well, the big question is how many users are going to use the tool, as we are not setting this up for experts, but for novices. 
I think one big risk is someone changing shared_buffers and not having an accurate available_memory. That might lead to some very inaccurate defaults. Also, what happens if available_memory is not supplied at all? Do we auto-tune just from shared_buffers, or not autotune at all, and then what are the defaults? We could certainly throw an error if shared_buffers > available_memory. We might just ship with available_memory defaulting to 256MB and auto-tune everything from there. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
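The checks Bruce lists could be sketched as below; the function name and the 256MB fallback are just the values floated in this message, not anything that exists in PostgreSQL.

```python
# Hypothetical validation for the proposed available_memory setting (sizes in MB).

DEFAULT_AVAILABLE_MEMORY_MB = 256   # the suggested ship-with default

def check_memory_settings(shared_buffers_mb, available_memory_mb=None):
    # If available_memory is not supplied at all, auto-tune from the default.
    if available_memory_mb is None:
        available_memory_mb = DEFAULT_AVAILABLE_MEMORY_MB
    # Throw an error if shared_buffers exceeds available_memory.
    if shared_buffers_mb > available_memory_mb:
        raise ValueError("shared_buffers exceeds available_memory")
    return available_memory_mb

print(check_memory_settings(128))   # no available_memory given: falls back to 256
```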
On 10/10/2013 12:28 PM, Bruce Momjian wrote: > > How do we handle the Python dependency, or is this all to be done in > some other language? I certainly am not ready to take on that job. Without considering any wider question here, let me just note this: Anything that can be done in this area in Python should be doable in Perl fairly simply. I don't think we should be adding any Python dependencies. For good or ill Perl has been used for pretty much all our complex scripting (pgindent, MSVC build system etc.) cheers andrew
On Thu, Oct 10, 2013 at 12:39:04PM -0400, Andrew Dunstan wrote: > > On 10/10/2013 12:28 PM, Bruce Momjian wrote: > > > >How do we handle the Python dependency, or is this all to be done in > >some other language? I certainly am not ready to take on that job. > > > Without considering any wider question here, let me just note this: > > Anything that can be done in this area in Python should be doable in > Perl fairly simply. I don't think we should be adding any Python > dependencies. For good or ill Perl has been used for pretty much all > our complex scripting (pgindent, MSVC build system etc.) Yes, but this is a run-time requirement, not build-time, and we have not used Perl in that regard. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 10/10/2013 12:45 PM, Bruce Momjian wrote: > On Thu, Oct 10, 2013 at 12:39:04PM -0400, Andrew Dunstan wrote: >> On 10/10/2013 12:28 PM, Bruce Momjian wrote: >>> How do we handle the Python dependency, or is this all to be done in >>> some other language? I certainly am not ready to take on that job. >> >> Without considering any wider question here, let me just note this: >> >> Anything that can be done in this area in Python should be doable in >> Perl fairly simply. I don't think we should be adding any Python >> dependencies. For good or ill Perl has been used for pretty much all >> our complex scripting (pgindent, MSVC build system etc.) > Yes, but this is a run-time requirement, not build-time, and we have not > used Perl in that regard. > Nor Python. If we want to avoid added dependencies, we would need to use C. cheers andrew
On Thu, Oct 10, 2013 at 12:59:39PM -0400, Andrew Dunstan wrote: > > On 10/10/2013 12:45 PM, Bruce Momjian wrote: > >On Thu, Oct 10, 2013 at 12:39:04PM -0400, Andrew Dunstan wrote: > >>On 10/10/2013 12:28 PM, Bruce Momjian wrote: > >>>How do we handle the Python dependency, or is this all to be done in > >>>some other language? I certainly am not ready to take on that job. > >> > >>Without considering any wider question here, let me just note this: > >> > >>Anything that can be done in this area in Python should be doable in > >>Perl fairly simply. I don't think we should be adding any Python > >>dependencies. For good or ill Perl has been used for pretty much all > >>our complex scripting (pgindent, MSVC build system etc.) > >Yes, but this is a run-time requirement, not build-time, and we have not > >used Perl in that regard. > > > > > Nor Python. If we want to avoid added dependencies, we would need to use C. Yeah. :-( My crazy idea would be, because setting available_mem without a restart would not be supported, to allow the backend to output suggestions, e.g.:

test=> SET available_mem = '24GB';
Auto-tuning values:
    shared_buffers = 6GB
    work_mem = 10MB
    ...
ERROR:  parameter "available_mem" cannot be changed without restarting the server

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
> Because 'maintenance' operations were rarer, so we figured we could use > more memory in those cases. Once we brought Autovacuum into core, though, we should have changed that. However, I agree with Magnus that the simple course is to have an autovacuum_worker_memory setting which overrides maint_work_mem if set. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 10/09/2013 02:15 PM, Bruce Momjian wrote: > and for shared_buffers of 2GB: > > test=> show shared_buffers; > shared_buffers > ---------------- > 2GB > (1 row) > > test=> SHOW work_mem; > work_mem > ---------- > 6010kB > (1 row) Huh? Only 6MB work_mem for 8GB RAM? How'd you get that? That's way low, and frankly it's not worth bothering with this if all we're going to get is an incremental increase. In that case, let's just set the default to 4MB like Robert suggested. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Oct 10, 2013 at 10:20:02AM -0700, Josh Berkus wrote: > On 10/09/2013 02:15 PM, Bruce Momjian wrote: > > and for shared_buffers of 2GB: > > > > test=> show shared_buffers; > > shared_buffers > > ---------------- > > 2GB > > (1 row) > > > > test=> SHOW work_mem; > > work_mem > > ---------- > > 6010kB > > (1 row) > > Huh? Only 6MB work_mem for 8GB RAM? How'd you get that? > That's way low, and frankly it's not worth bothering with this if all > we're going to get is an incremental increase. In that case, let's just > set the default to 4MB like Robert suggested. Uh, well, 100 backends at 6MB gives us 600MB, and if each backend uses 3x work_mem, that gives us 1.8GB for total work_mem. This was based on Andrew's concerns about possible over-commit of work_mem. I can of course adjust that. For comparison, 8GB of shared memory auto-tunes work_mem to 21MB. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
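Bruce's arithmetic, spelled out; the 3x multiplier is the worst-case assumption attributed to Andrew earlier in the thread, not a measured figure.

```python
# Worst-case total work_mem for the proposed 2GB-shared_buffers default.
max_connections = 100
work_mem_mb = 6               # the auto-tuned default being defended
allocations_per_backend = 3   # assumed worst case: 3x work_mem per backend

total_mb = max_connections * work_mem_mb * allocations_per_backend
print(total_mb)               # 1800, i.e. ~1.8GB of potential work_mem usage
```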
All, We can't reasonably require user input at initdb time, because most users don't run initdb by hand -- their installer does it for them. So any "tuning" which initdb does needs to be fully automated. So, the question is: can we reasonably determine, at initdb time, how much RAM the system has? I also think this is where the much-debated ALTER SYSTEM SET suddenly becomes valuable. With it, it's reasonable to run a "tune-up" tool on the client side. I do think it's reasonable to tell a user: "Just installed PostgreSQL? Run this command to tune your system:" Mind you, the tuneup tool I'm working on makes use of Python, configuration directory, and Jinja2, so it's not even relevant to the preceding. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Bruce, * Bruce Momjian (bruce@momjian.us) wrote: > On Thu, Oct 10, 2013 at 12:00:54PM -0400, Stephen Frost wrote: > > I'm really not impressed with this argument. Either the user is going > > to go and modify the config file, in which case I would hope that they'd > > at least glance around at what they should change, or they're going to > > move off PG because it's not performing well enough for them- which is > > really what I'm trying to avoid happening during the first 15m. > > Well, they aren't going around and looking at other parameters now or we > would not feel a need to auto-tune many of our defaults. I think you're confusing things here. There's a huge difference between "didn't configure anything and got our defaults" and "went and changed only one thing in postgresql.conf". For one thing, we have a ton of the former. Perhaps there are some of the latter as well, but I would argue it's a pretty small group. > How do we handle the Python dependency, or is this all to be done in > some other language? I certainly am not ready to take on that job. I agree that we can't add a Python (or really, perl) dependency, but I don't think there's anything terribly complicated in what pgtune is doing that couldn't be pretty easily done in C.. > One nice thing about a tool is that you can see your auto-tuned defaults > right away, while doing this in the backend, you have to start the > server to see the defaults. I am not even sure how I could allow users > to preview their defaults for different available_mem settings. Agreed. > > Actually, it *is* good, as Magnus pointed out. Changing a completely > > unrelated parameter shouldn't make all of your plans suddenly change. > > This is mollified, but only a bit, if you have a GUC that's explicitly > > "this changes other GUCs", but I'd much rather have a tool that can do a > > better job to begin with and which helps the user understand what > > parameters are available to change and why there's more than one. 
> > Well, the big question is how many users are going to use the tool, as > we are not setting this up for experts, but for novices. The goal would be to have the distros and/or initdb use it for the initial configuration.. Perhaps by using debconf or similar to ask the user, perhaps by just running it and letting it do whatever it wants. > I think one big risk is someone changing shared_buffers and not having > an accurate available_memory. That might lead to some very inaccurate > defaults. Also, what happens if available_memory is not supplied at > all? Do we auto-tune just from shared_buffers, or not autotune at all, > and then what are the defaults? We could certainly throw an error if > shared_buffers > available_memory. We might just ship with > available_memory defaulting to 256MB and auto-tune everything from > there. These questions are less of an issue if we simply don't have this "available_memory" GUC (which strikes me as just adding more confusion for users anyway, not less). If no '--available-memory' (or whatever) option is passed to initdb then we should have it assume some default, yes, but my view on what that is depends on what the specific results are. It sounds like --avail-memory=256MB would end up setting things to about what we have now for defaults, which is alright for shared_buffers, imv, but not for a default work_mem (1MB is *really* small...). Thanks, Stephen
On Wed, Oct 9, 2013 at 8:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 10/09/2013 10:45 AM, Bruce Momjian wrote:
> > On Wed, Oct 9, 2013 at 04:40:38PM +0200, Pavel Stehule wrote:
> > > In my experience a optimal value of work_mem depends on data and load, so I
> > > prefer a work_mem as independent parameter.
> >
> > But it still is an independent parameter. I am just changing the default.
> > Effectively, if every session uses one full work_mem, you end up with
> > total work_mem usage equal to shared_buffers. We can try a different
> > algorithm to scale up work_mem, but it seems wise to auto-scale it up to
> > some extent based on shared_buffers.
>
> The danger with work_mem especially is that setting it too high can lead to crashing postgres or your system at some stage down the track, so autotuning it is kinda dangerous, much more dangerous than autotuning shared buffers.

Is this common to see? I ask because in my experience, having 100 connections all decide to do large sorts simultaneously is going to make the server fall over, regardless of whether it tries to do them in memory (OOM) or whether it does them with tape sorts (stuck spin locks, usually).

> The assumption that each connection won't use lots of work_mem is also false, I think, especially in these days of connection poolers.

I don't follow that. Why would using a connection pooler change the multiples of work_mem that a connection would use?

Cheers,
Jeff
Bruce, >> That's way low, and frankly it's not worth bothering with this if all >> we're going to get is an incremental increase. In that case, let's just >> set the default to 4MB like Robert suggested. > > Uh, well, 100 backends at 6MB gives us 600MB, and if each backend uses > 3x work_mem, that gives us 1.8GB for total work_mem. This was based on > Andrew's concerns about possible over-commit of work_mem. I can of > course adjust that. That's worst-case-scenario planning -- the 3X work-mem per backend was: a) Solaris and b) data warehousing In a normal OLTP application each backend averages something like 0.25 * work_mem, since many queries use no work_mem at all. It also doesn't address my point that, if we are worst-case-scenario default-setting, we're going to end up with defaults which aren't materially different from the current defaults. In which case, why even bother with this whole exercise? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
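Put as numbers, the gap between the two planning assumptions is large; the 0.25x OLTP average is Josh's estimate here, not a measured figure.

```python
# Worst-case vs typical-OLTP total work_mem usage for the same default.
max_connections = 100
work_mem_mb = 6

worst_case_mb = max_connections * work_mem_mb * 3       # Solaris / data warehousing
typical_oltp_mb = max_connections * work_mem_mb * 0.25  # many queries use no work_mem

print(worst_case_mb, typical_oltp_mb)   # 1800 vs 150.0
```

Defaults chosen for the first case are more than a factor of ten too conservative for the second, which is the substance of the objection.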
> It also doesn't address my point that, if we are worst-case-scenario > default-setting, we're going to end up with defaults which aren't > materially different from the current defaults. In which case, why even > bother with this whole exercise? Oh, and let me reiterate: the way to optimize work_mem is through an admission control mechanism. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Thu, Oct 10, 2013 at 12:00:54PM -0400, Stephen Frost wrote: >> * Bruce Momjian (bruce@momjian.us) wrote: >> > Well, I like the idea of initdb calling the tool, though the tool then >> > would need to be in C probably as we can't require python for initdb. >> > The tool would not address Robert's issue of someone increasing >> > shared_buffers on their own. >> >> I'm really not impressed with this argument. Either the user is going >> to go and modify the config file, in which case I would hope that they'd >> at least glance around at what they should change, or they're going to >> move off PG because it's not performing well enough for them- which is >> really what I'm trying to avoid happening during the first 15m. > > Well, they aren't going around and looking at other parameters now or we > would not feel a need to auto-tune many of our defaults. > > How do we handle the Python dependency, or is this all to be done in > some other language? I certainly am not ready to take on that job. I don't see why it can't be done in C. The server is written in C, and so is initdb. So no matter where we do this, it's gonna be in C. Where does Python enter into it? What I might propose is that we add a new binary, tunedb, maybe compiled out of the src/bin/initdb directory. So you can say: initdb --available-memory=32GB ...and it will initialize the cluster with appropriate settings. Or you can say: tunedb --available-memory=32GB ...and it will print out a set of proposed configuration settings. If we want a mode that rewrites the configuration file, we could have: tunedb --available-memory=32GB --rewrite-config-file=$PATH ...but that might be overkill, at least for version 1. > One nice thing about a tool is that you can see your auto-tuned defaults > right away, while doing this in the backend, you have to start the > server to see the defaults. 
> I am not even sure how I could allow users > to preview their defaults for different available_mem settings. Yep, agreed. And agreed that not being able to preview settings is a problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 10, 2013 at 1:37 PM, Josh Berkus <josh@agliodbs.com> wrote: > So, the question is: can we reasonably determine, at initdb time, how > much RAM the system has? As long as you are willing to write platform-dependent code, yes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 10, 2013 at 11:41 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I don't see why it can't be done in C. The server is written in C, > and so is initdb. So no matter where we do this, it's gonna be in C. > Where does Python enter into it? I mentioned that pgtune was written in Python, but as you say that's wholly incidental. An equivalent C program would only be slightly more verbose. -- Peter Geoghegan
On 10/10/13 11:31 AM, Bruce Momjian wrote: > Let me walk through the idea of adding an available_mem setting, that > Josh suggested, and which I think addresses Robert's concern about > larger shared_buffers and Windows servers. I think this is a promising idea. available_mem could even be set automatically by packages. And power users could just set available_mem = -1 to turn off all the magic.
On Thu, Oct 10, 2013 at 11:43 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Oct 10, 2013 at 1:37 PM, Josh Berkus <josh@agliodbs.com> wrote: >> So, the question is: can we reasonably determine, at initdb time, how >> much RAM the system has? > > As long as you are willing to write platform-dependent code, yes. That's why trying to give the responsibility to a packager is compelling. -- Peter Geoghegan
On Thu, Oct 10, 2013 at 2:45 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 10/10/2013 11:41 AM, Robert Haas wrote: >> tunedb --available-memory=32GB >> >> ...and it will print out a set of proposed configuration settings. If >> we want a mode that rewrites the configuration file, we could have: >> >> tunedb --available-memory=32GB --rewrite-config-file=$PATH >> >> ...but that might be overkill, at least for version 1. > > Given that we are talking currently about ALTER SYSTEM SET *and* > configuration directories, we should not be rewriting any existing > config file. We should be adding an auto-generated one, or using ALTER > SYSTEM SET. > > In fact, why don't we just do this though ALTER SYSTEM SET? add a > plpgsql function called pg_tune(). That's another way to do it, for sure. It does require the ability to log in to the database. I imagine that could be less convenient in some scripting environments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 10, 2013 at 8:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian <bruce@momjian.us> wrote: >> On Thu, Oct 10, 2013 at 12:00:54PM -0400, Stephen Frost wrote: >>> * Bruce Momjian (bruce@momjian.us) wrote: >>> > Well, I like the idea of initdb calling the tool, though the tool then >>> > would need to be in C probably as we can't require python for initdb. >>> > The tool would not address Robert's issue of someone increasing >>> > shared_buffers on their own. >>> >>> I'm really not impressed with this argument. Either the user is going >>> to go and modify the config file, in which case I would hope that they'd >>> at least glance around at what they should change, or they're going to >>> move off PG because it's not performing well enough for them- which is >>> really what I'm trying to avoid happening during the first 15m. >> >> Well, they aren't going around and looking at other parameters now or we >> would not feel a need to auto-tune many of our defaults. >> >> How do we handle the Python dependency, or is this all to be done in >> some other language? I certainly am not ready to take on that job. > > I don't see why it can't be done in C. The server is written in C, > and so is initdb. So no matter where we do this, it's gonna be in C. > Where does Python enter into it? > > What I might propose is that we have add a new binary tunedb, maybe > compiled out of the src/bin/initdb.c directory. So you can say: > > initdb --available-memory=32GB > > ...and it will initialize the cluster with appropriate settings. Or > you can say: > > tunedb --available-memory=32GB > > ...and it will print out a set of proposed configuration settings. If > we want a mode that rewrites the configuration file, we could have: > > tunedb --available-memory=32GB --rewrite-config-file=$PATH > > ...but that might be overkill, at least for version 1. I like this. And I agree that the edit-in-place might be overkill. 
But then, if/when we get the ability to programmatically modify the config files, that's probably not a very complicated thing to add once the rest is done. >> One nice thing about a tool is that you can see your auto-tuned defaults >> right away, while doing this in the backend, you have to start the >> server to see the defaults. I am not even sure how I could allow users >> to preview their defaults for different available_mem settings. > > Yep, agreed. And agreed that not being able to preview settings is a problem. I'd even say it would be a *big* problem. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 10/10/2013 11:41 AM, Robert Haas wrote: > tunedb --available-memory=32GB > > ...and it will print out a set of proposed configuration settings. If > we want a mode that rewrites the configuration file, we could have: > > tunedb --available-memory=32GB --rewrite-config-file=$PATH > > ...but that might be overkill, at least for version 1. Given that we are talking currently about ALTER SYSTEM SET *and* configuration directories, we should not be rewriting any existing config file. We should be adding an auto-generated one, or using ALTER SYSTEM SET. In fact, why don't we just do this through ALTER SYSTEM SET? add a plpgsql function called pg_tune(). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Oct 9, 2013 at 10:21 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Well, the Postgres defaults won't really change, because the default >> vacuum_work_mem will be -1, which will have vacuum defer to >> maintenance_work_mem. Under this scheme, vacuum only *prefers* to get >> bound working memory size from vacuum_work_mem. If you don't like >> vacuum_work_mem, you can just ignore it. > While unrelated to the main topic of this thread, I think this is very > important as well. I often have to advice people to remember to cap > their maintenance_work_mem because of autovacuum, and to remember to > re-tune maintenance_work_mem when they change the number of autovacuum > workers. I'll code that up at some point, then. -- Peter Geoghegan
On 10/10/13 11:45 AM, Bruce Momjian wrote: > I think the big win for a tool would be to query the user about how they > are going to be using Postgres, and that can then spit out values the > user can add to postgresql.conf, or to a config file that is included at > the end of postgresql.conf. I think such a tool would actually make the initial experience worse for many people. Quick, how are you going to use your PostgreSQL server: - OLTP - web - mixed Uh, all of the above? This sort of thing can quickly turn the "first 15 minutes" into the first 2 hours, as users are forced to analyze the different settings, have second thoughts, wonder about how to change them back, etc. The fewer decisions people have to make initially, the better. The initdb phase already has too many required decisions that can cause regret later (e.g., locale, encoding, checksums).
On Thu, Oct 10, 2013 at 8:46 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Oct 10, 2013 at 2:45 PM, Josh Berkus <josh@agliodbs.com> wrote: >> On 10/10/2013 11:41 AM, Robert Haas wrote: >>> tunedb --available-memory=32GB >>> >>> ...and it will print out a set of proposed configuration settings. If >>> we want a mode that rewrites the configuration file, we could have: >>> >>> tunedb --available-memory=32GB --rewrite-config-file=$PATH >>> >>> ...but that might be overkill, at least for version 1. >> >> Given that we are talking currently about ALTER SYSTEM SET *and* >> configuration directories, we should not be rewriting any existing >> config file. We should be adding an auto-generated one, or using ALTER >> SYSTEM SET. >> >> In fact, why don't we just do this through ALTER SYSTEM SET? add a >> plpgsql function called pg_tune(). > > That's another way to do it, for sure. It does require the ability to > log in to the database. I imagine that could be less convenient in > some scripting environments. I think that would also make it much harder to automate for packagers. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian <bruce@momjian.us> wrote: > How do we handle the Python dependency, or is this all to be done in > some other language? I certainly am not ready to take on that job. I should think it possible to reimplement it in C. It was considerably useful to start by implementing in Python, as that evades various sorts of efforts needed in C (e.g. - memory allocation, picking a hash table implementation), and allows someone to hack on it without needing to run through a recompile every time something is touched. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
On Thu, Oct 10, 2013 at 3:41 PM, Christopher Browne <cbbrowne@gmail.com> wrote: > On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian <bruce@momjian.us> wrote: >> How do we handle the Python dependency, or is this all to be done in >> some other language? I certainly am not ready to take on that job. > > I should think it possible to reimplement it in C. It was considerably > useful to start by implementing in Python, as that evades various sorts > of efforts needed in C (e.g. - memory allocation, picking a hash table > implementation), and allows someone to hack on it without needing to > run through a recompile every time something is touched. Also, the last time I saw that tool, it output recommendations for work_mem that I would never, ever recommend to anyone on a production server - they were VERY high. More generally, Josh has made repeated comments that various proposed value/formulas for work_mem are too low, but obviously the people who suggested them didn't think so. So I'm a bit concerned that we don't all agree on what the end goal of this activity looks like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> More generally, Josh has made repeated comments that various proposed > value/formulas for work_mem are too low, but obviously the people who > suggested them didn't think so. So I'm a bit concerned that we don't > all agree on what the end goal of this activity looks like. The counter-proposal to "auto-tuning" is just to raise the default for work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at 6MB for a server with 8GB RAM, I don't really see the benefit of going to a whole lot of code and formulas in order to end up at a figure only incrementally different from a new static default. The core issue here is that there aren't good "generic" values for these settings for all users -- that's why we have the settings in the first place. Following a formula isn't going to change that. If we're serious about autotuning, then we should look at: a) admissions control for non-shared resources (e.g. work_mem) b) auto-feedback tuning loops (ala Heikki's checkpoint_segments and the bgwriter). We could certainly create an autofeedback tuning loop for work_mem. Just watch the database, record the amount of data spilled to disk for work (pg_stat_sorts), and record the total RAM pinned by backends. *Then* apply a formula and maybe bump up work_mem a bit depending on what comes out of it. And keep monitoring and keep readjusting. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
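The feedback loop Josh describes can be sketched in a few lines. Everything below is hypothetical: pg_stat_sorts does not exist yet, so the spill counts, the 10%/1% thresholds, and the `suggest_work_mem` helper are invented purely to illustrate the "watch spills, bump work_mem, keep total pinned RAM in budget" idea.

```python
def suggest_work_mem(current_kb, spilled_sorts, total_sorts,
                     ram_pinned_kb, ram_budget_kb):
    """Hypothetical feedback rule: grow work_mem while many sorts spill
    to disk and backends are pinning well under the RAM budget; shrink
    it when spills are rare but backends are pinning most of the budget."""
    if total_sorts == 0:
        return current_kb  # nothing observed yet; leave the setting alone
    spill_ratio = spilled_sorts / total_sorts
    if spill_ratio > 0.10 and ram_pinned_kb < 0.5 * ram_budget_kb:
        return min(current_kb * 2, ram_budget_kb // 4)   # bump, capped
    if spill_ratio < 0.01 and ram_pinned_kb > 0.8 * ram_budget_kb:
        return max(current_kb // 2, 1024)                # back off, 1MB floor
    return current_kb
```

A real implementation would of course feed this from server statistics and re-run it periodically; the point is only that the rule itself is small once the monitoring exists.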
On Thu, Oct 10, 2013 at 11:14:27AM -0700, Jeff Janes wrote: > The assumption that each connection won't use lots of work_mem is also > false, I think, especially in these days of connection poolers. > > > I don't follow that. Why would using a connection pooler change the multiples > of work_mem that a connection would use? I assume that a connection pooler would keep processes running longer, so even if they were not all using work_mem, they would have that memory mapped into the process, and perhaps swapped out. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Oct 10, 2013 at 11:18:28AM -0700, Josh Berkus wrote: > Bruce, > > >> That's way low, and frankly it's not worth bothering with this if all > >> we're going to get is an incremental increase. In that case, let's just > >> set the default to 4MB like Robert suggested. > > > > Uh, well, 100 backends at 6MB gives us 600MB, and if each backend uses > > 3x work_mem, that gives us 1.8GB for total work_mem. This was based on > > Andrew's concerns about possible over-commit of work_mem. I can of > > course adjust that. > > That's worst-case-scenario planning -- the 3X work-mem per backend was: > a) Solaris and > b) data warehousing > > In a normal OLTP application each backend averages something like 0.25 * > work_mem, since many queries use no work_mem at all. > > It also doesn't address my point that, if we are worst-case-scenario > default-setting, we're going to end up with defaults which aren't > materially different from the current defaults. In which case, why even > bother with this whole exercise? OK, here is an updated patch that is less conservative. 
FYI, this thread has gone on for 80 messages, and I assume it will take many more until we are done:

test=> SHOW shared_buffers;
 shared_buffers
----------------
 128MB
(1 row)

test=> SHOW work_mem;
 work_mem
----------
 2621kB
(1 row)

test=> SHOW maintenance_work_mem;
 maintenance_work_mem
----------------------
 10922kB
(1 row)

---------------------------------------------------------------------------

test=> SHOW shared_buffers;
 shared_buffers
----------------
 2GB
(1 row)

test=> SHOW work_mem;
 work_mem
----------
 41943kB
(1 row)

test=> SHOW maintenance_work_mem;
 maintenance_work_mem
----------------------
 174762kB
(1 row)

---------------------------------------------------------------------------

test=> SHOW shared_buffers;
 shared_buffers
----------------
 8GB
(1 row)

test=> SHOW work_mem;
 work_mem
----------
 167772kB
(1 row)

test=> SHOW maintenance_work_mem;
 maintenance_work_mem
----------------------
 699050kB
(1 row)

Patch attached. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
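For reference, the divisors implied by the SHOW output above can be back-computed (this is inferred from the printed values, not read from the patch itself): work_mem comes out to shared_buffers/50, i.e. shared_buffers/(max_connections/2) with the default max_connections of 100, and maintenance_work_mem to shared_buffers/12.

```python
def autotuned_settings(shared_buffers_kb, max_connections=100):
    """Reproduce the defaults shown above; divisors inferred from the output."""
    work_mem_kb = shared_buffers_kb // (max_connections // 2)
    maintenance_work_mem_kb = shared_buffers_kb // 12
    return work_mem_kb, maintenance_work_mem_kb

# 128MB of shared_buffers -> (2621, 10922), matching the first SHOW block
print(autotuned_settings(128 * 1024))
```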
>> I don't follow that. Why would using a connection pooler change the multiples >> of work_mem that a connection would use? > > I assume that a connection pooler would keep processes running longer, > so even if they were not all using work_mem, they would have that memory > mapped into the process, and perhaps swapped out. Yes, and then this is when it *really* matters what OS you're running, and what release. FreeBSD and Solaris++ don't overallocate RAM, so those long-running connections pin a lot of RAM eventually. And for Linux, it's a question of how aggressive the OOM killer is, which kinda depends on distro/version/sysadmin settings. When I configure pgbouncer for Illumos users, I specifically have it rotate out old connections once an hour for this reason. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Oct 10, 2013 at 02:44:12PM -0400, Peter Eisentraut wrote: > On 10/10/13 11:31 AM, Bruce Momjian wrote: > > Let me walk through the idea of adding an available_mem setting, that > > Josh suggested, and which I think addresses Robert's concern about > > larger shared_buffers and Windows servers. > > I think this is a promising idea. available_mem could even be set > automatically by packages. And power users could just set available_mem > = -1 to turn off all the magic. Yes, I was thinking about that. Imagine we have an initdb parameter for available memory --- packagers could do something like:

	initdb -M $(awk '{print $2 * 1024; exit}' /proc/meminfo)

to pass in the available memory of the server, or, to use 90% of RAM:

	initdb -M $(awk '{printf "%.0f\n", $2 * 1024 * 0.9; exit}' /proc/meminfo)

This allows us to externalize all the OS-specific information and allow the packagers to supply it. The packagers could even ask the user if they wish to control the percentage. FYI, I hope people are OK with me replying a lot in this thread --- I do think this is going to take a lot of discussion, but I think the end-result will be worth it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
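The probe the awk one-liners perform can equally be sketched in Python. The only real fact used here is that /proc/meminfo reports MemTotal in kB, so multiplying by 1024 yields bytes; the function name is illustrative, and the -M initdb option it would feed is still only a proposal.

```python
def available_memory_bytes(fraction=1.0, meminfo="/proc/meminfo"):
    """Return fraction of total RAM in bytes, like the awk one-liners above."""
    with open(meminfo) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                kb = int(line.split()[1])  # second field is the size in kB
                return int(kb * 1024 * fraction)
    raise RuntimeError("MemTotal not found in " + meminfo)
```

A packaging script could then pass `available_memory_bytes(0.9)` to whatever flag initdb ends up exposing.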
On Thu, Oct 10, 2013 at 03:40:17PM -0700, Josh Berkus wrote: > > >> I don't follow that. Why would using a connection pooler change the multiples > >> of work_mem that a connection would use? > > > > I assume that a connection pooler would keep processes running longer, > > so even if they were not all using work_mem, they would have that memory > > mapped into the process, and perhaps swapped out. > > Yes, and then this is when it *really* matters what OS you're running, > and what release. FreeBSD and Solaris++ don't overallocate RAM, so > those long-running connections pin a lot of RAM eventually. And for > Linux, it's a question of how aggressive the OOM killer is, which kinda > depends on distro/version/sysadmin settings. > > When I configure pgbouncer for Illumos users, I specifically have it > rotate out old connections once an hour for this reason. Just as a point of education, this is a good example of why you want to allocate swap even if you expect your workload to fit in memory. Pushing unused memory to swap is a good use of swap. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Oct 10, 2013 at 03:27:17PM -0700, Josh Berkus wrote: > > > More generally, Josh has made repeated comments that various proposed > > value/formulas for work_mem are too low, but obviously the people who > > suggested them didn't think so. So I'm a bit concerned that we don't > > all agree on what the end goal of this activity looks like. > > The counter-proposal to "auto-tuning" is just to raise the default for > work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at > 6MB for a server with 8GB RAM, I don't really see the benefit of going > to a whole lot of code and formulas in order to end up at a figure only > incrementally different from a new static default. Well, the plan was going to auto-tune shared_buffers and effective_cache_size too. We could fall back to our existing code where effective_cache_size autotunes on shared_buffers, and we just up work_mem's default, tell people to set shared_buffers properly, and call it a day. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 10/10/13 9:44 AM, MauMau wrote: > From: "Robert Haas" <robertmhaas@gmail.com> >> On Thu, Oct 10, 2013 at 1:23 AM, Magnus Hagander <magnus@hagander.net> wrote: >>> I think it would be even simpler, and more reliable, to start with the >>> parameter to initdb - I like that. But instead of having it set a new >>> variable based on that and then autotune off that, just have *initdb* >>> do these calculations you're suggesting, and write new defaults to the >>> files (preferably with a comment). >>> >>> That way if the user *later* comes in and say changes shared_buffers, >>> we don't dynamically resize the work_mem into a value that might cause >>> his machine to die from swapping which would definitely violate the >>> principle of least surprise.. >> >> +1 for all of that. I completely agree. > > I vote for this idea completely, too. It's nice to be able to specify usable RAM with something like "initdb --system-memory 8GB", because it provides flexibility for memory allocation --- use the whole machine for one PostgreSQL instance, or run multiple instances on one machine with 50% of RAM for instance A and 25% of RAM for instances B and C, etc. But what is the default value of --system-memory? I would like it to be the whole RAM. > > I hope something like pgtune will be incorporated into the core, absorbing the ideas in: > > - pgtune > - https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server > - the book "PostgreSQL 9.0 High Performance" by Greg Smith > > Then initdb calls the tool. Of course, DBAs can use the tool later. Like pgtune, the tool would be nice if it and initdb can accept "--system-type" or "--workload" with arguments {OLTP | DW | mixed}. +1 on all of the above. If putting one-time magic in initdb works, maybe then we can look at runtime or even completely dynamic magic. FWIW, I would be careful about allowing the tool to go completely crazy if --system-memory is set really high, including for things like work_mem. 
Frequently if you've got a lot of memory you're going to want a serious chunk of it used by the filesystem/kernel cache, and not to just vanish into a random sort (esp. since last I knew there were diminishing returns on sort work_mem...) Of course, I'm in a world of 512G servers with 8GB shared buffers so... -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Thu, Oct 10, 2013 at 6:36 PM, Bruce Momjian <bruce@momjian.us> wrote: > Patch attached. ISTM that we have broad consensus that doing this at initdb time is more desirable than doing it in the server on the fly. Not everyone agrees with that (you don't, for instance) but there were many, many votes in favor of that option. Judging by the commit I just pushed to do initdb-time selection of the dynamic shared memory implementation to use, this is probably not hard to code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 10, 2013 at 6:27 PM, Josh Berkus <josh@agliodbs.com> wrote: >> More generally, Josh has made repeated comments that various proposed >> value/formulas for work_mem are too low, but obviously the people who >> suggested them didn't think so. So I'm a bit concerned that we don't >> all agree on what the end goal of this activity looks like. > > The counter-proposal to "auto-tuning" is just to raise the default for > work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at > 6MB for a server with 8GB RAM, I don't really see the benefit of going > to a whole lot of code and formulas in order to end up at a figure only > incrementally different from a new static default. Agreed. But what do you think the value SHOULD be on such a system? I suggest that it's pretty reasonable to assume that even a developer's personal machine will likely have 8GB or so by the time PostgreSQL comes out, so tuning work_mem on that basis is not unreasonable. Plus, even if it has less, a developer probably won't have 100 connections. I guess the point I'm making here is that raising the default value is not mutually exclusive with auto-tuning. We could quadruple the current defaults for work_mem and maintenance_work_mem and be better off right now, today. Then, we could improve things further in the future if and when we agree on an approach to auto-tuning. And people who don't use the auto-tuning will still have a better default. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert, >> The counter-proposal to "auto-tuning" is just to raise the default for >> work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at >> 6MB for a server with 8GB RAM, I don't really see the benefit of going >> to a whole lot of code and formulas in order to end up at a figure only >> incrementally different from a new static default. > > Agreed. But what do you think the value SHOULD be on such a system? That's the problem: It Depends. One thing in particular which is an issue with calculating against max_connections is that users who don't need 100 connections seldom *reduce* max_connections. So that developer laptop which only needs 3 connections is still going to have a max_connections of 100, just like the DW server where m_c should probably be 30. > I guess the point I'm making here is that raising the default value is > not mutually exclusive with auto-tuning. We could quadruple the > current defaults for work_mem and maintenance_work_mem and be better > off right now, today. Then, we could improve things further in the > future if and when we agree on an approach to auto-tuning. And people > who don't use the auto-tuning will still have a better default. Seems fine to me. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Oct 10, 2013 at 9:41 PM, Christopher Browne <cbbrowne@gmail.com> wrote: > On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian <bruce@momjian.us> wrote: >> How do we handle the Python dependency, or is this all to be done in >> some other language? I certainly am not ready to take on that job. > > I should think it possible to reimplement it in C. It was considerably > useful to start by implementing in Python, as that evades various sorts > of efforts needed in C (e.g. - memory allocation, picking a hash table > implementation), and allows someone to hack on it without needing to > run through a recompile every time something is touched. I think in the context of this problem, reimplementing that from python to C is the easiest part. Actually figuring out what the tool should *do* and how it should do it is the hard part. -- Magnus Hagander  Me: http://www.hagander.net/  Work: http://www.redpill-linpro.com/
On Thu, Oct 10, 2013 at 10:20:36PM -0700, Josh Berkus wrote: > Robert, > > >> The counter-proposal to "auto-tuning" is just to raise the default for > >> work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at > >> 6MB for a server with 8GB RAM, I don't really see the benefit of going > >> to a whole lot of code and formulas in order to end up at a figure only > >> incrementally different from a new static default. > > > > Agreed. But what do you think the value SHOULD be on such a system? > > That's the problem: It Depends. > > One thing in particular which is an issue with calculating against > max_connections is that users who don't need 100 connections seldom > *reduce* max_connections. So that developer laptop which only needs 3 > connections is still going to have a max_connections of 100, just like > the DW server where m_c should probably be 30. > > > I guess the point I'm making here is that raising the default value is > > not mutually exclusive with auto-tuning. We could quadruple the > > current defaults for work_mem and maintenance_work_mem and be better > > off right now, today. Then, we could improve things further in the > > future if and when we agree on an approach to auto-tuning. And people > > who don't use the auto-tuning will still have a better default. > > Seems fine to me. I think we are nearing a conclusion on these issues, and I thank everyone for the vigorous discussion. When Josh showed disappointment at the small increases in work_mem and maintenance_work_mem from autotuning, I realized the complexity of autotuning just wasn't warranted here. Andrew's concern about the risks of having a work_mem too high was also sobering. Effective_cache_size has neither of these issues, and hence was logical for auto-tuning. I know Robert originally suggested just improving the work_mem default --- I now agree with him, and am sorry it took me so long to realize he was right. 
One other problem with auto-tuning is that it really relies not only on allocated_memory, but also on max_connections and autovacuum_max_workers, which are going to be rather arbitrary and hard for a user to set well enough to help auto-tuning. Josh might be right that auto-tuning of work_mem has to be more dynamic, perhaps based on the number of _active_ backends, or the number of backends that have allocated or are currently using work_mem. Our new dynamic shared memory allocation routines might help here in allocating memory that can be easily purged from the process address space. I am now seeing a pattern that per-backend allocations really need run-time tuning, rather than being based on fixed GUC values.

In summary, I think we need to:

* decide on new defaults for work_mem and maintenance_work_mem
* add an initdb flag to allow users/packagers to set shared_buffers?
* add an autovacuum_work_mem setting?
* change the default for temp_buffers?

I will try to think some more about work_mem dynamic/runtime tuning and return to it later. I know Kevin has also thought about it. I am also interested in working on a server-side function that will make configuration suggestions or use ALTER SYSTEM to set values. I could do it in PL/pgSQL, but PL/Perl would allow me to run operating system commands to probe for OS information. The function could look at statistics and pg_buffercache output, and would be run during a typical workload. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 10/11/2013 01:11 PM, Bruce Momjian wrote: > In summary, I think we need to: > > * decide on new defaults for work_mem and maintenance_work_mem > * add an initdb flag to allow users/packagers to set shared_bufffers? > * add an autovacuum_work_mem setting? > * change the default for temp_buffers? If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit could also use a bump; those thresholds were set for servers with < 1GB of RAM. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> wrote: > On 10/11/2013 01:11 PM, Bruce Momjian wrote: >> In summary, I think we need to: >> >> * decide on new defaults for work_mem and maintenance_work_mem >> * add an initdb flag to allow users/packagers to set shared_bufffers? >> * add an autovacuum_work_mem setting? >> * change the default for temp_buffers? > > If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit > could also use a bump; those thresholds were set for servers with < 1GB > of RAM. +1 on those. Also, I have often had to bump cpu_tuple_cost into the 0.03 to 0.05 range to get a good plan. In general, this makes the exact settings of *_page_cost less fussy, and I have hit situations where I was completely unable to get a good plan to emerge without bumping cpu_tuple_cost relative to the other cpu costs. I know that it's possible to engineer a workload that shows any particular cost adjustment to make things worse, but in real-life production environments I have never seen an increase in this range make plan choice worse. Regarding the settings which have been the center of attention for most of this thread, I have had very good luck with starting work_mem at machine RAM * 0.25 / max_connections. I get the impression that Josh regards that as too low. My guess is that he deals more with data warehouse reporting systems than I do, where larger settings are both more beneficial and less likely to cause memory exhaustion than the typical systems I've worked with. That is the big problem with auto-configuration -- it depends so much on the workload. In the long run, an admission control policy and/or adaptive configuration based on the observed workload seems like what we *really* need. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
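Kevin's rule of thumb is concrete enough to write down. The numbers below merely apply the formula he states (RAM * 0.25 / max_connections); they are an example, not a recommendation.

```python
def kevin_work_mem_kb(ram_kb, max_connections):
    """work_mem = machine RAM * 0.25 / max_connections, per the rule above."""
    return int(ram_kb * 0.25 / max_connections)

# 16GB of RAM and the default max_connections of 100 -> ~40MB (41943kB)
print(kevin_work_mem_kb(16 * 1024 * 1024, 100))
```

Note how sensitive the result is to max_connections, which is exactly Josh's earlier complaint: a developer laptop that keeps the default of 100 gets the same small value as a busy OLTP server.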
From: "Bruce Momjian" <bruce@momjian.us> > On Thu, Oct 10, 2013 at 11:01:52PM +0900, MauMau wrote: >> Although this is not directly related to memory, could you set >> max_prepared_transactions = max_connections at initdb time? People >> must feel frustrated when they can't run applications on a Java or >> .NET application server and notice that they have to set >> max_prepared_transactions and restart PostgreSQL. This is far from >> friendly. > > I think the problem is that many users don't need prepared transactions > and therefore don't want the overhead. Is that still accurate? I'm not sure how many use XA features, but I saw the questions and answers a few times, IIRC. In the trouble situation, PostgreSQL outputs an intuitive message like "increase max_prepared_transactions", so many users might have been able to change the setting and solve the problem themselves without asking for help, feeling stress like "Why do I have to set this?" For example, max_prepared_transactions is called a "hideous creature" in the following page: https://community.jboss.org/wiki/InstallPostgreSQLOnFedora?_sscc=t According to the page below, the amount of memory consumed for this is "(770 + 270 * max_locks_per_transaction) * max_prepared_transactions". With the default settings of max_connections = 100 and max_locks_per_transaction = 64, this is only about 1.8MB. So the overhead is negligible. http://www.postgresql.org/docs/9.2/static/kernel-resources.html If the goal is to make PostgreSQL more friendly and run smoothly from the start without frustration, rather than perfect tuning, I think max_prepared_transactions = max_connections is an easy and good item. If the goal is limited to auto-tuning memory sizes, this improvement can be treated separately. Regards MauMau
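Plugging the defaults into the documented formula shows the shared-memory cost directly (the formula itself comes from the kernel-resources page cited above; the function name is just for illustration):

```python
def prepared_xact_memory_bytes(max_prepared_transactions=100,
                               max_locks_per_transaction=64):
    """Shared memory used for prepared transactions, per the documented
    formula: (770 + 270 * max_locks_per_transaction) * max_prepared_transactions."""
    return (770 + 270 * max_locks_per_transaction) * max_prepared_transactions

# defaults under discussion: 100 slots, 64 locks each
print(prepared_xact_memory_bytes())  # 1805000 bytes, i.e. about 1.8MB
```

So the memory cost of max_prepared_transactions = max_connections really is small; Magnus's objection below is about other kinds of overhead.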
From: "Dimitri Fontaine" <dimitri@2ndQuadrant.fr> > "MauMau" <maumau307@gmail.com> writes: >> Although this is not directly related to memory, could you set >> max_prepared_transactions = max_connections at initdb time? People must > > You really need to have a transaction manager around when issuing > prepared transaction as failing to commit/rollback them will prevent > VACUUM and quickly lead you to an interesting situation. I understand this problem occurs only when the user configured the application server to use distributed transactions, the application server crashed between prepare and commit/rollback, and the user doesn't recover the application server. So only improper operation produces the problem. Just setting max_prepared_transactions to non-zero doesn't cause the problem. Is this correct? Regards MauMau
On Oct 12, 2013 2:13 AM, "MauMau" <maumau307@gmail.com> wrote:
>
> From: "Bruce Momjian" <bruce@momjian.us>
>>
>> On Thu, Oct 10, 2013 at 11:01:52PM +0900, MauMau wrote:
>>>
>>> Although this is not directly related to memory, could you set
>>> max_prepared_transactions = max_connections at initdb time? People
>>> must feel frustrated when they can't run applications on a Java or
>>> .NET application server and notice that they have to set
>>> max_prepared_transactions and restart PostgreSQL. This is far from
>>> friendly.
>>
>> I think the problem is that many users don't need prepared transactions
>> and therefore don't want the overhead. Is that still accurate?
>
> I'm not sure how many use XA features, but I saw the questions and answers a few times, IIRC. In the trouble situation, PostgreSQL outputs an intuitive message like "increase max_prepared_transactions", so many users might have been able to change the setting and solve the problem themselves without asking for help, feeling stress like "Why do I have to set this?" For example, max_prepared_transactions is called a "hideous creature" in the following page:
>
> https://community.jboss.org/wiki/InstallPostgreSQLOnFedora?_sscc=t

Anybody who follows that page is screwed anyway. I notice they recommend running regular VACUUM FULL across the whole database, so it's obvious they know nothing about postgresql. There's nothing we can do about what people write on random pages around the Internet.

> According to the page below, the amount of memory consumed for this is "(770 + 270 * max_locks_per_transaction) * max_prepared_transactions". With the default settings of max_connections = 100 and max_locks_per_transaction = 64, this is only about 1.8MB. So the overhead is negligible.

You are assuming memory is the only overhead. I don't think it is.

> If the goal is to make PostgreSQL more friendly and run smoothly from the start without frustration, rather than perfect tuning, I think max_prepared_transactions = max_connections is an easy and good item. If the goal is limited to auto-tuning memory sizes, this improvement can be treated separately.

Frankly, I think we'd help 1000 times more users if we enabled a few wal writers by default and jumped the wal level. Mainly so they could run a one-off base backup. That's used by orders of magnitude more users than XA.

/Magnus
On Oct 11, 2013 10:23 PM, "Josh Berkus" <josh@agliodbs.com> wrote:
>
> On 10/11/2013 01:11 PM, Bruce Momjian wrote:
> > In summary, I think we need to:
> >
> > * decide on new defaults for work_mem and maintenance_work_mem
> > * add an initdb flag to allow users/packagers to set shared_buffers?
> > * add an autovacuum_work_mem setting?
> > * change the default for temp_buffers?
>
> If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit
> could also use a bump; those thresholds were set for servers with < 1GB
> of RAM

Uh, those are there to limit io and not memory, right? More memory isn't the reason to increase them, more io is. For people deploying on modern server hardware then yes it's often low, but for all those deploying in virtualized environments with io performance reminding you of the 1990s, I'm not so sure it is...

/Magnus
"MauMau" <maumau307@gmail.com> writes: > I understand this problem occurs only when the user configured the > application server to use distributed transactions, the application server > crashed between prepare and commit/rollback, and the user doesn't recover > the application server. So only improper operation produces the problem. The reason that parameter's default changed from 5 to 0 is that some people would mistakenly use a prepared transaction without a transaction manager. So few people actually use a transaction manager that it's better to have those who do set up PostgreSQL for it explicitly. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Magnus Hagander <magnus@hagander.net> writes: > Frankly, I think we'd help 1000 times more users if we enabled a few wal > writers by default and jumped the wal level. Mainly so they could run a > one-off base backup. That's used by orders of magnitude more users than XA. +1, or += default max_wal_senders actually ;-) My vote would be to have it default to at least 3, so that you can run both a pg_basebackup -Xs (stream) and a standby or a pg_receivexlog in parallel. Maybe 5 is an even better default. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 2013-10-12 09:04:55 +0200, Magnus Hagander wrote: > Frankly, I think we'd help 1000 times more users of we enabled a few wal > writers by default and jumped the wal level. Mainly so they could run one > off base backup. That's used by orders of magnitude more users than XA. Yes, I've thought about that several times as well. I think it might actually not be too hard to allow increasing the WAL level dynamically similar to what we do with full_page_writes. If we then would allow starting wal senders with a lower wal_level we would get there without breaking older installations which rely on TRUNCATE && COPY optimizations et al. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Oct 12, 2013 at 3:07 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Oct 11, 2013 10:23 PM, "Josh Berkus" <josh@agliodbs.com> wrote: >> On 10/11/2013 01:11 PM, Bruce Momjian wrote: >> > In summary, I think we need to: >> > >> > * decide on new defaults for work_mem and maintenance_work_mem >> > * add an initdb flag to allow users/packagers to set shared_bufffers? >> > * add an autovacuum_work_mem setting? >> > * change the default for temp_buffers? >> >> If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit >> could also use a bump; those thresholds were set for servers with < 1GB >> of RAM > > Uh, those are there to limit io and not memory, right? More memory isn't the > reason to increase them, more io is. For people deploying on modern server > hardware then yes it's often low, but for all those deploying in virtualized > environments with io performance reminding you of the 1990ies, I'm not so > sure it is... bgwriter_lru_maxpages is clearly related to the size of shared_buffers, although confusingly it is expressed as a number of buffers, while shared_buffers is expressed as a quantity of memory. I think we might have done better to call the GUC bgwriter_lru_maxpercent and make it a percentage of shared buffers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
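To illustrate the unit mismatch Robert describes: with the standard 8kB block size, 128MB of shared_buffers is 16384 buffers, so a bgwriter_lru_maxpages of 100 covers well under 1% of shared buffers per round. This is a sketch applying those numbers, not code from the server.

```python
def lru_maxpages_percent(bgwriter_lru_maxpages, shared_buffers_kb, block_kb=8):
    """What fraction of shared_buffers one bgwriter round may write,
    expressed as a percentage (the hypothetical bgwriter_lru_maxpercent)."""
    n_buffers = shared_buffers_kb // block_kb
    return 100.0 * bgwriter_lru_maxpages / n_buffers

# 100 pages out of the 16384 buffers in 128MB of shared_buffers
print(round(lru_maxpages_percent(100, 128 * 1024), 2))  # ~0.61 percent
```

Expressed this way, the same setting would scale automatically as shared_buffers grows, which is exactly the property the buffers-count form lacks.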
On 10/14/13 8:18 AM, Robert Haas wrote: > On Sat, Oct 12, 2013 at 3:07 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Oct 11, 2013 10:23 PM, "Josh Berkus" <josh@agliodbs.com> wrote: >>> On 10/11/2013 01:11 PM, Bruce Momjian wrote: >>>> In summary, I think we need to: >>>> >>>> * decide on new defaults for work_mem and maintenance_work_mem >>>> * add an initdb flag to allow users/packagers to set shared_buffers? >>>> * add an autovacuum_work_mem setting? >>>> * change the default for temp_buffers? >>> >>> If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit >>> could also use a bump; those thresholds were set for servers with < 1GB >>> of RAM >> >> Uh, those are there to limit io and not memory, right? More memory isn't the >> reason to increase them, more io is. For people deploying on modern server >> hardware then yes it's often low, but for all those deploying in virtualized >> environments with io performance reminding you of the 1990s, I'm not so >> sure it is... > > bgwriter_lru_maxpages is clearly related to the size of > shared_buffers, although confusingly it is expressed as a number of > buffers, while shared_buffers is expressed as a quantity of memory. I > think we might have done better to call the GUC > bgwriter_lru_maxpercent and make it a percentage of shared buffers. > Also, more memory generally means more filesystem cache, which means you can do more vacuum work per round. FWIW, on our 512G servers...

cnuapp_prod@postgres11.obr=# select name, setting from pg_settings where name ~ 'vacuum_cost';
             name             | setting
------------------------------+---------
 autovacuum_vacuum_cost_delay | 10
 autovacuum_vacuum_cost_limit | -1
 vacuum_cost_delay            | 10
 vacuum_cost_limit            | 2000
 vacuum_cost_page_dirty       | 10
 vacuum_cost_page_hit         | 1
 vacuum_cost_page_miss        | 10
(7 rows)

The page_hit cost is intentionally the same as the page_dirty limit because writes to the SAN are generally far cheaper than reads that actually hit spindles. Of course, with the amount of FS cache we have (512G - 8G shared buffers at most), reads are often very likely to hit the FS cache, but tuning of these settings while watching IO stats has shown these settings to be minimally disruptive. -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
From: "Magnus Hagander" <magnus@hagander.net> > On Oct 12, 2013 2:13 AM, "MauMau" <maumau307@gmail.com> wrote: >> I'm not sure if many use XA features, but I saw the questions and answer > a few times, IIRC. In the trouble situation, PostgreSQL outputs an > intuitive message like "increase max_prepared_transactions", so many users > might possibly have been able to change the setting and solve the problem > themselves without asking for help, feeling stress like "Why do I have to > set this?" For example, max_prepared_transactions is called "hideous > creature" in the following page: >> >> https://community.jboss.org/wiki/InstallPostgreSQLOnFedora?_sscc=t > > Anybody who follows that page is screwed anyway. I notice they recommend > running regular VACUUM FULL across the whole database, so it's obvious > they > know nothing about postgresql. There's nothing we can do about what people > write on random pages around the Internet. Regular VACUUM FULL is certainly overkill. Apart from that, having to set max_prepared_transactions seems to make PostgreSQL difficult for people with that level of knowledge, doesn't it? I wonder if there are other major DBMSs which require parameter configuration and a server restart to use distributed transactions. > >> According to the below page, the amount of memory consumed for this is > "(770 + 270 * max_locks_per_transaction) * max_prepared_transactions". > With the default setting of max_connections=100 and > max_locks_per_transaction=64, this is only about 1.8MB. So the overhead is > negligible. > > You are assuming memory is the only overhead. I don't think it is. Having a quick look at the source code, just setting max_prepared_transactions to non-zero seems to produce almost no processing overhead. >> If the goal is to make PostgreSQL friendly and run smoothly from the start > without frustration, rather than perfect tuning, I think > max_prepared_transactions=max_connections is an easy and good item.
If > the > goal is limited to auto-tuning memory sizes, this improvement can be > treated separately. > > Frankly, I think we'd help 1000 times more users if we enabled a few wal > writers by default and jumped the wal level. Mainly so they could run > one-off base backups. That's used by orders of magnitude more users than XA. Agreed. The default of non-zero max_wal_senders and wal_level > 'archive' would be beneficial for more users. Likewise, non-zero max_prepared_transactions would improve the impression of PostgreSQL (for a limited number of users, though), and it wouldn't do any harm. Regards MauMau
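The shared-memory figure quoted above can be checked with a quick calculation. This is just the formula from the message, not an authoritative account of the server's actual allocations:

```python
# Worked check of the shared-memory estimate quoted above:
#   (770 + 270 * max_locks_per_transaction) * max_prepared_transactions
# The formula is taken from the message; real server overhead may differ.

def prepared_xact_shmem_bytes(max_prepared_transactions, max_locks_per_transaction=64):
    return (770 + 270 * max_locks_per_transaction) * max_prepared_transactions

# With max_prepared_transactions = max_connections = 100:
bytes_needed = prepared_xact_shmem_bytes(100)
print(f"{bytes_needed} bytes = {bytes_needed / 1024 / 1024:.1f} MB")  # ~1.8 MB
```

Either way, the memory cost is small next to a default shared_buffers of 128MB, which is the point being made.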
On 2013-10-15 21:41:18 +0900, MauMau wrote: > Likewise, non-zero max_prepared_transactions would improve the > impression of PostgreSQL (for a limited number of users, though), and it > wouldn't do any harm. I've seen several sites shut down because of forgotten prepared transactions causing bloat and anti-wraparound shutdowns. A big, big -1 for changing that default. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
From: "Dimitri Fontaine" <dimitri@2ndQuadrant.fr> > The reason why that parameter default has changed from 5 to 0 is that > some people would mistakenly use a prepared transaction without a > transaction manager. So few people are actually using a transaction > manager that it's better to have them have to configure PostgreSQL. I guess this problem is not unique to PostgreSQL. I think PostgreSQL can be more friendly to normal users (those who use an external transaction manager), and does not need to be so conservative because of people who do irregular things. Regards MauMau
On Tue, Oct 15, 2013 at 2:47 PM, MauMau <maumau307@gmail.com> wrote: > From: "Dimitri Fontaine" <dimitri@2ndQuadrant.fr> > >> The reason why that parameter default has changed from 5 to 0 is that >> some people would mistakenly use a prepared transaction without a >> transaction manager. So few people are actually using a transaction >> manager that it's better to have them have to configure PostgreSQL. > > I guess this problem is not unique to PostgreSQL. I think PostgreSQL can be > more friendly to normal users (those who use an external transaction manager), > and does not need to be so conservative because of people who do irregular > things. I would say *using* an external transaction manager *is* the irregular thing. The current default *is* friendly for normal users; for example, see the comments from Andres about what happens if you make a mistake. So I definitely agree with your sentiment that we should be more friendly for normal users - but in this case we are. If I look through all the customers I've worked with, only a handful have actually used a transaction manager. And of those, at least half of them were using it even though they didn't need it, because they didn't know what it was. But the argument about being friendly for new users should definitely have us change wal_level and max_wal_senders. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 10/15/2013 05:52 AM, Magnus Hagander wrote: > But the argument about being friendly for new users should definitely > have us change wal_level and max_wal_senders. +1 for having replication supported out-of-the-box aside from pg_hba.conf. To put it another way: users are more likely to care about replication than they are about IO overhead on a non-replicated server. And the users who do care about IO overhead are more likely to muck about in pg.conf *anyway* in order to set a slew of performance-tuning settings. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 2013-10-15 10:19:06 -0700, Josh Berkus wrote: > On 10/15/2013 05:52 AM, Magnus Hagander wrote: > > But the argument about being friendly for new users should definitely > > have us change wal_level and max_wal_senders. > > +1 for having replication supported out-of-the-box aside from pg_hba.conf. > > To put it another way: users are more likely to care about replication > than they are about IO overhead on a non-replicated server. And for the > users who care about IO overhead, they are more likely to muck about in > pg.conf *anyway* in order to set a slew of performance-tuning settings. But it will hurt people restoring backups using pg_restore -j. I think people might be rather disappointed if that slows down by a factor of three. I think we really need to get to the point where we increase the wal level on demand... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 15, 2013 at 7:26 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-10-15 10:19:06 -0700, Josh Berkus wrote: >> On 10/15/2013 05:52 AM, Magnus Hagander wrote: >> > But the argument about being friendly for new users should definitely >> > have us change wal_level and max_wal_senders. >> >> +1 for having replication supported out-of-the-box aside from pg_hba.conf. >> >> To put it another way: users are more likely to care about replication >> than they are about IO overhead on a non-replicated server. And for the >> users who care about IO overhead, they are more likely to muck about in >> pg.conf *anyway* in order to set a slew of performance-tuning settings. > > But it will hurt people restoring backups using pg_restore -j. I think > people might be rather disappointed if that slows down by a factor of > three. > > I think we really need to get to the point where we increase the wal > level on demand... Yeah, there are really two things. If we can increase wal_level on demand, that would solve one of them. Turning that into a SIGHUP parameter would be great. I have no idea how hard it would be. In theory, couldn't we let it be sighup and then just have do_pg_start_backup() block until all backends have acknowledged that they are on the new WAL level somehow? (Yes, I realize this might be a big simplification, but I'm allowed to hope, no?) The other problem is max_wal_senders. I think that's a much smaller problem - setting that one to 5 or so by default shouldn't have a big impact. But without the wal_level changes, it would also be mostly pointless... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 2013-10-15 19:29:50 +0200, Magnus Hagander wrote: > On Tue, Oct 15, 2013 at 7:26 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2013-10-15 10:19:06 -0700, Josh Berkus wrote: > >> On 10/15/2013 05:52 AM, Magnus Hagander wrote: > >> > But the argument about being friendly for new users should definitely > >> > have us change wal_level and max_wal_senders. > >> > >> +1 for having replication supported out-of-the-box aside from pg_hba.conf. > >> > >> To put it another way: users are more likely to care about replication > >> than they are about IO overhead on a non-replicated server. And for the > >> users who care about IO overhead, they are more likely to much about in > >> pg.conf *anyway* in order to set a slew of performance-tuning settings. > > > > But it will hurt people restoring backups using pg_restore -j. I think > > people might be rather dissapointed if that slows down by a factor of > > three. > > > > I think we really need to get to the point where we increase the wal > > level ondemand... > > Yeha, there are really two things. > > If we can increase wal_level on demand, that would solve one of them. > Turning that into a SIGHUP parameter would be great. I have no idea > how hard it would be. In theory, couldn't we let it be sighup and then > just have do_pg_start_backup() block until all backends have > acknowledged that they are on the new WAL level somehow? (Yes, I > realize this might be a big simplification, but I'm allowed to hope, > no?) Depends on what you want to support. For basebackups, that should be doable with some pullups. It's unfortunately more complex than that for streaming rep - we really need persistent standby registration there. Otherwise the wal_level will fall back to minimal when the standby disconnects which will obviously break the standby. > The other problem is max_wal_senders. I think that's a much smaller > problem - setting that one to 5 or so by default shouldn't have a big > impact. 
But without the wal_level changes, it would also be mostly > pointless... Well, you currently cannot even set it when wal_level isn't set appropriately, but that should be easy enough to change. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 15, 2013 at 7:32 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-10-15 19:29:50 +0200, Magnus Hagander wrote: >> On Tue, Oct 15, 2013 at 7:26 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> > On 2013-10-15 10:19:06 -0700, Josh Berkus wrote: >> >> On 10/15/2013 05:52 AM, Magnus Hagander wrote: >> >> > But the argument about being friendly for new users should definitely >> >> > have us change wal_level and max_wal_senders. >> >> >> >> +1 for having replication supported out-of-the-box aside from pg_hba.conf. >> >> >> >> To put it another way: users are more likely to care about replication >> >> than they are about IO overhead on a non-replicated server. And for the >> >> users who care about IO overhead, they are more likely to much about in >> >> pg.conf *anyway* in order to set a slew of performance-tuning settings. >> > >> > But it will hurt people restoring backups using pg_restore -j. I think >> > people might be rather dissapointed if that slows down by a factor of >> > three. >> > >> > I think we really need to get to the point where we increase the wal >> > level ondemand... >> >> Yeha, there are really two things. >> >> If we can increase wal_level on demand, that would solve one of them. >> Turning that into a SIGHUP parameter would be great. I have no idea >> how hard it would be. In theory, couldn't we let it be sighup and then >> just have do_pg_start_backup() block until all backends have >> acknowledged that they are on the new WAL level somehow? (Yes, I >> realize this might be a big simplification, but I'm allowed to hope, >> no?) > > Depends on what you want to support. For basebackups, that should be > doable with some pullups. > It's unfortunately more complex than that for streaming rep - we really > need persistent standby registration there. Otherwise the wal_level will > fall back to minimal when the standby disconnects which will obviously > break the standby. 
I was actually thinking the easier step might not be to do it dynamically as the standby registers - just allow it to be a SIGHUP parameter. So you'd still change it in postgresql.conf, but it would be ok with a reload rather than a restart. Yes, fully dynamic would be better, so if we could combine those two, that would make us "require nothing for pg_basebackup, and just a reload for replication slaves". The point being that we wouldn't need a *restart* at any point - and that alone would be a big improvement. >> The other problem is max_wal_senders. I think that's a much smaller >> problem - setting that one to 5 or so by default shouldn't have a big >> impact. But without the wal_level changes, it would also be mostly >> pointless... > > Well, you currently cannot even set it when wal_level isn't set > appropriately, but that should be easy enough to change. Yes, it would be a trivial change to allow that parameter to be set and then just give an error if you try to initiate streaming in that case. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
From: "Andres Freund" <andres@2ndquadrant.com> > I've seen several sites shutting down because of forgotten prepared > transactions causing bloat and anti-wraparound shutdowns. From: "Magnus Hagander" <magnus@hagander.net> > I would say *using* an external transaction manager *is* the irregular > thing. The current default *is* friendly for normal users, for example > see the comments from Andres about what happens if you make a mistake. > So I definitely agree with your sentiment that we should be more > friendly for normal users - but in this case we are. > > If I look through all the customers I've worked with, only a handful > have actually used a transaction manager. And of those, at least half > of them were using it even though they didn't need it, because they > didn't know what it was. I understand that you mean by *irregular* that there are few people who use distributed transactions. I guess so too: there are not many users who require distributed transactions in the real world. What I meant by *irregular* is that almost all users should use distributed transactions through an external transaction manager, such as those incorporated in Java EE application servers or MSDTC. Distributed transaction features like XA and the Java Transaction API (JTA) are well established. They are not irregular for those who need them; they were developed long ago and have persisted because they are needed. I don't think the default value of zero for max_prepared_transactions is friendly for normal or not-normal users. Normal users, who properly use an external transaction manager, won't be caught by the trouble Andres mentioned, because the external transaction manager soon resolves prepared (in-doubt) transactions. Not-normal users, who use the PREPARE TRANSACTION statement or XAResource.prepare() directly from their applications, without an external transaction manager or without real need based on proper understanding, won't escape Andres's concern.
They will see the following message and follow it blindly to make their applications succeed.

ERROR:  prepared transactions are disabled
HINT:  Set max_prepared_transactions to a nonzero value.

So, the current default of zero is not only unfriendly for normal users but also unhelpful for those who make mistakes. Regards MauMau
On 10/09/2013 11:06 AM, Andrew Dunstan wrote: > > > > The assumption that each connection won't use lots of work_mem is also > false, I think, especially in these days of connection poolers. > > Andres has just been politely pointing out to me that my knowledge of memory allocators is a little out of date (i.e. by a decade or two), and that this memory is not in fact likely to be held for a long time, at least on most modern systems. That completely undermines my reasoning above. Given that, it probably makes sense for us to be rather more liberal in setting work_mem than I was suggesting. cheers andrew
On Wed, Oct 16, 2013 at 04:25:37PM -0400, Andrew Dunstan wrote: > > On 10/09/2013 11:06 AM, Andrew Dunstan wrote: > > > > > > > >The assumption that each connection won't use lots of work_mem is > >also false, I think, especially in these days of connection > >poolers. > > > > > > > Andres has just been politely pointing out to me that my knowledge > of memory allocators is a little out of date (i.e. by a decade or > two), and that this memory is not in fact likely to be held for a > long time, at least on most modern systems. That undermines > completely my reasoning above. > > Given that, it probably makes sense for us to be rather more liberal > in setting work_mem that I was suggesting. Ah, yes, this came up last year (MMAP_THRESHOLD): http://www.postgresql.org/message-id/20120730161416.GB10877@momjian.us -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Wed, Oct 16, 2013 at 5:30 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Wed, Oct 16, 2013 at 04:25:37PM -0400, Andrew Dunstan wrote: >> >> On 10/09/2013 11:06 AM, Andrew Dunstan wrote: >> > >> > >> > >> >The assumption that each connection won't use lots of work_mem is >> >also false, I think, especially in these days of connection >> >poolers. >> > >> > >> >> >> Andres has just been politely pointing out to me that my knowledge >> of memory allocators is a little out of date (i.e. by a decade or >> two), and that this memory is not in fact likely to be held for a >> long time, at least on most modern systems. That undermines >> completely my reasoning above. >> >> Given that, it probably makes sense for us to be rather more liberal >> in setting work_mem that I was suggesting. > > Ah, yes, this came up last year (MMAP_THRESHOLD): > > http://www.postgresql.org/message-id/20120730161416.GB10877@momjian.us Beware of depending on that threshold. It varies wildly among platforms. I've seen implementations with the threshold well above 64MB.
On 10/16/2013 01:25 PM, Andrew Dunstan wrote: > Andres has just been politely pointing out to me that my knowledge of > memory allocators is a little out of date (i.e. by a decade or two), and > that this memory is not in fact likely to be held for a long time, at > least on most modern systems. That undermines completely my reasoning > above. Except that Opensolaris and FreeBSD still have the old memory allocation behavior, as do older Linux kernels, many of which will remain in production for years. I have no idea what Windows' memory management behavior is. So this is a case of needing to know considerably more than the available RAM to determine a good setting. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Oct 16, 2013 at 5:14 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 10/16/2013 01:25 PM, Andrew Dunstan wrote: >> Andres has just been politely pointing out to me that my knowledge of >> memory allocators is a little out of date (i.e. by a decade or two), and >> that this memory is not in fact likely to be held for a long time, at >> least on most modern systems. That undermines completely my reasoning >> above. > > Except that Opensolaris and FreeBSD still have the old memory allocation > behavior, as do older Linux kernels, many of which will remain in > production for years. I have no idea what Windows' memory management > behavior is. > > So this is a case of needing to know considerably more than the > available RAM to determine a good setting. I agree, but I still think my previous proposal of increasing the defaults for work_mem and maintenance_work_mem by 4X would serve many more people well than it would serve poorly. I haven't heard anyone disagree with that notion. Does anyone disagree? Should we do it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote: > I still think my previous proposal of increasing the defaults for > work_mem and maintenance_work_mem by 4X would serve many more > people well than it would serve poorly. I haven't heard anyone > disagree with that notion. Does anyone disagree? Should we do > it? I think that it makes sense to do that. Those are still reasonable defaults for a machine with 2GB of RAM, maybe even with less. We're talking about putting this only in a release that will come out in 2014. How many machines used for a database server that new will have less than that? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/17/2013 08:55 AM, Kevin Grittner wrote: > > Robert Haas <robertmhaas@gmail.com> wrote: > >> I still think my previous proposal of increasing the defaults for >> work_mem and maintenance_work_mem by 4X would serve many more >> people well than it would serve poorly. I haven't heard anyone >> disagree with that notion. Does anyone disagree? Should we do >> it? > > I think that it makes sense to do that. Those are still reasonable > defaults for a machine with 2GB of RAM, maybe even with less. > We're talking about putting this only in a release that will come > out in 2014. How many machines used for a database server that new > will have less than that? A lot. A whole lot, more than what most people have in production with more than that. You are forgetting a very large segment of the population who run... VMs. Why don't we just have 3 default config files: 2GB memory 4GB memory 8GB memory Have initdb detect how much memory is available on the machine in TOTAL and pick the most appropriate. Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
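A minimal sketch of what the suggested initdb-time selection could look like: a hypothetical helper is told the machine's total RAM in MB and returns one of the canned profiles. The profile names and thresholds here are illustrative only, not values anyone has proposed in detail:

```python
# Hypothetical sketch of JD's suggestion: initdb detects total system
# memory and picks one of a few canned config files. The file names and
# thresholds below are illustrative, not an actual proposal's values.

def pick_config_profile(total_ram_mb):
    if total_ram_mb >= 8192:
        return "postgresql.conf.8GB"
    elif total_ram_mb >= 4096:
        return "postgresql.conf.4GB"
    else:
        return "postgresql.conf.2GB"

# Machines above 8GB would simply fall back to the 8GB profile, since
# (per the message below) larger workloads need individual attention anyway.
print(pick_config_profile(16384))
```

The detection step itself (reading /proc/meminfo, sysctl, etc.) is platform-specific and omitted here; initdb would have to handle that per OS.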
JD, > A lot. A whole lot, more than what most people have in production with > more than that. You are forgetting a very large segment of the > population who run... VMs. Actually, even a "mini" AWS instance has 1GB of RAM. And nobody who uses a "micro" is going to expect it to perform well under load. I think it's completely reasonable to tell people running on < 1GB of ram to tune PostgreSQL "down". 4MB work_mem / 64MB maint_work_mem still works fine at 1GB. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Oct 17, 2013 at 12:03 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > On 10/17/2013 08:55 AM, Kevin Grittner wrote: >> Robert Haas <robertmhaas@gmail.com> wrote: >> >>> I still think my previous proposal of increasing the defaults for >>> work_mem and maintenance_work_mem by 4X would serve many more >>> people well than it would serve poorly. I haven't heard anyone >>> disagree with that notion. Does anyone disagree? Should we do >>> it? >> >> >> I think that it makes sense to do that. Those are still reasonable >> defaults for a machine with 2GB of RAM, maybe even with less. >> We're talking about putting this only in a release that will come >> out in 2014. How many machines used for a database server that new >> will have less than that? > > A lot. A whole lot, more than what most people have in production with more > than that. You are forgetting a very large segment of the population who > run... VMs. That's true, but are you actually arguing for keeping work_mem at 1MB? Even on a VM with only 1GB of RAM, work_mem=4MB is not going to cause any problems unless you're also trying to service a large number of simultaneous connections. And if you're doing that, you probably need to rethink something anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
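Robert's point can be made concrete with a rough worst-case estimate. Note that work_mem is a per-operation limit, so a single complex query can consume several multiples of it (one per sort or hash node); this simple product is therefore a lower bound on the true worst case:

```python
# Rough worst-case estimate of total work_mem pressure: every connection
# running a memory-hungry operation at once. Since work_mem is per sort/hash
# node, not per connection, a complex query can use several multiples of it,
# so this is a lower bound on the worst case.

def worst_case_work_mem_mb(connections, work_mem_mb, nodes_per_query=1):
    return connections * work_mem_mb * nodes_per_query

# 100 connections at the proposed 4MB default: 400MB if all of them sort
# simultaneously -- hence Robert's caveat about servicing a large number of
# simultaneous connections on a 1GB VM.
print(worst_case_work_mem_mb(100, 4))
```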
On 10/17/2013 09:49 AM, Robert Haas wrote: >> A lot. A whole lot, more than what most people have in production with more >> than that. You are forgetting a very large segment of the population who >> run... VMs. > > That's true, but are you actually arguing for keeping work_mem at 1MB? > > Even on a VM with only 1GB of RAM, work_mem=4MB is not going to cause > any problems unless you're also trying to service a large number of > simultaneous connections. And if you're doing that, you probably need > to rethink something anyway. No. I am arguing for the multiple config file option. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On Thu, Oct 17, 2013 at 9:03 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
A lot. A whole lot, more than what most people have in production with more than that. You are forgetting a very large segment of the population who run... VMs.
On 10/17/2013 08:55 AM, Kevin Grittner wrote:
Robert Haas <robertmhaas@gmail.com> wrote: I still think my previous proposal of increasing the defaults for
work_mem and maintenance_work_mem by 4X would serve many more
people well than it would serve poorly. I haven't heard anyone
disagree with that notion. Does anyone disagree? Should we do
it?
I think that it makes sense to do that. Those are still reasonable
defaults for a machine with 2GB of RAM, maybe even with less.
We're talking about putting this only in a release that will come
out in 2014. How many machines used for a database server that new
will have less than that?
Why don't we just have 3 default config files:
2GB memory
4GB memory
8GB memory
But what would go in each of those files? Once we agree on what would be in them, why not just have a continuous knob that does that same thing?
Would your suggestion for the 2GB file have work_mem be at least 4 MB?
Cheers,
Jeff
On 10/17/2013 10:33 AM, Jeff Janes wrote: > A lot. A whole lot, more than what most people have in production > with more than that. You are forgetting a very large segment of the > population who run... VMs. > > Why don't we just have 3 default config files: > > 2GB memory > 4GB memory > 8GB memory > > > But what would go in each of those files? Once we agree on what would > be in them, why not just have a continuous knob that does that same thing? Because we should set defaults, not optimized parameters. Workloads vary, and we can reasonably say "this is what we want BY DEFAULT" for something, but we cannot reasonably say "this is what will suit your needs". Once you get above 8GB of memory you are dealing with workloads that vary widely and will almost always need some kind of individual attention. However, at 8GB and below, we can set reasonable defaults that allow a user to likely, though not certainly, avoid changing the conf. Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On Thu, Oct 17, 2013 at 7:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Oct 16, 2013 at 5:14 PM, Josh Berkus <josh@agliodbs.com> wrote: >> On 10/16/2013 01:25 PM, Andrew Dunstan wrote: >>> Andres has just been politely pointing out to me that my knowledge of >>> memory allocators is a little out of date (i.e. by a decade or two), and >>> that this memory is not in fact likely to be held for a long time, at >>> least on most modern systems. That undermines completely my reasoning >>> above. >> >> Except that Opensolaris and FreeBSD still have the old memory allocation >> behavior, as do older Linux kernels, many of which will remain in >> production for years. I have no idea what Windows' memory management >> behavior is. >> >> So this is a case of needing to know considerably more than the >> available RAM to determine a good setting. > > I agree, but I still think my previous proposal of increasing the > defaults for work_mem and maintenance_work_mem by 4X would serve many > more people well than it would serve poorly. I haven't heard anyone > disagree with that notion. Does anyone disagree? Should we do it? One source of hesitation for me is that I have a hunch that the stock assumptions that go into shared_buffers will probably change, possibly by a lot. For a lot of (I think very good) reasons, current policy is to set s_b to a max of 2GB. After we nail some of the outstanding contention issues and make other optimizations, I expect that the optimal setting may be 50% of RAM or higher. Point being: I like the idea of an initdb-time setting that takes in the estimated amount of RAM that is going to be dedicated to the database. From there, we then estimate best settings for the various configurations (perhaps taking in extra hypothetical parameters to help that job along). So the anchor should not be s_b, but user-specified. merlin
All, So, I did an informal survey last night at SFPUG, among about 30 PostgreSQL DBAs and developers. While hardly a scientific sample, it's a data point on what we're looking at for servers. Out of the 30, 6 had one or more production instances of PostgreSQL running on machines or VMs with less than 1GB of RAM. Out of those, 5 had already edited their postgresql.conf extensively. Perhaps more importantly, for four out of the 6, the low-memory Postgres instance(s) was an older version (8.2 to 9.0) which they did not expect to upgrade. Also, note that a couple of the 6 were consultants, so they were speaking for dozens of customer servers. As a second data point, Christophe and I did a quick survey of the database of server information on our clients, which include a bunch of cloud-hosted web companies. We found two PostgreSQL VMs which did not have 1GB or more RAM, out of a few hundred. Now, obviously, there's some significant sample bias in the above, but I think it gives support to the assertion that we shouldn't really be worrying about PostgreSQL running well out-of-the-box on machines with < 1GB of RAM. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Fri, Oct 11, 2013 at 03:39:51PM -0700, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: > > On 10/11/2013 01:11 PM, Bruce Momjian wrote: > >> In summary, I think we need to: > >> > >> * decide on new defaults for work_mem and maintenance_work_mem > >> * add an initdb flag to allow users/packagers to set shared_buffers? > >> * add an autovacuum_work_mem setting? > >> * change the default for temp_buffers? > > > > If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit > > could also use a bump; those thresholds were set for servers with < 1GB > > of RAM. > > +1 on those. > > Also, I have often had to bump cpu_tuple_cost into the 0.03 to 0.05 > range to get a good plan. In general, this makes the exact > settings of *_page_cost less fussy, and I have hit situations where > I was completely unable to get a good plan to emerge without > bumping cpu_tuple_cost relative to the other cpu costs. I know that > it's possible to engineer a workload that shows any particular cost > adjustment to make things worse, but in real-life production > environments I have never seen an increase in this range make plan > choice worse. So, would anyone like me to create patches for any of these items before we hit 9.4 beta? We have added autovacuum_work_mem, and increasing work_mem and maintenance_work_mem by 4x is a simple operation. Not sure about the others. Or do we just keep this all for 9.5? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Feb 13, 2014 at 3:34 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Fri, Oct 11, 2013 at 03:39:51PM -0700, Kevin Grittner wrote: >> Josh Berkus <josh@agliodbs.com> wrote: >> > On 10/11/2013 01:11 PM, Bruce Momjian wrote: >> >> In summary, I think we need to: >> >> >> >> * decide on new defaults for work_mem and maintenance_work_mem >> >> * add an initdb flag to allow users/packagers to set shared_buffers? >> >> * add an autovacuum_work_mem setting? >> >> * change the default for temp_buffers? >> > >> > If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit >> > could also use a bump; those thresholds were set for servers with < 1GB >> > of RAM. >> >> +1 on those. >> >> Also, I have often had to bump cpu_tuple_cost into the 0.03 to 0.05 >> range to get a good plan. In general, this makes the exact >> settings of *_page_cost less fussy, and I have hit situations where >> I was completely unable to get a good plan to emerge without >> bumping cpu_tuple_cost relative to the other cpu costs. I know that >> it's possible to engineer a workload that shows any particular cost >> adjustment to make things worse, but in real-life production >> environments I have never seen an increase in this range make plan >> choice worse. > > So, would anyone like me to create patches for any of these items before > we hit 9.4 beta? We have added autovacuum_work_mem, and increasing > work_mem and maintenance_work_mem by 4x is a simple operation. Not sure > about the others. Or do we just keep this all for 9.5? I don't think anyone objected to increasing the defaults for work_mem and maintenance_work_mem by 4x, and a number of people were in favor, so I think we should go ahead and do that. If you'd like to do the honors, by all means! The current bgwriter_lru_maxpages value limits the background writer to a maximum of 4MB/s. If one imagines shared_buffers = 8GB, that starts to seem rather low, but I don't have a good feeling for what a better value would be.
The current vacuum cost delay settings limit autovacuum to about 2.6MB/s. I am inclined to think we need a rather large bump there, like 10x, but maybe it would be more prudent to do a smaller bump, like say 4x, to avoid changing the default behavior too dramatically between releases. IOW, I guess I'm proposing raising vacuum_cost_limit from 200 to 800. I don't really know about cpu_tuple_cost. Kevin's often advocated raising it, but I haven't heard anyone else advocate for that. I think we need data points from more people to know whether or not that's a good idea in general. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
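Robert's two throughput ceilings can be reproduced from the stock GUC defaults. A back-of-the-envelope sketch, assuming 8kB pages, bgwriter_delay = 200ms, bgwriter_lru_maxpages = 100, autovacuum_vacuum_cost_delay = 20ms, an effective cost limit of 200 (autovacuum_vacuum_cost_limit = -1 falls back to vacuum_cost_limit), and vacuum pages that are both cache misses (cost 10) and dirtied (cost 20):

```python
# Back-of-the-envelope check of the throughput ceilings quoted above,
# using the default GUC values (8kB pages assumed throughout).

PAGE_KB = 8

# Background writer: at most bgwriter_lru_maxpages pages per round,
# one round every bgwriter_delay milliseconds.
bgwriter_delay_ms = 200
bgwriter_lru_maxpages = 100
rounds_per_sec = 1000 / bgwriter_delay_ms
bgwriter_mb_s = bgwriter_lru_maxpages * rounds_per_sec * PAGE_KB / 1024
print(f"bgwriter ceiling: {bgwriter_mb_s:.1f} MB/s")    # ~4MB/s

# Autovacuum: vacuum_cost_limit credits per autovacuum_vacuum_cost_delay.
# Worst-ish case per page: a cache miss (10) that is also dirtied (20).
cost_delay_ms = 20
cost_limit = 200
cost_per_page = 10 + 20
pages_per_sec = cost_limit * (1000 / cost_delay_ms) / cost_per_page
vacuum_mb_s = pages_per_sec * PAGE_KB / 1024
print(f"autovacuum ceiling: {vacuum_mb_s:.1f} MB/s")    # ~2.6MB/s
```

Raising vacuum_cost_limit from 200 to 800, as proposed, scales the second figure linearly to roughly 10MB/s.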
On 02/16/2014 09:26 PM, Robert Haas wrote: > I don't really know about cpu_tuple_cost. Kevin's often advocated > raising it, but I haven't heard anyone else advocate for that. I > think we need data points from more people to know whether or not > that's a good idea in general. In 10 years of tuning PostgreSQL professionally, I still don't have a mathematical model for the interaction of the various *_cost parameters with the speeds of CPU, RAM and IO. If someone else has one, please post it so that we can make some intelligent decisions on defaults. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 17/02/14 15:26, Robert Haas wrote: > On Thu, Feb 13, 2014 at 3:34 PM, Bruce Momjian <bruce@momjian.us> wrote: >> On Fri, Oct 11, 2013 at 03:39:51PM -0700, Kevin Grittner wrote: >>> Josh Berkus <josh@agliodbs.com> wrote: >>>> On 10/11/2013 01:11 PM, Bruce Momjian wrote: >>>>> In summary, I think we need to: >>>>> >>>>> * decide on new defaults for work_mem and maintenance_work_mem >>>>> * add an initdb flag to allow users/packagers to set shared_buffers? >>>>> * add an autovacuum_work_mem setting? >>>>> * change the default for temp_buffers? >>>> If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit >>>> could also use a bump; those thresholds were set for servers with < 1GB >>>> of RAM. >>> +1 on those. >>> >>> Also, I have often had to bump cpu_tuple_cost into the 0.03 to 0.05 >>> range to get a good plan. In general, this makes the exact >>> settings of *_page_cost less fussy, and I have hit situations where >>> I was completely unable to get a good plan to emerge without >>> bumping cpu_tuple_cost relative to the other cpu costs. I know that >>> it's possible to engineer a workload that shows any particular cost >>> adjustment to make things worse, but in real-life production >>> environments I have never seen an increase in this range make plan >>> choice worse. >> So, would anyone like me to create patches for any of these items before >> we hit 9.4 beta? We have added autovacuum_work_mem, and increasing >> work_mem and maintenance_work_mem by 4x is a simple operation. Not sure >> about the others. Or do we just keep this all for 9.5? > I don't think anyone objected to increasing the defaults for work_mem > and maintenance_work_mem by 4x, and a number of people were in favor, > so I think we should go ahead and do that. If you'd like to do the > honors, by all means! > > The current bgwriter_lru_maxpages value limits the background writer > to a maximum of 4MB/s.
If one imagines shared_buffers = 8GB, that > starts to seem rather low, but I don't have a good feeling for what a > better value would be. > > The current vacuum cost delay settings limit autovacuum to about > 2.6MB/s. I am inclined to think we need a rather large bump there, > like 10x, but maybe it would be more prudent to do a smaller bump, > like say 4x, to avoid changing the default behavior too dramatically > between releases. IOW, I guess I'm proposing raising > vacuum_cost_limit from 200 to 800. > > I don't really know about cpu_tuple_cost. Kevin's often advocated > raising it, but I haven't heard anyone else advocate for that. I > think we need data points from more people to know whether or not > that's a good idea in general. > Processors have been getting faster, relative to spinning rust, over the years. So it puzzles me why anybody would want to raise the cpu_tuple_cost! Possibly, the various costs should change if the database is on SSDs? Of course, I have the implicit assumption that cost factors like 'cpu_tuple_cost' have more than just a vague relation to the semantics implied by their naming! It would be good if we can get some clarity on what these various cost factors are actually meant to do and how they relate to each other. Cheers, Gavin
Gavin Flower <GavinFlower@archidevsys.co.nz> writes: > On 17/02/14 15:26, Robert Haas wrote: >> I don't really know about cpu_tuple_cost. Kevin's often advocated >> raising it, but I haven't heard anyone else advocate for that. I >> think we need data points from more people to know whether or not >> that's a good idea in general. > Processors have been getting faster, relative to spinning rust, over the > years. So it puzzles me why anybody would want to raise the > cpu_tuple_cost! The case where this is sensible is where your database mostly fits in RAM, so that the cost of touching the underlying spinning rust isn't so relevant. The default cost settings are certainly not very good for such scenarios. regards, tom lane
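For the mostly-in-RAM case Tom describes, a common field adjustment is to pull the page costs down toward the CPU costs. An illustrative postgresql.conf sketch (the specific values are conventional field practice, not settings proposed in this thread):

```ini
# Illustrative settings for a database that mostly fits in RAM
seq_page_cost = 1.0
random_page_cost = 1.1      # cached random reads cost nearly the same as sequential
cpu_tuple_cost = 0.03       # the kind of bump Kevin describes
effective_cache_size = 6GB  # roughly the RAM available for caching
```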
On Sun, Feb 16, 2014 at 09:26:47PM -0500, Robert Haas wrote: > > So, would anyone like me to create patches for any of these items before > > we hit 9.4 beta? We have added autovacuum_work_mem, and increasing > > work_mem and maintenance_work_mem by 4x is a simple operation. Not sure > > about the others. Or do we just keep this all for 9.5? > > I don't think anyone objected to increasing the defaults for work_mem > and maintenance_work_mem by 4x, and a number of people were in favor, > so I think we should go ahead and do that. If you'd like to do the > honors, by all means! OK, patch attached. > The current bgwriter_lru_maxpages value limits the background writer > to a maximum of 4MB/s. If one imagines shared_buffers = 8GB, that > starts to seem rather low, but I don't have a good feeling for what a > better value would be. > > The current vacuum cost delay settings limit autovacuum to about > 2.6MB/s. I am inclined to think we need a rather large bump there, > like 10x, but maybe it would be more prudent to do a smaller bump, > like say 4x, to avoid changing the default behavior too dramatically > between releases. IOW, I guess I'm proposing raising > vacuum_cost_limit from 200 to 800. > > I don't really know about cpu_tuple_cost. Kevin's often advocated > raising it, but I haven't heard anyone else advocate for that. I > think we need data points from more people to know whether or not > that's a good idea in general. Robert, can you take the lead on these remaining possible changes? We don't have time for any controversial changes but things everyone can agree on, like work_mem, should be implemented for 9.4. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
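Assuming the attached patch does nothing more than quadruple the two GUC defaults, as agreed above, the net effect on the stock postgresql.conf sample would be:

```ini
# Stock defaults before the patch:
#work_mem = 1MB
#maintenance_work_mem = 16MB

# Defaults after the 4x increase:
#work_mem = 4MB
#maintenance_work_mem = 64MB
```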
On 2014-02-16 21:26:47 -0500, Robert Haas wrote: > I don't think anyone objected to increasing the defaults for work_mem > and maintenance_work_mem by 4x, and a number of people were in favor, > so I think we should go ahead and do that. If you'd like to do the > honors, by all means! Actually, I object to increasing work_mem by default. In my experience most of the untuned servers are backing some kind of web application and often run with far too many connections. Increasing work_mem for those is dangerous. > I don't really know about cpu_tuple_cost. Kevin's often advocated > raising it, but I haven't heard anyone else advocate for that. I > think we need data points from more people to know whether or not > that's a good idea in general. FWIW It's a good idea in my experience. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
* Andres Freund (andres@2ndquadrant.com) wrote: > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: > > I don't think anyone objected to increasing the defaults for work_mem > > and maintenance_work_mem by 4x, and a number of people were in favor, > > so I think we should go ahead and do that. If you'd like to do the > > honors, by all means! > > Actually, I object to increasing work_mem by default. In my experience > most of the untuned servers are backing some kind of web application and > often run with far too many connections. Increasing work_mem for those > is dangerous. And I still disagree with this- even in those cases. Those same untuned servers are running dirt-simple queries 90% of the time and they won't use any more memory from this, while the 10% of the queries which are more complicated will greatly improve. > > I don't really know about cpu_tuple_cost. Kevin's often advocated > > raising it, but I haven't heard anyone else advocate for that. I > > think we need data points from more people to know whether or not > > that's a good idea in general. > > FWIW It's a good idea in my experience. I'm in favor of this also but I'm also in the camp of "gee, more data would be nice". Thanks, Stephen
On 2014-02-17 11:31:56 -0500, Stephen Frost wrote: > * Andres Freund (andres@2ndquadrant.com) wrote: > > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: > > > I don't think anyone objected to increasing the defaults for work_mem > > > and maintenance_work_mem by 4x, and a number of people were in favor, > > > so I think we should go ahead and do that. If you'd like to do the > > > honors, by all means! > > > > Actually, I object to increasing work_mem by default. In my experience > > most of the untuned servers are backing some kind of web application and > > often run with far too many connections. Increasing work_mem for those > > is dangerous. > > And I still disagree with this- even in those cases. Those same untuned > servers are running dirt-simple queries 90% of the time and they won't > use any more memory from this, while the 10% of the queries which are > more complicated will greatly improve. Uh. Paging. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 17, 2014 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: >> I don't think anyone objected to increasing the defaults for work_mem >> and maintenance_work_mem by 4x, and a number of people were in favor, >> so I think we should go ahead and do that. If you'd like to do the >> honors, by all means! > > Actually, I object to increasing work_mem by default. In my experience > most of the untuned servers are backing some kind of web application and > often run with far too many connections. Increasing work_mem for those > is dangerous. I think you may be out-voted. An awful lot of people have voiced support for the idea of raising this value, and there is no rule that our default should be the smallest value that anyone will ever find useful. We do tend to err on the side of conservatism and aim for a relatively low-end machine, and I agree with that policy, but there is such a thing as going overboard. With the proposed defaults, a user with one sort or hash in every session, each of which uses the entirety of work_mem, is on the hook for 400MB. If you're trying to handle 100 connections on a machine that does not have 400MB of working memory available, you are probably in for a bad time of it. Now, if you're saying that people raise max_connections to say 1000 *and do nothing else* perhaps that makes the argument more plausible. But I don't think it makes it very much more plausible. Even a high-end system is likely to deliver terrible performance if the user has 1000 simultaneously-active connections; one with only a few GB of memory is going to be crushed like a bug. I'll note that in 9.3, we quadrupled the default size of shared_buffers when we got out from under the POSIX shared memory limits and AFAIK we've had zero complaints about that. 
It is entirely possible, even likely, that there is a machine out there somewhere for which the old value of 32MB is preferable, and those people can configure a smaller value. But that's not typical. And neither do I believe that the typical PostgreSQL user wants a 2MB sort to spill to disk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
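Robert's worst-case figure follows directly from the proposed default, one full work_mem allocation per session:

```python
# Worst case described above: every session holds one sort or hash
# using the entire (proposed) work_mem at once.
work_mem_mb = 4            # proposed new default, 4x the old 1MB
max_connections = 100      # stock default
worst_case_mb = work_mem_mb * max_connections
print(worst_case_mb)       # 400, i.e. "on the hook for 400MB"

# The 1000-connection scenario scales linearly:
print(work_mem_mb * 1000)  # 4000 MB
```

Note this is still an upper bound per sort/hash node; a query with several such nodes can use a multiple of work_mem, which is part of Andres's objection.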
On 2014-02-17 12:23:58 -0500, Robert Haas wrote: > On Mon, Feb 17, 2014 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: > >> I don't think anyone objected to increasing the defaults for work_mem > >> and maintenance_work_mem by 4x, and a number of people were in favor, > >> so I think we should go ahead and do that. If you'd like to do the > >> honors, by all means! > > > > Actually, I object to increasing work_mem by default. In my experience > > most of the untuned servers are backing some kind of web application and > > often run with far too many connections. Increasing work_mem for those > > is dangerous. > > I think you may be out-voted. I realize that, but I didn't want to let the "I don't think anyone objected" stand :) > With the proposed defaults, a user with one sort or hash in every > session, each of which uses the entirety of work_mem, is on the hook > for 400MB. If you're trying to handle 100 connections on a machine > that does not have 400MB of working memory available, you are probably > in for a bad time of it. Sure, if that's all they do it's fine. But often enough queries aren't that simple. Lots of the ORMs commonly used for web applications tend to create lots of JOINs to gather all the data and also use sorting for paging. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-17 12:23:58 -0500, Robert Haas wrote: >> I think you may be out-voted. > I realize that, but I didn't want to let the "I don't think anyone > objected" stand :) FWIW, I think we need to be pretty gradual about this sort of thing, because push-back from the field is the only way to know if we've gone too far for average users. I'm OK with raising work_mem 4X in one go, but I'd complain if it were 10X, or if we were also raising other resource consumption limits in the same release. regards, tom lane
On Mon, Feb 17, 2014 at 11:33 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-02-17 11:31:56 -0500, Stephen Frost wrote: >> * Andres Freund (andres@2ndquadrant.com) wrote: >> > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: >> > > I don't think anyone objected to increasing the defaults for work_mem >> > > and maintenance_work_mem by 4x, and a number of people were in favor, >> > > so I think we should go ahead and do that. If you'd like to do the >> > > honors, by all means! >> > >> > Actually, I object to increasing work_mem by default. In my experience >> > most of the untuned servers are backing some kind of web application and >> > often run with far too many connections. Increasing work_mem for those >> > is dangerous. >> >> And I still disagree with this- even in those cases. Those same untuned >> servers are running dirt-simple queries 90% of the time and they won't >> use any more memory from this, while the 10% of the queries which are >> more complicated will greatly improve. > > Uh. Paging. What about it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-02-17 13:33:17 -0500, Robert Haas wrote: > On Mon, Feb 17, 2014 at 11:33 AM, Andres Freund <andres@2ndquadrant.com> wrote: > >> And I still disagree with this- even in those cases. Those same untuned > >> servers are running dirt-simple queries 90% of the time and they won't > >> use any more memory from this, while the 10% of the queries which are > >> more complicated will greatly improve. > > > > Uh. Paging. > > What about it? It's often the source of a good portion of the queries and load in web applications. Multiple joins and more than one row... I have several times seen stats changes or bad to-be-sorted columns cause large amounts of memory to be used. Anyway, I've stated my opinion that I do not think it's a good idea to raise that particular default (while agreeing with all the others) and I know I am in the minority, so I don't think we need to argue this out... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 17, 2014 at 07:39:47PM +0100, Andres Freund wrote: > On 2014-02-17 13:33:17 -0500, Robert Haas wrote: > > On Mon, Feb 17, 2014 at 11:33 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > >> And I still disagree with this- even in those cases. Those same untuned > > >> servers are running dirt-simple queries 90% of the time and they won't > > >> use any more memory from this, while the 10% of the queries which are > > >> more complicated will greatly improve. > > > > > > Uh. Paging. > > > > What about it? > > It's often the source of a good portion of the queries and load in web > applications. Multiple joins and more than one row... I have several > times seen stats changes or bad to-be-sorted columns cause large amounts > of memory to be used. Perhaps we should have said there was "general agreement" to increase work_mem and maintenance_work_mem by 4x, not that there was 100% agreement. It would be nice to have 100% agreement, but if we _require_ that then defaults would probably never be changed. > Anyway, I've stated my opinion that I do not think it's a good idea to > raise that particular default (while agreeing with all the others) and I > know I am in the minority, so I don't think we need to argue this out... OK, good. If you did feel there was need for more discussion, we would need to push this change to PG 9.5. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 18/02/14 03:48, Tom Lane wrote: > Gavin Flower <GavinFlower@archidevsys.co.nz> writes: >> On 17/02/14 15:26, Robert Haas wrote: >>> I don't really know about cpu_tuple_cost. Kevin's often advocated >>> raising it, but I haven't heard anyone else advocate for that. I >>> think we need data points from more people to know whether or not >>> that's a good idea in general. >> Processors have been getting faster, relative to spinning rust, over the >> years. So it puzzles me why anybody would want to raise the >> cpu_tuple_cost! > The case where this is sensible is where your database mostly fits in > RAM, so that the cost of touching the underlying spinning rust isn't > so relevant. The default cost settings are certainly not very good > for such scenarios. > > regards, tom lane Thanks. That is obvious... once you pointed it out! Cheers, Gavin
On Mon, Feb 17, 2014 at 8:31 AM, Stephen Frost <sfrost@snowman.net> wrote: >> Actually, I object to increasing work_mem by default. In my experience >> most of the untuned servers are backing some kind of web application and >> often run with far too many connections. Increasing work_mem for those >> is dangerous. > > And I still disagree with this- even in those cases. Those same untuned > servers are running dirt-simple queries 90% of the time and they won't > use any more memory from this, while the 10% of the queries which are > more complicated will greatly improve. +1 -- Peter Geoghegan
On Sun, Feb 16, 2014 at 6:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> The current bgwriter_lru_maxpages value limits the background writer
> to a maximum of 4MB/s. If one imagines shared_buffers = 8GB, that
> starts to seem rather low, but I don't have a good feeling for what a
> better value would be.
I don't quite understand the point of bgwriter_lru_maxpages in the first place. What is it supposed to protect us from?
I wonder if that isn't an artefact from when the checkpointer was the same process as the background writer, to prevent the background writer functionality from starving the checkpointer functionality.
Cheers,
Jeff
On Mon, Feb 17, 2014 at 11:14:33AM -0500, Bruce Momjian wrote: > On Sun, Feb 16, 2014 at 09:26:47PM -0500, Robert Haas wrote: > > > So, would anyone like me to create patches for any of these items before > > > we hit 9.4 beta? We have added autovacuum_work_mem, and increasing > > > work_mem and maintenance_work_mem by 4x is a simple operation. Not sure > > > about the others. Or do we just keep this all for 9.5? > > > > I don't think anyone objected to increasing the defaults for work_mem > > and maintenance_work_mem by 4x, and a number of people were in favor, > > so I think we should go ahead and do that. If you'd like to do the > > honors, by all means! > > OK, patch attached. Patch applied. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Mon, Feb 24, 2014 at 1:05 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Feb 17, 2014 at 11:14:33AM -0500, Bruce Momjian wrote: >> On Sun, Feb 16, 2014 at 09:26:47PM -0500, Robert Haas wrote: >> > > So, would anyone like me to create patches for any of these items before >> > > we hit 9.4 beta? We have added autovacuum_work_mem, and increasing >> > > work_mem and maintenance_work_mem by 4x is a simple operation. Not sure >> > > about the others. Or do we just keep this all for 9.5? >> > >> > I don't think anyone objected to increasing the defaults for work_mem >> > and maintenance_work_mem by 4x, and a number of people were in favor, >> > so I think we should go ahead and do that. If you'd like to do the >> > honors, by all means! >> >> OK, patch attached. > > Patch applied. Thanks! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/18/2014 12:19 AM, Andres Freund wrote: > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: >> I don't think anyone objected to increasing the defaults for work_mem >> and maintenance_work_mem by 4x, and a number of people were in favor, >> so I think we should go ahead and do that. If you'd like to do the >> honors, by all means! > > Actually, I object to increasing work_mem by default. In my experience > most of the untuned servers are backing some kind of web application and > often run with far too many connections. Increasing work_mem for those > is dangerous. Good point. Especially with pagination involved. Those OFFSET 40000 LIMIT 100 queries can be a killer. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-02-16 21:26:47 -0500, Robert Haas wrote: >> I don't really know about cpu_tuple_cost. Kevin's often >> advocated raising it, but I haven't heard anyone else advocate >> for that. I think we need data points from more people to know >> whether or not that's a good idea in general. > > FWIW It's a good idea in my experience. This is more about the balance among the various cpu_* costs than the balance between cpu_* costs and the *_page costs. I usually need to adjust the page costs, too; and given how heavily cached many machines are, I'm usually moving them down. But if you think about the work involved in moving to a new tuple, do you really think it's only twice the cost of moving to a new index entry on an index scan? Or only four times as expensive as executing an average operator function? In my experience setting cpu_tuple_cost higher tends to better model costs, and prevent CPU-sucking scans of large numbers of rows. I only have anecdotal evidence, though. I have seen it help dozens of times, and have yet to see it hurt. That said, most people on this list are probably capable of engineering a benchmark which will show whichever result they would prefer. I would prefer to hear about other data points based on field experience with production systems. I haven't offered the trivial patch because when I've raised the point before, there didn't seem to be anyone else who had the same experience. It's good to hear that Andres has seen this, too. FWIW, even though I'm repeating something I've mentioned before, whenever raising this setting did help, 0.03 was high enough to see the benefit. Several times I have also tried 0.05 just to test whether I was wandering near a tipping point for a bad choice from this. I have never had 0.05 produce plans noticeably better or worse than 0.03. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
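The shift Kevin describes can be illustrated with the planner's documented cost estimate for a sequential scan (cost = pages read * seq_page_cost + rows scanned * cpu_tuple_cost). The table size below is made up for illustration:

```python
# Sketch of how raising cpu_tuple_cost changes a sequential scan's
# estimated cost, using the documented formula:
#   cost = relpages * seq_page_cost + reltuples * cpu_tuple_cost
# The table figures (1M rows in 50k pages) are illustrative only.

def seq_scan_cost(relpages, reltuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    return relpages * seq_page_cost + reltuples * cpu_tuple_cost

relpages, reltuples = 50_000, 1_000_000

stock = seq_scan_cost(relpages, reltuples)                       # 60000.0
raised = seq_scan_cost(relpages, reltuples, cpu_tuple_cost=0.03) # 80000.0
print(stock, raised)
```

At 0.01 the per-tuple term is a sixth of the total cost; at 0.03 it is three eighths, so large scans look relatively more expensive and the planner is nudged toward index plans.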
On 03/10/2014 03:16 PM, Kevin Grittner wrote: > I only have anecdotal evidence, though. I have seen it help dozens > of times, and have yet to see it hurt. That said, most people on > this list are probably capable of engineering a benchmark which > will show whichever result they would prefer. I would prefer to > hear about other data points based on field experience with > production systems. I haven't offered the trivial patch because > when I've raised the point before, there didn't seem to be anyone > else who had the same experience. It's good to hear that Andres > has seen this, too. The problem with cpu_tuple_cost is that it's used in several places by the planner and makes it hard to model what the effect of any change would be. If we had a good general benchmark which actually gave the query planner a workout, we could come up with some reasonable default settings, but right now we can't. Back in 2004-2006 era, when CPU speeds had leapfrogged ahead of disk speeds (which were largely unchanged from 2000), I was routinely *lowering* cpu_tuple_cost (and cpu_index_tuple_cost) to get better plans. This was baked into early versions of Greenplum for that reason. So I'm not saying that we shouldn't change the default for cpu_tuple_cost. I'm saying that we currently don't have enough information on *when* and *how much* to change it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Mon, Mar 10, 2014 at 3:16 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>>> I don't really know about cpu_tuple_cost. Kevin's often
>>> advocated raising it, but I haven't heard anyone else advocate
>>> for that. I think we need data points from more people to know
>>> whether or not that's a good idea in general.
>>
>> FWIW It's a good idea in my experience.
>
> This is more about the balance among the various cpu_* costs than
> the balance between cpu_* costs and the *_page costs. I usually
> need to adjust the page costs, too; and given how heavily cached
> many machines are, I'm usually moving them down. But if you think
> about the work involved in moving to a new tuple, do you really
> think it's only twice the cost of moving to a new index entry on an
> index scan? Or only four times as expensive as executing an
> average operator function?
If the next tuple is already hinted and not compressed or toasted, I would completely believe that. In fact, unless the operator is integer or dp, I would say it is less than 2 times as expensive. If it is a text operator and the collation is not "C" or "POSIX", then moving to the next tuple is likely less expensive than a single operator evaluation.
If your tuples are updated nearly as often as queried, the hint resolution could be a big cost. But in that case, probably the contention would be a bigger issue than the pure CPU cost.
I don't know how compression and toast would affect the times. Are your tables heavily toasted?
If top down measurements and bottom up measurements aren't giving the same results, then what is going on? We know and document how caching needs to be baked into the page costs parameters. What unknown thing is throwing off the cpu costs?
> In my experience setting cpu_tuple_cost
> higher tends to better model costs, and prevent CPU-sucking scans
> of large numbers of rows.
>
> I only have anecdotal evidence, though. I have seen it help dozens
> of times, and have yet to see it hurt. That said, most people on
> this list are probably capable of engineering a benchmark which
> will show whichever result they would prefer. I would prefer to
> hear about other data points based on field experience with
> production systems. I haven't offered the trivial patch because
> when I've raised the point before, there didn't seem to be anyone
> else who had the same experience. It's good to hear that Andres
> has seen this, too.
>
> FWIW, even though I'm repeating something I've mentioned before,
> whenever raising this setting did help, 0.03 was high enough to see
> the benefit. Several times I have also tried 0.05 just to test
> whether I was wandering near a tipping point for a bad choice from
> this. I have never had 0.05 produce plans noticeably better or
> worse than 0.03.
Have you ever tried lowering the other two cpu cost parameters instead? That would be the more definitive indication that the benefit is not coming just by moving the io costs closer to the cpu costs
Cheers,
Jeff