Thread: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
Reference: http://postgresql.1045698.n5.nabble.com/Simple-join-doesn-t-use-index-td5738689.html

This is a pretty common gotcha: user sets shared_buffers but misses
the esoteric but important effective_cache_size. ISTM
effective_cache_size should always be >= shared_buffers -- this is a
soft configuration error that could be reported as a warning and
perhaps overridden on the fly.

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Claudio Freire
Date:
On Tue, Jan 8, 2013 at 11:39 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Reference: http://postgresql.1045698.n5.nabble.com/Simple-join-doesn-t-use-index-td5738689.html
>
> This is a pretty common gotcha: user sets shared_buffers but misses
> the esoteric but important effective_cache_size. ISTM
> effective_cache_size should always be >= shared_buffers -- this is a
> soft configuration error that could be reported as a warning and
> perhaps overridden on the fly.

Not true. If there are many concurrent users running concurrent
queries against parallel databases, such as some test systems I have
that contain many databases for many test environments, a setting
like that wouldn't make sense. If a DBA sets it lower than
shared_buffers, that setting has to be honored.

Rather, I'd propose the default setting should be "-1" or something
"default" and "automagic" that works most of the time (but not all).
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Tue, Jan 8, 2013 at 11:39 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Reference: http://postgresql.1045698.n5.nabble.com/Simple-join-doesn-t-use-index-td5738689.html
>>
>> This is a pretty common gotcha: user sets shared_buffers but misses
>> the esoteric but important effective_cache_size. ISTM
>> effective_cache_size should always be >= shared_buffers -- this is a
>> soft configuration error that could be reported as a warning and
>> perhaps overridden on the fly.
>
> Not true. If there are many concurrent users running concurrent
> queries against parallel databases, such as some test systems I have
> that contain many databases for many test environments, a setting
> like that wouldn't make sense. If a DBA sets it lower than
> shared_buffers, that setting has to be honored.
>
> Rather, I'd propose the default setting should be "-1" or something
> "default" and "automagic" that works most of the time (but not all).

+1. I've found that a value of three-quarters of system memory works
pretty well most of the time. Of course, there's not a single,
portable way of getting that on every platform we support. If I
remember my last investigation into this area, to use that particular
rule we would probably need at least three paths: one for Windows,
one for System V descendants, and one for BSD descendants. And there
might still be obscure things that wouldn't be covered. Of course
this also makes the admittedly unwarranted assumption that the
database owns the box, which could be wrong too, but purposely
lowballing effective_cache_size to discourage index-scan plans seems
unlikely to be a winning strategy.

A cruder heuristic that might be useful is 3 * shared_buffers. If
people follow the guidance to set shared_buffers to around 25% of RAM,
then this will work out to around 75% again. Of course, many people,
for valid reasons, use smaller values of shared_buffers than that, and
a few use larger ones. It might still be better than no auto-tuning,
though I wouldn't swear to it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
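For concreteness, here is a minimal sketch of what those three detection
paths could look like. The platform calls used (GlobalMemoryStatusEx,
sysctl with CTL_HW, sysconf with _SC_PHYS_PAGES) are real APIs, but the
#ifdef structure and function name are assumptions for illustration, not
anything from a patch; the BSDs in particular differ in which sysctl key
and integer width they expose.

    #include <stdint.h>

    #if defined(_WIN32)
    #include <windows.h>
    #elif defined(__APPLE__) || defined(__FreeBSD__)
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #else
    #include <unistd.h>
    #endif

    /* Return total physical RAM in bytes, or 0 if it can't be determined. */
    static uint64_t
    probe_physical_ram(void)
    {
    #if defined(_WIN32)
        /* Windows path */
        MEMORYSTATUSEX ms;

        ms.dwLength = sizeof(ms);
        if (GlobalMemoryStatusEx(&ms))
            return (uint64_t) ms.ullTotalPhys;
        return 0;
    #elif defined(__APPLE__) || defined(__FreeBSD__)
        /* BSD-descendant path: ask the kernel via sysctl */
        uint64_t    mem = 0;
        size_t      len = sizeof(mem);
    #if defined(__APPLE__)
        int         mib[2] = {CTL_HW, HW_MEMSIZE};
    #else
        int         mib[2] = {CTL_HW, HW_PHYSMEM};
    #endif

        if (sysctl(mib, 2, &mem, &len, NULL, 0) == 0)
            return mem;
        return 0;
    #else
        /* System V-descendant (and Linux) path: sysconf */
        long        pages = sysconf(_SC_PHYS_PAGES);
        long        pagesz = sysconf(_SC_PAGE_SIZE);

        if (pages > 0 && pagesz > 0)
            return (uint64_t) pages * (uint64_t) pagesz;
        return 0;
    #endif
    }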
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Tue, Jan 8, 2013 at 05:23:36PM -0500, Robert Haas wrote:
> > Rather, I'd propose the default setting should be "-1" or something
> > "default" and "automagic" that works most of the time (but not all).
>
> +1. I've found that a value of three-quarters of system memory works
> pretty well most of the time. Of course, there's not a single,
> portable way of getting that on every platform we support. If I
> remember my last investigation into this area, to use that particular
> rule we would probably need at least three paths: one for Windows,
> one for System V descendants, and one for BSD descendants. And there
> might still be obscure things that wouldn't be covered. Of course
> this also makes the admittedly unwarranted assumption that the
> database owns the box, which could be wrong too, but purposely
> lowballing effective_cache_size to discourage index-scan plans seems
> unlikely to be a winning strategy.
>
> A cruder heuristic that might be useful is 3 * shared_buffers. If
> people follow the guidance to set shared_buffers to around 25% of RAM,
> then this will work out to around 75% again. Of course, many people,
> for valid reasons, use smaller values of shared_buffers than that, and
> a few use larger ones. It might still be better than no auto-tuning,
> though I wouldn't swear to it.

Agreed. This is similar to the fudge we have about random_page_cost:

    Random access to mechanical disk storage is normally much more
    expensive than four times sequential access. However, a lower
    default is used (4.0) because the majority of random accesses to
    disk, such as indexed reads, are assumed to be in cache. The
    default value can be thought of as modeling random access as 40
    times slower than sequential, while expecting 90% of random reads
    to be cached.

effective_cache_size is impossible to set accurately because you have
no idea what other things might be in the cache, or what other
concurrent queries might be filling the cache. Going with something at
least partly reasonable makes a lot more sense. While we don't know
the size of RAM, we do know the size of shared_buffers, and keying on
that for a default seems like a no-brainer, rather than using 128MB.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
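Spelling out the arithmetic behind that doc passage (my reading of it,
not text from the docs): a cached random read is treated as costing
roughly nothing, so only the 10% of misses pay the 40x penalty, which
lands right on the 4.0 default:

    expected random cost = 0.90 * ~0 (cached) + 0.10 * 40 = 4.0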
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> Rather, I'd propose the default setting should be "-1" or something
>> "default" and "automagic" that works most of the time (but not all).

> A cruder heuristic that might be useful is 3 * shared_buffers.

Both parts of that work for me. It's certainly silly that the default
value of effective_cache_size is now equivalent to the default value
of shared_buffers. And I don't especially like the idea of trying to
make it depend directly on the box's physical RAM, for the same
practical reasons Robert mentioned.

It might be better to use 4 * shared_buffers, as that corresponds to the
multiple that's been the default since 8.2 or so (ie 128MB vs 32MB), and
3x just seems kinda oddball.

            regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>> Rather, I'd propose the default setting should be "-1" or something
>>> "default" and "automagic" that works most of the time (but not all).
>
>> A cruder heuristic that might be useful is 3 * shared_buffers.
>
> Both parts of that work for me. It's certainly silly that the default
> value of effective_cache_size is now equivalent to the default value
> of shared_buffers. And I don't especially like the idea of trying to
> make it depend directly on the box's physical RAM, for the same
> practical reasons Robert mentioned.

For the record, I don't believe those problems would be particularly
hard to solve.

> It might be better to use 4 * shared_buffers, as that corresponds to the
> multiple that's been the default since 8.2 or so (ie 128MB vs 32MB), and
> 3x just seems kinda oddball.

I suspect that would be OK, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... And I don't especially like the idea of trying to
>> make it depend directly on the box's physical RAM, for the same
>> practical reasons Robert mentioned.

> For the record, I don't believe those problems would be particularly
> hard to solve.

Well, the problem of "find out the box's physical RAM" is doubtless
solvable if we're willing to put enough sweat and tears into it, but
I'm dubious that it's worth the trouble. The harder part is how to know
if the box is supposed to be dedicated to the database. Bear in mind
that the starting point of this debate was the idea that we're talking
about an inexperienced DBA who doesn't know about any configuration knob
we might provide for the purpose.

I'd prefer to go with a default that's predictable and not totally
foolish --- and some multiple of shared_buffers seems like it'd fit the
bill.

            regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andrew Dunstan
Date:
On 01/08/2013 08:08 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ... And I don't especially like the idea of trying to
>>> make it depend directly on the box's physical RAM, for the same
>>> practical reasons Robert mentioned.
>> For the record, I don't believe those problems would be particularly
>> hard to solve.
> Well, the problem of "find out the box's physical RAM" is doubtless
> solvable if we're willing to put enough sweat and tears into it, but
> I'm dubious that it's worth the trouble. The harder part is how to know
> if the box is supposed to be dedicated to the database. Bear in mind
> that the starting point of this debate was the idea that we're talking
> about an inexperienced DBA who doesn't know about any configuration knob
> we might provide for the purpose.
>
> I'd prefer to go with a default that's predictable and not totally
> foolish --- and some multiple of shared_buffers seems like it'd fit the
> bill.

+1. That seems to be by far the biggest bang for the buck. Anything
else will surely involve a lot more code for not much more benefit.

cheers

andrew
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
All,

>> Well, the problem of "find out the box's physical RAM" is doubtless
>> solvable if we're willing to put enough sweat and tears into it, but
>> I'm dubious that it's worth the trouble. The harder part is how to know
>> if the box is supposed to be dedicated to the database. Bear in mind
>> that the starting point of this debate was the idea that we're talking
>> about an inexperienced DBA who doesn't know about any configuration knob
>> we might provide for the purpose.

Frankly, you'd need to go through a whole decision tree to do this right:

- How much RAM do you have?
- Is that RAM shared with other services?
- Is this a DW or OLTP server?

... etc. We just don't want to get into that in the core code. However ...

>> I'd prefer to go with a default that's predictable and not totally
>> foolish --- and some multiple of shared_buffers seems like it'd fit the
>> bill.
>
> +1. That seems to be by far the biggest bang for the buck. Anything else
> will surely involve a lot more code for not much more benefit.

I don't think we're going far enough here. I think there should be an
optional setting in postgresql.conf called:

    available_ram

Then shared_buffers, wal_buffers, and effective_cache_size (and
possibly other future settings) can be set to -1. If they are set to
-1, then we use these figures:

    shared_buffers = available_ram * 0.25 (with a ceiling of 8GB)
    wal_buffers = available_ram * 0.05 (with a ceiling of 32MB)
    effective_cache_size = available_ram * 0.75 (with a floor of 128MB)

If they are set to an amount, then we use the amount they are set to.

It would be nice to also automatically set work_mem, maint_work_mem,
temp_buffers, etc. based on the above, but that would be considerably
more difficult and require performance testing we haven't done yet.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
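A sketch of how that derivation could look in code, plugging in Josh's
numbers. Everything here is hypothetical illustration, not a patch: the
function and parameter names are invented, and Min/Max are written out
inline rather than taken from PostgreSQL's c.h.

    #include <stdint.h>

    #define Min(x, y)  ((x) < (y) ? (x) : (y))
    #define Max(x, y)  ((x) > (y) ? (x) : (y))
    #define MB(n)      ((uint64_t) (n) * 1024 * 1024)
    #define GB(n)      (MB(n) * 1024)

    /* Hypothetical: derive the three settings (in bytes) from available_ram. */
    static void
    derive_defaults(uint64_t available_ram,
                    uint64_t *shared_buffers,
                    uint64_t *wal_buffers,
                    uint64_t *effective_cache_size)
    {
        *shared_buffers = Min(available_ram / 4, GB(8));        /* 25%, ceiling 8GB */
        *wal_buffers = Min(available_ram / 20, MB(32));         /* 5%, ceiling 32MB */
        *effective_cache_size =
            Max(available_ram * 3 / 4, MB(128));                /* 75%, floor 128MB */
    }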
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Benedikt Grundmann
Date:
On Wed, Jan 9, 2013 at 2:01 AM, Josh Berkus <josh@agliodbs.com> wrote:
All,
>> Well, the problem of "find out the box's physical RAM" is doubtless
>> solvable if we're willing to put enough sweat and tears into it, but
>> I'm dubious that it's worth the trouble. The harder part is how to know
>> if the box is supposed to be dedicated to the database. Bear in mind
>> that the starting point of this debate was the idea that we're talking
>> about an inexperienced DBA who doesn't know about any configuration knob
>> we might provide for the purpose.
For what it is worth, even if it is a dedicated database box, 75% might be way too high. I remember investigating bad performance on our biggest database server that in the end turned out to be a too-high setting of effective_cache_size. From reading the code back then, my rationale for it being too high was that the code that makes use of effective_cache_size tries very hard to account for what the current query would do to the cache, but doesn't take into account how many queries (on separate datasets!) are currently being executed (and competing for the same cache). On that box we often have 100+ active connections, many looking at different big datasets.
Cheers,
bene
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Well, the problem of "find out the box's physical RAM" is doubtless
> solvable if we're willing to put enough sweat and tears into it, but
> I'm dubious that it's worth the trouble. The harder part is how to know
> if the box is supposed to be dedicated to the database. Bear in mind
> that the starting point of this debate was the idea that we're talking
> about an inexperienced DBA who doesn't know about any configuration knob
> we might provide for the purpose.

It seems to me that pgfincore has the smarts we need to know about that,
and that Cédric has code and references for making it work on all
platforms we care about (linux, bsd, windows for starters).

Then we could have a pretty decent snapshot of the information, and
maybe a background worker of some sort would be able to refresh the
information while the server is running.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
Dimitri,

> It seems to me that pgfincore has the smarts we need to know about that,
> and that Cédric has code and references for making it work on all
> platforms we care about (linux, bsd, windows for starters).

Well, fincore is Linux-only, and for that matter only more recent
versions of Linux. It would be great if we could just detect the RAM,
but then we *still* have to ask the user if this is a dedicated box or
not.

> Then we could have a pretty decent snapshot of the information, and
> maybe a background worker of some sort would be able to refresh the
> information while the server is running.

I don't see a background worker as necessary or even desirable; people
don't frequently add RAM to their systems. A manually callable command
would be completely adequate.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Claudio Freire
Date:
On Wed, Jan 9, 2013 at 3:39 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> It seems to me that pgfincore has the smarts we need to know about that,
>> and that Cédric has code and references for making it work on all
>> platforms we care about (linux, bsd, windows for starters).
>
> Well, fincore is Linux-only, and for that matter only more recent
> versions of Linux. It would be great if we could just detect the RAM,
> but then we *still* have to ask the user if this is a dedicated box or
> not.

Not really. I'm convinced, and not only for e_c_s, that
autoconfiguration is within the realm of possibility. However, since
such a statement holds little weight without a patch attached, I've
been keeping it to myself (until I can invest the effort needed for
that patch).

In any case, as eavesdroppers can infer a cryptographic key by timing
operations or measuring power consumption, I'm pretty sure postgres
can infer cost metrics and/or time sharing with clever
instrumentation. The trick lies in making such instrumentation
unintrusive.
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
Claudio,

> Not really. I'm convinced, and not only for e_c_s, that
> autoconfiguration is within the realm of possibility.

Hey, if you can do it, my hat's off to you.

> In any case, as eavesdroppers can infer a cryptographic key by timing
> operations or measuring power consumption, I'm pretty sure postgres
> can infer cost metrics and/or time sharing with clever
> instrumentation. The trick lies in making such instrumentation
> unintrusive.

... and not requiring a great deal of code maintenance for each and
every release of Linux and Windows.

Anyway, we could do something for 9.3 if we just make "available RAM"
a manual setting. Asking the user "how much RAM is available for
Postgres" is not a terribly difficult question.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Wed, Jan 9, 2013 at 12:38 AM, Benedikt Grundmann <bgrundmann@janestreet.com> wrote:
> For what it is worth, even if it is a dedicated database box, 75% might
> be way too high. I remember investigating bad performance on our biggest
> database server that in the end turned out to be a too-high setting of
> effective_cache_size. From reading the code back then, my rationale for
> it being too high was that the code that makes use of
> effective_cache_size tries very hard to account for what the current
> query would do to the cache, but doesn't take into account how many
> queries (on separate datasets!) are currently being executed (and
> competing for the same cache). On that box we often have 100+ active
> connections, many looking at different big datasets.

I think that most busy installations either run a lot of small queries
(for which effective_cache_size is irrelevant), or a few large queries.
Your case is probably somewhat rare, and so as far as defaults go, it
would be sacrificed for the common good. The docs do anticipate that
multiple concurrent queries need to be discounted when deciding how to
set effective_cache_size, but perhaps the wording could be improved.

Out of curiosity, what did your queries look like after you lowered
effective_cache_size? Were there a lot of sequential scans, or did it
just choose different indexes than it had before? If a lot of
sequential scans, were they mostly on just a few tables that each had
many sequential scans going on simultaneously, or was it 100+ different
tables each with one sequential scan going on? (You said different big
datasets, but I don't know if these are in different tables, or in
common tables with a column to distinguish them.)

Cheers,

Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Tue, Jan 8, 2013 at 08:40:44PM -0500, Andrew Dunstan wrote:
>
> On 01/08/2013 08:08 PM, Tom Lane wrote:
> >Robert Haas <robertmhaas@gmail.com> writes:
> >>On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>... And I don't especially like the idea of trying to
> >>>make it depend directly on the box's physical RAM, for the same
> >>>practical reasons Robert mentioned.
> >>For the record, I don't believe those problems would be particularly
> >>hard to solve.
> >Well, the problem of "find out the box's physical RAM" is doubtless
> >solvable if we're willing to put enough sweat and tears into it, but
> >I'm dubious that it's worth the trouble. The harder part is how to know
> >if the box is supposed to be dedicated to the database. Bear in mind
> >that the starting point of this debate was the idea that we're talking
> >about an inexperienced DBA who doesn't know about any configuration knob
> >we might provide for the purpose.
> >
> >I'd prefer to go with a default that's predictable and not totally
> >foolish --- and some multiple of shared_buffers seems like it'd fit the
> >bill.
>
> +1. That seems to be by far the biggest bang for the buck. Anything
> else will surely involve a lot more code for not much more benefit.

I have developed the attached patch which implements an auto-tuned
effective_cache_size which is 4x the size of shared_buffers. I had to
set effective_cache_size to its old 128MB default so the EXPLAIN
regression tests would pass unchanged.

I considered a new available_ram variable but that just gives us
another variable, and in a way shared_buffers is a fixed amount, while
effective_cache_size is an estimate, so I thought driving everything
from shared_buffers made sense.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Attachment
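The attachment isn't reproduced here, so as a stand-in, here is a
minimal sketch of the sort of GUC check hook such a patch could use.
The hook signature matches the real GucIntCheckHook declaration in
guc.h, and NBuffers is the real backend global holding shared_buffers
in 8kB pages (effective_cache_size is likewise reckoned in pages, so
the units line up), but the body is an assumption about the mechanism,
not the actual patch:

    #include "postgres.h"
    #include "utils/guc.h"

    bool
    check_effective_cache_size(int *newval, void **extra, GucSource source)
    {
        /* -1 means "auto-tune": substitute 4 x shared_buffers (assumed rule) */
        if (*newval == -1)
            *newval = NBuffers * 4;
        return true;
    }

One caveat with resolving -1 in a check hook is that the substituted
value wouldn't track a later change to shared_buffers; presumably the
real patch resolves it at a point where shared_buffers is already
final, much as wal_buffers resolves its own -1 at server start.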
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Magnus Hagander
Date:
On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Jan 8, 2013 at 08:40:44PM -0500, Andrew Dunstan wrote:
>>
>> On 01/08/2013 08:08 PM, Tom Lane wrote:
>> >Robert Haas <robertmhaas@gmail.com> writes:
>> >>On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >>>... And I don't especially like the idea of trying to
>> >>>make it depend directly on the box's physical RAM, for the same
>> >>>practical reasons Robert mentioned.
>> >>For the record, I don't believe those problems would be particularly
>> >>hard to solve.
>> >Well, the problem of "find out the box's physical RAM" is doubtless
>> >solvable if we're willing to put enough sweat and tears into it, but
>> >I'm dubious that it's worth the trouble. The harder part is how to know
>> >if the box is supposed to be dedicated to the database. Bear in mind
>> >that the starting point of this debate was the idea that we're talking
>> >about an inexperienced DBA who doesn't know about any configuration knob
>> >we might provide for the purpose.
>> >
>> >I'd prefer to go with a default that's predictable and not totally
>> >foolish --- and some multiple of shared_buffers seems like it'd fit the
>> >bill.
>>
>> +1. That seems to be by far the biggest bang for the buck. Anything
>> else will surely involve a lot more code for not much more benefit.
>
> I have developed the attached patch which implements an auto-tuned
> effective_cache_size which is 4x the size of shared_buffers. I had to
> set effective_cache_size to its old 128MB default so the EXPLAIN
> regression tests would pass unchanged.

That's not really autotuning though. ISTM that making the *default* 4
x shared_buffers might make perfect sense, but do we really need to
hijack the value of "-1" for that? That might be useful some time in
the future when we have actual autotuning that somehow inspects the
system and tunes it from there.

I also don't think it should be called autotuning when it's just a
"smarter default value".

I like the feature, though, just not the packaging.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> I have developed the attached patch which implements an auto-tuned
>> effective_cache_size which is 4x the size of shared_buffers. I had to
>> set effective_cache_size to its old 128MB default so the EXPLAIN
>> regression tests would pass unchanged.

> That's not really autotuning though. ISTM that making the *default* 4
> x shared_buffers might make perfect sense, but do we really need to
> hijack the value of "-1" for that? That might be useful some time in
> the future when we have actual autotuning that somehow inspects the
> system and tunes it from there.

Well, the real problem with this patch is that it documents what the
auto-tuning algorithm is; without that commitment, just saying "-1 means
autotune" might be fine.

Did you consider the alternative of just tweaking initdb to insert a
default for effective_cache_size that's 4x whatever it picks for
shared_buffers? That would probably be about 3 lines of code, and it
wouldn't nail down any particular server-side behavior.

            regards, tom lane
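Tom's "about 3 lines" would presumably sit next to the code initdb
already has for rewriting the shared_buffers line in the sample
postgresql.conf. A guess at its shape, for illustration only:
replace_token() is the real initdb.c helper for this kind of
substitution, but the token strings and surrounding variable names
here are assumptions, not the eventual change:

    /* Hypothetical initdb tweak: write 4 x the chosen shared_buffers. */
    snprintf(repltok, sizeof(repltok), "effective_cache_size = %dMB",
             (n_buffers * (BLCKSZ / 1024)) / 1024 * 4);
    conflines = replace_token(conflines,
                              "#effective_cache_size = 128MB", repltok);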
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, Sep 5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
> > I have developed the attached patch which implements an auto-tuned
> > effective_cache_size which is 4x the size of shared_buffers. I had to
> > set effective_cache_size to its old 128MB default so the EXPLAIN
> > regression tests would pass unchanged.
>
> That's not really autotuning though. ISTM that making the *default* 4
> x shared_buffers might make perfect sense, but do we really need to
> hijack the value of "-1" for that? That might be useful some time in
> the future when we have actual autotuning that somehow inspects the
> system and tunes it from there.
>
> I also don't think it should be called autotuning when it's just a
> "smarter default value".
>
> I like the feature, though, just not the packaging.

That "auto-tuning" text came from the wal_buffers documentation, which
does exactly this based on shared_buffers:

    The contents of the WAL buffers are written out to disk at every
    transaction commit, so extremely large values are unlikely to
    provide a significant benefit. However, setting this value to at
    least a few megabytes can improve write performance on a busy
--> server where many clients are committing at once. The auto-tuning
                                                        -----------
    selected by the default setting of -1 should give reasonable
    results in most cases.

I am fine with rewording and not using -1, but we should change the
wal_buffers default and documentation too then. I am not sure what
other value than -1 to use? 0? I figure if we ever get better
auto-tuning, we would just remove this functionality and make it
better.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
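Restating the rule behind that passage (paraphrasing the documented
wal_buffers = -1 behavior, not quoting the source; the variable names
are invented and Min/Max stand in for PostgreSQL's macros of the same
names):

    /* -1 picks 1/32 of shared_buffers, clamped to [64kB, one 16MB WAL segment] */
    wal_buffers_bytes = Min(Max(shared_buffers_bytes / 32, 64 * 1024),
                            16 * 1024 * 1024);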
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, Sep 5, 2013 at 12:48:54PM -0400, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >> I have developed the attached patch which implements an auto-tuned
> >> effective_cache_size which is 4x the size of shared_buffers. I had to
> >> set effective_cache_size to its old 128MB default so the EXPLAIN
> >> regression tests would pass unchanged.
>
> > That's not really autotuning though. ISTM that making the *default* 4
> > x shared_buffers might make perfect sense, but do we really need to
> > hijack the value of "-1" for that? That might be useful some time in
> > the future when we have actual autotuning that somehow inspects the
> > system and tunes it from there.
>
> Well, the real problem with this patch is that it documents what the
> auto-tuning algorithm is; without that commitment, just saying "-1 means
> autotune" might be fine.

OK, but I did this based on wal_buffers, which has a -1 default, calls
it auto-tuning, and explains how the default is computed.

> Did you consider the alternative of just tweaking initdb to insert a
> default for effective_cache_size that's 4x whatever it picks for
> shared_buffers? That would probably be about 3 lines of code, and it
> wouldn't nail down any particular server-side behavior.

The problem there is that many users are told to tune shared_buffers,
but don't touch effective_cache_size. Having initdb set the
effective_cache_size value would not help there. Again, this is all
based on the auto-tuning of wal_buffers.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/05/2013 02:16 PM, Bruce Momjian wrote:
>> Well, the real problem with this patch is that it documents what the
>> auto-tuning algorithm is; without that commitment, just saying "-1 means
>> autotune" might be fine.
>
> OK, but I did this based on wal_buffers, which has a -1 default, calls
> it auto-tuning, and explains how the default is computed.

I don't see a real problem with this. For users who have set their
shared_buffers correctly, effective_cache_size should also be correct.

> The problem there is that many users are told to tune shared_buffers,
> but don't touch effective_cache_size. Having initdb set the
> effective_cache_size value would not help there. Again, this is all
> based on the auto-tuning of wal_buffers.

Standard advice we've given in the past is 25% shared_buffers, 75%
effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
Maybe we're changing the conventional calculation, but I thought I'd
point that out.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Thu, Sep 5, 2013 at 5:11 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 09/05/2013 02:16 PM, Bruce Momjian wrote:
>>> Well, the real problem with this patch is that it documents what the
>>> auto-tuning algorithm is; without that commitment, just saying "-1 means
>>> autotune" might be fine.
>>
>> OK, but I did this based on wal_buffers, which has a -1 default, calls
>> it auto-tuning, and explains how the default is computed.
>
> I don't see a real problem with this. For users who have set their
> shared_buffers correctly, effective_cache_size should also be correct.

Agreed. I think -1 is the right setting for autotune as things stand
today. If we want something else, then we should change other settings
as well (like wal_buffers) and that is not in the scope of this patch.

>> The problem there is that many users are told to tune shared_buffers,
>> but don't touch effective_cache_size. Having initdb set the
>> effective_cache_size value would not help there. Again, this is all
>> based on the auto-tuning of wal_buffers.
>
> Standard advice we've given in the past is 25% shared_buffers, 75%
> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
> Maybe we're changing the conventional calculation, but I thought I'd
> point that out.

This was debated upthread.

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, Sep 5, 2013 at 03:11:53PM -0700, Josh Berkus wrote:
> On 09/05/2013 02:16 PM, Bruce Momjian wrote:
> >> Well, the real problem with this patch is that it documents what the
> >> auto-tuning algorithm is; without that commitment, just saying "-1 means
> >> autotune" might be fine.
> >
> > OK, but I did this based on wal_buffers, which has a -1 default, calls
> > it auto-tuning, and explains how the default is computed.
>
> I don't see a real problem with this. For users who have set their
> shared_buffers correctly, effective_cache_size should also be correct.
>
> > The problem there is that many users are told to tune shared_buffers,
> > but don't touch effective_cache_size. Having initdb set the
> > effective_cache_size value would not help there. Again, this is all
> > based on the auto-tuning of wal_buffers.
>
> Standard advice we've given in the past is 25% shared_buffers, 75%
> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
> Maybe we're changing the conventional calculation, but I thought I'd
> point that out.

Yes, I had wondered that myself, and 3x and 4x were thrown out as
options. There were more people who liked 4x, but one of the reasons
was that 3x sounded odd --- not sure what to make of that, but I went
with the most popular. I am fine with 3x, and I do think it logically
makes more sense, and is less likely to over-estimate than 4x.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/05/2013 03:30 PM, Merlin Moncure wrote:
>> Standard advice we've given in the past is 25% shared_buffers, 75%
>> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
>> Maybe we're changing the conventional calculation, but I thought I'd
>> point that out.
>
> This was debated upthread.

Actually, no, it wasn't. Tom threw out a suggestion that we use 4X for
historical reasons. That's all, there was no discussion.

So, my point stands: our historical advice has been to set EFS to 75% of
RAM. Maybe we're changing that advice, but if so, let's change it.
Otherwise 3X makes more sense.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Cédric Villemain
Date:
On Thursday, September 5, 2013 at 17:14:37, Bruce Momjian wrote:
> On Thu, Sep 5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
> > > I have developed the attached patch which implements an auto-tuned
> > > effective_cache_size which is 4x the size of shared_buffers. I had to
> > > set effective_cache_size to its old 128MB default so the EXPLAIN
> > > regression tests would pass unchanged.
> >
> > That's not really autotuning though. ISTM that making the *default* 4
> > x shared_buffers might make perfect sense, but do we really need to
> > hijack the value of "-1" for that? That might be useful some time in
> > the future when we have actual autotuning that somehow inspects the
> > system and tunes it from there.
> >
> > I also don't think it should be called autotuning when it's just a
> > "smarter default value".
> >
> > I like the feature, though, just not the packaging.
>
> That "auto-tuning" text came from the wal_buffers documentation, which
> does exactly this based on shared_buffers:
>
>     The contents of the WAL buffers are written out to disk at every
>     transaction commit, so extremely large values are unlikely to
>     provide a significant benefit. However, setting this value to at
>     least a few megabytes can improve write performance on a busy
> --> server where many clients are committing at once. The auto-tuning
>                                                         -----------
>     selected by the default setting of -1 should give reasonable
>     results in most cases.
>
> I am fine with rewording and not using -1, but we should change the
> wal_buffers default and documentation too then. I am not sure what
> other value than -1 to use? 0? I figure if we ever get better
> auto-tuning, we would just remove this functionality and make it
> better.

I'm fine with a -1 for auto-tune or intelligent default: it means (for
me) that you don't need to care about this parameter in most cases.

A negative impact of the simpler multiplier might be that if suddenly
someone reduces the shared_buffers size to fix some strange behavior,
then he at the same time needs to increase effective_cache_size
manually (which remains the sum of the caches on the system, at least
on one dedicated to PostgreSQL).

IMHO it is easy to know exactly how much of the memory is (or can be)
used for/by PostgreSQL; we can compute that and update
effective_cache_size at regular points in time. (Just an idea; I know
there are arguments against that too.)

Maybe the argument for a 4x multiplier instead of 3x is that the
effective_cache_size usage can be larger than required. It's not a big
trouble. With all the things around NUMA we maybe just need to revisit
that area (memory access cost non-linear, double/triple caching, ...).

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/ PostgreSQL : Support 24x7 - Développement, Expertise et Formation
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, Sep 5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
> On 09/05/2013 03:30 PM, Merlin Moncure wrote:
>
> >> Standard advice we've given in the past is 25% shared_buffers, 75%
> >> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
> >> Maybe we're changing the conventional calculation, but I thought I'd
> >> point that out.
> >
> > This was debated upthread.
>
> Actually, no, it wasn't. Tom threw out a suggestion that we use 4X for
> historical reasons. That's all, there was no discussion.
>
> So, my point stands: our historical advice has been to set EFS to 75% of
> RAM. Maybe we're changing that advice, but if so, let's change it.
> Otherwise 3X makes more sense.

So, what do we want the effective_cache_size default to be? 3x or 4x?
We clearly state:

    If you have a dedicated database server with 1GB or more of RAM,
    a reasonable starting value for shared_buffers is 25% of the
    memory in your system. There are some workloads where even

If we make the default 4x, that means that people using the above
suggestion would be setting their effective_cache_size to 100% of RAM?
If we go with 4x, which I believe was the majority opinion, what shall
we answer to someone who asks about this contradiction?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Mon, Sep 9, 2013 at 6:29 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Sep 5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
>> On 09/05/2013 03:30 PM, Merlin Moncure wrote:
>>
>> >> Standard advice we've given in the past is 25% shared_buffers, 75%
>> >> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
>> >> Maybe we're changing the conventional calculation, but I thought I'd
>> >> point that out.
>> >
>> > This was debated upthread.
>>
>> Actually, no, it wasn't. Tom threw out a suggestion that we use 4X for
>> historical reasons. That's all, there was no discussion.
>>
>> So, my point stands: our historical advice has been to set EFS to 75% of
>> RAM. Maybe we're changing that advice, but if so, let's change it.
>> Otherwise 3X makes more sense.
>
> So, what do we want the effective_cache_size default to be? 3x or 4x?
> We clearly state:
>
>     If you have a dedicated database server with 1GB or more of RAM,
>     a reasonable starting value for shared_buffers is 25% of the
>     memory in your system. There are some workloads where even
>
> If we make the default 4x, that means that people using the above
> suggestion would be setting their effective_cache_size to 100% of RAM?
> If we go with 4x, which I believe was the majority opinion, what shall
> we answer to someone who asks about this contradiction?

I vote for 3x. The past defaults had a different ratio, but we are
changing things to make them better, not to leave them the same. We
should change it to be consistent with the advice we have long given.
Sure, 3 is not a power of 2, but I usually root for the underdog.

Cheers,

Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Tue, Sep 10, 2013 at 11:39 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, Sep 9, 2013 at 6:29 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> On Thu, Sep 5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
>>> On 09/05/2013 03:30 PM, Merlin Moncure wrote:
>>>
>>> >> Standard advice we've given in the past is 25% shared_buffers, 75%
>>> >> effective_cache_size. Which would make EFS *3X* shared_buffers, not 4X.
>>> >> Maybe we're changing the conventional calculation, but I thought I'd
>>> >> point that out.
>>> >
>>> > This was debated upthread.
>>>
>>> Actually, no, it wasn't. Tom threw out a suggestion that we use 4X for
>>> historical reasons. That's all, there was no discussion.
>>>
>>> So, my point stands: our historical advice has been to set EFS to 75% of
>>> RAM. Maybe we're changing that advice, but if so, let's change it.
>>> Otherwise 3X makes more sense.
>>
>> So, what do we want the effective_cache_size default to be? 3x or 4x?
>> We clearly state:
>>
>>     If you have a dedicated database server with 1GB or more of RAM,
>>     a reasonable starting value for shared_buffers is 25% of the
>>     memory in your system. There are some workloads where even
>>
>> If we make the default 4x, that means that people using the above
>> suggestion would be setting their effective_cache_size to 100% of RAM?
>> If we go with 4x, which I believe was the majority opinion, what shall
>> we answer to someone who asks about this contradiction?
>
> I vote for 3x. The past defaults had a different ratio, but we are
> changing things to make them better, not to leave them the same. We
> should change it to be consistent with the advice we have long given.
> Sure, 3 is not a power of 2, but I usually root for the underdog.

I vote 4x on the basis that for this setting (unlike almost all the
other memory settings) the ramifications of setting it too high
generally aren't too bad. Also, the o/s and temporary memory usage as
a share of total physical memory has been declining over time
(meaning that if you have a 256gb memory server and follow the advice
to set to 64gb, your memory for caching is approximately 64gb).

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
Merlin,

> I vote 4x on the basis that for this setting (unlike almost all the
> other memory settings) the ramifications of setting it too high
> generally aren't too bad. Also, the o/s and temporary memory usage as
> a share of total physical memory has been declining over time

If we're doing that, then we should change our general advice on this
setting as well.

Another argument in favor: this is a default setting, and by default,
shared_buffers won't be 25% of RAM.

> (meaning that if you have a 256gb memory server and follow the advice
> to set to 64gb, your memory for caching is approximately 64gb).

Wait, what?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Tue, Sep 10, 2013 at 5:08 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Merlin,
>
>> I vote 4x on the basis that for this setting (unlike almost all the
>> other memory settings) the ramifications of setting it too high
>> generally aren't too bad. Also, the o/s and temporary memory usage as
>> a share of total physical memory has been declining over time
>
> If we're doing that, then we should change our general advice on this
> setting as well.
>
> Another argument in favor: this is a default setting, and by default,
> shared_buffers won't be 25% of RAM.
>
>> (meaning that if you have a 256gb memory server and follow the advice
>> to set to 64gb, your memory for caching is approximately 64gb).

oops, meant to say "approximately 256gb".

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
> Merlin,
>
> > I vote 4x on the basis that for this setting (unlike almost all the
> > other memory settings) the ramifications of setting it too high
> > generally aren't too bad. Also, the o/s and temporary memory usage as
> > a share of total physical memory has been declining over time
>
> If we're doing that, then we should change our general advice on this
> setting as well.

Uh, what general advice? I don't see 4x mentioned anywhere.

> Another argument in favor: this is a default setting, and by default,
> shared_buffers won't be 25% of RAM.

So, are you saying you like 4x now?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Wed, Sep 11, 2013 at 09:18:30AM -0400, Bruce Momjian wrote:
> On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
> > Merlin,
> >
> > > I vote 4x on the basis that for this setting (unlike almost all the
> > > other memory settings) the ramifications of setting it too high
> > > generally aren't too bad. Also, the o/s and temporary memory usage as
> > > a share of total physical memory has been declining over time
> >
> > If we're doing that, then we should change our general advice on this
> > setting as well.
>
> Uh, what general advice? I don't see 4x mentioned anywhere.
>
> > Another argument in favor: this is a default setting, and by default,
> > shared_buffers won't be 25% of RAM.
>
> So, are you saying you like 4x now?

Here is an argument for 3x. First, using the documented 25% of RAM, 3x
puts our effective_cache_size at 75% of RAM, giving us no room for
kernel, backend memory, and work_mem usage. If anything it should be
lower than 3x, not higher.

Second, if the machine is not a dedicated machine, and suppose 10% of
RAM is used for shared_buffers, 4x would put effective_cache_size at
40% of RAM, which again seems too high, considering others are using
the machine and filling the kernel cache. 3x also seems too high, but
acceptable, at 30% of RAM.

I basically can't imagine a case where you set shared_buffers to a
reasonable value and would still have 4x of that available for kernel
cache. Finally, for those who like the idea of 4x, you can think of
shared_buffers (1x) + effective_cache_size (3x) as totalling 4x.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> > So, are you saying you like 4x now?
>
> Here is an argument for 3x. First, using the documented 25% of RAM, 3x
> puts our effective_cache_size at 75% of RAM, giving us no room for
> kernel, backend memory, and work_mem usage. If anything it should be
> lower than 3x, not higher.

The other argument I see for the 3x value is that it is a compromise.
People with really large servers will want to increase it; people with
very small servers will want to reduce it.

> Finally, for those who like the idea of 4x, you can think of
> shared_buffers (1x) + effective_cache_size (3x) as totalling 4x.

This part of your argument doesn't really work, because AFAIR the
effective_cache_size value ought to consider that shared_buffers is
part of it (so e_c_s is shared_buffers + kernel cache). So if you're
seeing the 4x as e_c_s + s_b, you would be counting s_b twice.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Wed, Sep 11, 2013 at 12:43:07PM -0300, Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > > So, are you saying you like 4x now?
> >
> > Here is an argument for 3x. First, using the documented 25% of RAM, 3x
> > puts our effective_cache_size at 75% of RAM, giving us no room for
> > kernel, backend memory, and work_mem usage. If anything it should be
> > lower than 3x, not higher.
>
> The other argument I see for the 3x value is that it is a compromise.
> People with really large servers will want to increase it; people with
> very small servers will want to reduce it.

Yes, you could make the argument that 2x is the right default,
especially considering work_mem.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/11/2013 08:27 AM, Bruce Momjian wrote:
> On Wed, Sep 11, 2013 at 09:18:30AM -0400, Bruce Momjian wrote:
>> On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
>>> Another argument in favor: this is a default setting, and by default,
>>> shared_buffers won't be 25% of RAM.
>>
>> So, are you saying you like 4x now?

I still prefer 3X. However, I acknowledge that there are legitimate
arguments for higher values for a *default*.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2013-09-11 12:53:29 -0400, Bruce Momjian wrote:
> On Wed, Sep 11, 2013 at 12:43:07PM -0300, Alvaro Herrera wrote:
> > Bruce Momjian wrote:
> > > > So, are you saying you like 4x now?
> > >
> > > Here is an argument for 3x. First, using the documented 25% of RAM, 3x
> > > puts our effective_cache_size at 75% of RAM, giving us no room for
> > > kernel, backend memory, and work_mem usage. If anything it should be
> > > lower than 3x, not higher.
> >
> > The other argument I see for the 3x value is that it is a compromise.
> > People with really large servers will want to increase it; people with
> > very small servers will want to reduce it.
>
> Yes, you could make the argument that 2x is the right default,
> especially considering work_mem.

I think that most of the arguments in this thread drastically
overestimate the precision and the effect of effective_cache_size. The
planner logic behind it basically only uses it to calculate things
within a single index scan. That alone shows that any precise
calculation cannot be very meaningful.

It also does *NOT* directly influence how the kernel caches disk IO.
It's there to guess how likely it is that something is still cached
when accessing things repeatedly.

I think nearly all practical experience shows that setting it too small
is more likely to cause problems than setting it too high. We shouldn't
be too skimpy here.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
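For readers wondering where effective_cache_size actually enters the
planner: it feeds the index-scan page-fetch estimate in costsize.c
(index_pages_fetched(), which cites Mackert and Lohman). Paraphrasing
the first case of that approximation from memory, so treat this as a
sketch rather than the source: for an index scan touching N heap tuples
in a table of T pages, when T fits within the effective cache of b
pages (T <= b), the estimated number of distinct page fetches is

    pages_fetched = min( 2*T*N / (2*T + N), T )

The cases where T > b degrade from there; the point, as Andres says, is
that the estimate operates within a single scan, not across the whole
workload.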
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Claudio Freire
Date:
On Wed, Sep 11, 2013 at 12:27 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> > Another argument in favor: this is a default setting, and by default,
>> > shared_buffers won't be 25% of RAM.
>>
>> So, are you saying you like 4x now?
>
> Here is an argument for 3x. First, using the documented 25% of RAM, 3x
> puts our effective_cache_size at 75% of RAM, giving us no room for
> kernel, backend memory, and work_mem usage. If anything it should be
> lower than 3x, not higher.

AFAIK, e_c_s must include shared_buffers, so 25% + 75% = 75%.

And your statement seems to assume 25% + 75% = 100%. Which isn't
universally true, no matter what your math teacher probably taught
you ;-)
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
> I think that most of the arguments in this thread drastically
> overestimate the precision and the effect of effective_cache_size. The
> planner logic behind it basically only uses it to calculate things
> within a single index scan. That alone shows that any precise
> calculation cannot be very meaningful.
>
> It also does *NOT* directly influence how the kernel caches disk IO.
> It's there to guess how likely it is that something is still cached
> when accessing things repeatedly.

Agreed. I think we should take the patch as-is, and spend the rest of
the 9.4 dev cycle arguing about 3x vs. 4x.

;-)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I think that most of the arguments in this thread drastically
>> overestimate the precision and the effect of effective_cache_size. The
>> planner logic behind it basically only uses it to calculate things
>> within a single index scan. That alone shows that any precise
>> calculation cannot be very meaningful.
>>
>> It also does *NOT* directly influence how the kernel caches disk IO.
>> It's there to guess how likely it is that something is still cached
>> when accessing things repeatedly.
>
> Agreed. I think we should take the patch as-is, and spend the rest of
> the 9.4 dev cycle arguing about 3x vs. 4x.
>
> ;-)

I'm happy with that option, but I think the larger point here is that
this only has a hope of being right if you're setting shared_buffers
to 25% of system memory. And more and more, people are not doing
that, because of the other recommendation, not much discussed here, to
cap shared_buffers at about 8GB. Systems whose total memory is far
larger than 32GB are becoming quite commonplace, and only figure to
become more so. So while I don't particularly object to this proposal,
it would have had a lot more value if we'd done it 5 years ago.

Now the good news is that right now the default is 128MB, and under
any of these proposals the default will go up, quite a bit. Default
shared_buffers is now 128MB, so we're looking at raising the default
to at least 384MB, and for people who also tune shared_buffers but
might not bother with effective_cache_size, it'll go up a lot more.
That's clearly a move in the right direction even if the accuracy of
the formula is suspect (which it is).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 10:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> I think that most of the arguments in this thread drastically
>>> overestimate the precision and the effect of effective_cache_size. The
>>> planner logic behind it basically only uses it to calculate things
>>> within a single index scan. That alone shows that any precise
>>> calculation cannot be very meaningful.
>>>
>>> It also does *NOT* directly influence how the kernel caches disk IO.
>>> It's there to guess how likely it is that something is still cached
>>> when accessing things repeatedly.
>>
>> Agreed. I think we should take the patch as-is, and spend the rest of
>> the 9.4 dev cycle arguing about 3x vs. 4x.
>>
>> ;-)
>
> I'm happy with that option, but I think the larger point here is that
> this only has a hope of being right if you're setting shared_buffers
> to 25% of system memory. And more and more, people are not doing
> that, because of the other recommendation, not much discussed here, to
> cap shared_buffers at about 8GB. Systems whose total memory is far
> larger than 32GB are becoming quite commonplace, and only figure to
> become more so. So while I don't particularly object to this proposal,
> it would have had a lot more value if we'd done it 5 years ago.
>
> Now the good news is that right now the default is 128MB, and under
> any of these proposals the default will go up, quite a bit. Default
> shared_buffers is now 128MB, so we're looking at raising the default
> to at least 384MB, and for people who also tune shared_buffers but
> might not bother with effective_cache_size, it'll go up a lot more.
> That's clearly a move in the right direction even if the accuracy of
> the formula is suspect (which it is).

This is a very important point: the 8gb cap is also too high. We have
a very high transaction rate server here that exploded at 32GB, was
downgraded to 2GB, ran fine, then upgraded to 4GB (over my strenuous
objection) and exploded again. The stock documentation advice probably
needs to be revised so that it's the lesser of 2GB and 25%.

I'm more and more coming around to the opinion that in terms of shared
buffers we have some major problems that manifest in high end servers.
So +1 to your point, although I'm still ok with the auto-setting on the
basis that for very high end servers most of the settings end up being
manually tweaked anyways. We do need to be cautious though; it's not
impossible that improvements to the buffer system might cause the 25%
advice to be revised upwards in the not-too-near future if some of the
problems get solved.

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote: > The stock documentation advice probably needs to be revised so > that it's the lesser of 2GB and 25%. I think that would be a pretty bad idea. There are lots of workloads where people have postgres happily chugging along with s_b lots bigger than that and see benefits. We have a couple people reporting mostly undiagnosed (because that turns out to be hard!) problems that seem to be avoided with smaller s_b. We don't even remotely know enough about the problem to make such general recommendations. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote: >> The stock documentation advice probably needs to be revised so >> that it's the lesser of 2GB and 25%. > > I think that would be a pretty bad idea. There are lots of workloads > where people have postgres happily chugging along with s_b lots bigger > than that and see benefits. > We have a couple people reporting mostly undiagnosed (because that turns > out to be hard!) problems that seem to be avoided with smaller s_b. We > don't even remotely know enough about the problem to make such general > recommendations. I happen to be one of those "couple" people. Load goes from 0.1 to 500 without warning then back to 0.1 equally without warning. Unfortunately the server is in a different jurisdiction such that it makes deep forensic analysis impossible. I think this is happening more and more often as postgres is becoming increasingly deployed on high(er)-end servers. I've personally (alone) dealt with 4-5 confirmed cases and there have been many more. We have a problem. But, to address your point, the "big s_b" benefits are equally hard to quantify (unless your database happens to fit in s_b) -- they mostly help high write activity servers where the write activity fits a very specific pattern. But the risks of low s_b (mostly slightly higher I/O and query latency) are much easier to deal with than those of high s_b (even if the latter are less likely): random, inexplicable server stalls and other weird manifestations. My "stock" advice remains to set it to at most 2GB until there's a reason (of which there can be many) to set it otherwise. merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2013-09-13 11:27:03 -0500, Merlin Moncure wrote: > On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote: > >> The stock documentation advice probably needs to be revised so > >> that it's the lesser of 2GB and 25%. > > > > I think that would be a pretty bad idea. There are lots of workloads > > where people have postgres happily chugging along with s_b lots bigger > > than that and see benefits. > > We have a couple people reporting mostly undiagnosed (because that turns > > out to be hard!) problems that seem to be avoided with smaller s_b. We > > don't even remotely know enough about the problem to make such general > > recommendations. > I happen to be one of those "couple" people. Load goes from 0.1 to > 500 without warning then back to 0.1 equally without warning. > Unfortunately the server is in a different jurisdiction such that it > makes deep forensic analysis impossible. I think this is happening > more and more often as postgres is becoming increasingly deployed on > high(er)-end servers. I've personally (alone) dealt with 4-5 > confirmed cases and there have been many more. We have a problem. Absolutely not claiming the contrary. I think it sucks that we couldn't fully figure out what's happening in detail. I'd love to get my hand on a setup where it can be reliably reproduced. > But, to address your point, the "big s_b" benefits are equally hard to > quantify (unless your database happens to fit in s_b) Databases where the hot dataset fits in s_b are a pretty honking big use case, though. That's one of the primary reasons to buy machines with craploads of memory. That said, I think having a note in the docs that large s_b can cause such a problem might not be a bad idea and I surely wouldn't argue against it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/13/2013 09:27 AM, Merlin Moncure wrote: > I happen to be one of those "couple" people. Load goes from 0.1 to > 500 without warning then back to 0.1 equally without warning. > Unfortunately the server is in a different jurisdiction such that it > makes deep forensic analysis impossible. I think this is happening > more and more often as postgres is becoming increasingly deployed on > high(er)-end servers. I've personally (alone) dealt with 4-5 > confirmed cases and there have been many more. We have a problem. Can you explain a bit more about this? I'm currently grappling with a db cluster which has periodic mysterious total LWLock paralysis, and is configured with 8GB shared_buffers (and 512GB RAM installed). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 2:36 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 09/13/2013 09:27 AM, Merlin Moncure wrote: >> I happen to be one of those "couple" people. Load goes from 0.1 to >> 500 without warning then back to 0.1 equally without warning. >> Unfortunately the server is in a different jurisdiction such that it >> makes deep forensic analysis impossible. I think this is happening >> more and more often as postgres is becoming increasingly deployed on >> high(er)-end servers. I've personally (alone) dealt with 4-5 >> confirmed cases and there have been many more. We have a problem. > > Can you explain a bit more about this? I'm currently grappling with a > db cluster which has periodic mysterious total LWLock paralysis, and is > configured with 8GB shared_buffers (and 512GB RAM installed). See thread: http://postgresql.1045698.n5.nabble.com/StrategyGetBuffer-optimization-take-2-td5766307.html plus others. In this particular case, s_b needs to be set to 2GB or lower (they tried raising to 4GB for no good reason) and the problem recurred. What are the specific symptoms of your problem? Anything interesting in pg_locks? Is $client willing to experiment with custom patches? merlin
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/13/2013 12:55 PM, Merlin Moncure wrote: > What are the specific symptoms of your problem? Anything interesting > in pg_locks? Is $client willing to experiment with custom patches? 3 servers: 1 master, two replicas. 32-core Xeon, hyperthreaded to 64 cores 512GB RAM each s_b set to 8GB Load-balanced between all 3 ~~ 11 different databases combined database size around 600GB using pgbouncer Irregularly, during periods of high activity (although not necessarily peak activity) one or another of the systems will go into paralysis, with all backends apparently waiting on LWLocks (we need more tracing information to completely confirm this). Activity at this time is usually somewhere between 50 and 100 concurrent queries (and 80 to 150 connections). pg_locks doesn't think anything is waiting on a lock. What's notable is that sometimes it's just *one* of the replicas which goes into paralysis. If the master gets this issue though, the replicas experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB seems to make this issue happen less frequently, but it doesn't go away entirely. Only a restart of the server, or killing all backends, ends the lockup. The workload is OLTP, essentially, around 20/80 write/read. They use PostGIS. The other notable thing about their workload is that due to an ORM defect, they get idle-in-transactions which last from 5 to 15 seconds several times a minute. They are willing to use experimental patches, but only if those patches can be applied only to a replica. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
Josh Berkus <josh@agliodbs.com> schrieb: >What's notable is that sometimes it's just *one* of the replicas which >goes into paralysis. If the master gets this issue though, the >replicas >experience it soon afterwards. Increasing wal_buffers from 16GB to >64GB >seems to make this issue happen less frequently, but it doesn't go away >entirely. You're talking about MB, not GB here, right? Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/13/2013 01:25 PM, Andres Freund wrote: > > > Josh Berkus <josh@agliodbs.com> schrieb: > >> What's notable is that sometimes it's just *one* of the replicas which >> goes into paralysis. If the master gets this issue though, the >> replicas >> experience it soon afterwards. Increasing wal_buffers from 16GB to >> 64GB >> seems to make this issue happen less frequently, but it doesn't go away >> entirely. > > Youre talking about MB, not GB here, right? Oh, yes, right. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 3:20 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 09/13/2013 12:55 PM, Merlin Moncure wrote: >> What are the specific symptoms of your problem? Anything interesting >> in pg_locks? Is $client willing to experiment with custom patches? > > 3 servers: 1 master, two replicas. > 32-core Xeon, hyperthreaded to 64 cores > 512GB RAM each > s_b set to 8GB > Load-balanced between all 3 > ~~ 11 different databases > combined database size around 600GB > using pgbouncer > > Irregularly, during periods of high activity (although not necessarily > peak activity) one or another of the systems will go into paralysis, > with all backends apparently waiting on LWLocks (we need more tracing > information to completely confirm this). Activity at this time is > usually somewhere between 50 and 100 concurrent queries (and 80 to 150 > connections). pg_locks doesn't think anything is waiting on a lock. > > What's notable is that sometimes it's just *one* of the replicas which > goes into paralysis. If the master gets this issue though, the replicas > experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB > seems to make this issue happen less frequently, but it doesn't go away > entirely. Only a restart of the server, or killing all backends, ends > the lockup. > > The workload is OLTP, essentially, around 20/80 write/read. They use > PostGIS. The other notable thing about their workload is that due to an > ORM defect, they get idle-in-transactions which last from 5 to 15 > seconds several times a minute. > > They are willing to use experimental patches, but only if those patches > can be applied only to a replica. ok, points similar: *) master/slave config (two slaves for me) *) 'big' server 256GB mem, 32 core *) approx. 80% read (perhaps more) *) some spatial searching (but not very much) *) OLTP *) presentation of load, although in my case it did resolve anywhere from 30 secs to half an hour *) aside from the spike, 100% healthy points different *) application side pooling: 96 app servers, max 5 connections each (aside: are you using transaction mode pgbouncer?) *) I saw gripes about relation extension in pg_locks merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Kevin Grittner
Date:
Andres Freund <andres@2ndquadrant.com> wrote: > Absolutely not claiming the contrary. I think it sucks that we > couldn't fully figure out what's happening in detail. I'd love to > get my hand on a setup where it can be reliably reproduced. I have seen two completely different causes for symptoms like this, and I suspect that these aren't the only two. (1) The dirty page avalanche: PostgreSQL hangs on to a large number of dirty buffers and then dumps a lot of them at once. The OS does the same. When PostgreSQL dumps its buffers to the OS it pushes the OS over a "tipping point" where it is writing dirty buffers too fast for the controller's BBU cache to absorb them. Everything freezes until the controller writes and accepts OS writes for a lot of data. This can take several minutes, during which time the database seems "frozen". Cure is some combination of these: reduce shared_buffers, make the background writer more aggressive, checkpoint more often, make the OS dirty page writing more aggressive, add more BBU RAM to the controller. (2) Transparent huge page support goes haywire on its defrag work. Clues on this include very high "system" CPU time during an episode, and `perf top` shows more time in kernel spinlock functions than anywhere else. The database doesn't completely lock up like with the dirty page avalanche, but it is slow enough that users often describe it that way. So far I have only seen this cured by disabling THP support (in spite of some people urging that just the defrag be disabled). It does make me wonder whether there is something we could do in PostgreSQL to interact better with THPs. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
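As a sketch of what Kevin's combination of cures for the dirty page avalanche can look like in practice -- every value below is an illustrative assumption that would need to be sized against the actual RAM, workload, and controller cache:

    # postgresql.conf: hold fewer dirty buffers and push them out earlier
    shared_buffers = 2GB                  # reduced from a larger value
    bgwriter_delay = 100ms                # background writer wakes more often
    bgwriter_lru_maxpages = 400           # and writes more pages per round
    checkpoint_timeout = 2min             # checkpoint more often (default 5min)
    checkpoint_completion_target = 0.9    # spread checkpoint writes out

    # /etc/sysctl.conf (Linux): start OS writeback well before the dirty
    # backlog can exceed what the controller's BBU cache absorbs
    vm.dirty_background_bytes = 268435456   # 256MB
    vm.dirty_bytes = 1073741824             # 1GB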
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 4:04 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > >> Absolutely not claiming the contrary. I think it sucks that we >> couldn't fully figure out what's happening in detail. I'd love to >> get my hand on a setup where it can be reliably reproduced. > > I have seen two completely different causes for symptoms like this, > and I suspect that these aren't the only two. > > (1) The dirty page avalanche: PostgreSQL hangs on to a large > number of dirty buffers and then dumps a lot of them at once. The > OS does the same. When PostgreSQL dumps its buffers to the OS it > pushes the OS over a "tipping point" where it is writing dirty > buffers too fast for the controller's BBU cache to absorb them. > Everything freezes until the controller writes and accepts OS > writes for a lot of data. This can take several minutes, during > which time the database seems "frozen". Cure is some combination > of these: reduce shared_buffers, make the background writer more > aggressive, checkpoint more often, make the OS dirty page writing > more aggressive, add more BBU RAM to the controller. Yeah -- I've seen this too, and it's a well understood problem. Getting the OS to write dirty pages out faster is the name of the game, I think. Storage is getting so fast that it's (mostly) moot anyway. Also, this is under the umbrella of 'high I/O' -- the stuff I've been seeing is low- or no-I/O. > (2) Transparent huge page support goes haywire on its defrag work. > Clues on this include very high "system" CPU time during an > episode, and `perf top` shows more time in kernel spinlock > functions than anywhere else. The database doesn't completely lock > up like with the dirty page avalanche, but it is slow enough that > users often describe it that way. So far I have only seen this > cured by disabling THP support (in spite of some people urging that > just the defrag be disabled). It does make me wonder whether there > is something we could do in PostgreSQL to interact better with > THPs. Ah, that's a useful tip; need to research that, thanks. Maybe Josh might be able to give it a whirl... merlin
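For anyone wanting to check Kevin's second scenario, the upstream-kernel knobs look like this (the RHEL 6 backport exposes them under /sys/kernel/mm/redhat_transparent_hugepage instead, as Andres notes below):

    # inspect current state
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag

    # disable just the defrag work, or THP entirely
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/enabled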
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 09/13/2013 01:58 PM, Merlin Moncure wrote: > ok, points similar: > *) master/slave config (two slaves for me) > *) 'big' server 256GB mem, 32 core > *) approx. 80% read (perhaps more) > *) some spatial searching (but not very much) > *) OLTP > *) presentation of load, although in my case it did resolve anywhere > from 30 secs to half an hour > *) aside from the spike, 100% healthy > > points different > *) application side pooling: 96 app servers, max 5 connections each > (aside: are you using transaction mode pgbouncer?) Yes. > *) I saw gripes about relation extension in pg_locks I'll check for that next time. We're also working on seeing if we can reproduce this under test conditions. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Large shared_buffer stalls WAS: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Fri, Sep 13, 2013 at 4:15 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 09/13/2013 01:58 PM, Merlin Moncure wrote: >> ok, points similar: >> *) master/slave config (two slaves for me) >> *) 'big' server 256GB mem, 32 core >> *) approx. 80% read (perhaps more) >> *) some spatial searching (but not very much) >> *) OLTP >> *) presentation of load, although in my case it did resolve anywhere >> from 30 secs to half an hour >> *) aside from the spike, 100% healthy >> >> points different >> *) application side pooling: 96 app servers, max 5 connections each >> (aside: are you using transaction mode pgbouncer?) > > Yes Given that, I would be curious to see what dropping the connection pool down to somewhere between 32 and 64 connections would do. This is a band-aid obviously, since not everyone can or wants to run pgbouncer. But it's the first thing that pops into my mind in these situations, since it might alleviate the symptoms. merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2013-09-13 14:04:55 -0700, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > > Absolutely not claiming the contrary. I think it sucks that we > > couldn't fully figure out what's happening in detail. I'd love to > > get my hand on a setup where it can be reliably reproduced. > > I have seen two completely different causes for symptoms like this, > and I suspect that these aren't the only two. > > (1) The dirty page avalanche: PostgreSQL hangs on to a large > number of dirty buffers and then dumps a lot of them at once. The > OS does the same. When PostgreSQL dumps its buffers to the OS it > pushes the OS over a "tipping point" where it is writing dirty > buffers too fast for the controller's BBU cache to absorb them. > Everything freezes until the controller writes and accepts OS > writes for a lot of data. This can take several minutes, during > which time the database seems "frozen". Cure is some combination > of these: reduce shared_buffers, make the background writer more > aggressive, checkpoint more often, make the OS dirty page writing > more aggressive, add more BBU RAM to the controller. That should hopefully be diagnosable from other system stats like the dirty rate. > (2) Transparent huge page support goes haywire on its defrag work. > Clues on this include very high "system" CPU time during an > episode, and `perf top` shows more time in kernel spinlock > functions than anywhere else. The database doesn't completely lock > up like with the dirty page avalanche, but it is slow enough that > users often describe it that way. So far I have only seen this > cured by disabling THP support (in spite of some people urging that > just the defrag be disabled). Yes, I have seen that issue a couple of times now as well. I can confirm that in at least two cases disabling defragmentation alone proved to be enough to fix the issue. Annoyingly enough there are different ways to disable defragmentation/THP depending on whether you're using THP backported by redhat or the upstream version... > It does make me wonder whether there > is something we could do in PostgreSQL to interact better with > THPs. The best thing I see is to just use explicit hugepages. I've previously sent a prototype for that, which has since been turned into an actual implementation by Christian Kruse. A colleague of mine is working on polishing that patch into something committable. If you use large s_b, the memory savings alone (some 100kB instead of dozens of megabytes per backend) can be worth it, to say nothing of actual performance gains. Updating the kernel helps as well; they've improved the efficiency of defragmentation a good bit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
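For reference, explicit huge pages are reserved at the OS level; a minimal Linux sketch, assuming roughly 4GB of shared_buffers and the usual 2MB huge page size (the page count is an illustrative guess with some headroom):

    # reserve ~4.3GB worth of 2MB huge pages for shared memory
    echo 2200 > /proc/sys/vm/nr_hugepages
    # or persistently:
    #   echo "vm.nr_hugepages = 2200" >> /etc/sysctl.conf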
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, Sep 5, 2013 at 05:14:37PM -0400, Bruce Momjian wrote: > On Thu, Sep 5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote: > > > I have developed the attached patch which implements an auto-tuned > > > effective_cache_size which is 4x the size of shared buffers. I had to > > > set effective_cache_size to its old 128MB default so the EXPLAIN > > > regression tests would pass unchanged. > > > > That's not really autotuning though. ISTM that making the *default* 4 > > x shared_buffers might make perfect sense, but do we really need to > > hijack the value of "-1" for that? That might be useful for some time > > when we have actual autotuning, that somehow inspects the system and > > tunes it from there. > > > > I also don't think it should be called autotuning, when it's just a > > "smarter default value". > > > > I like the feature, though, just not the packaging. > > That "auto-tuning" text came from the wal_buffers documentation, which > does exactly this based on shared_buffers: > > The contents of the WAL buffers are written out to disk at every > transaction commit, so extremely large values are unlikely to > provide a significant benefit. However, setting this value to at > least a few megabytes can improve write performance on a busy > --> server where many clients are committing at once. The auto-tuning > ----------- > selected by the default setting of -1 should give reasonable > results in most cases. > > I am fine with rewording and not using -1, but we should change the > wal_buffers default and documentation too then. I am not sure what > value other than -1 to use? 0? I figure if we ever get better auto-tuning, > we would just remove this functionality and make it better. Patch applied with a default of 4x shared buffers. I have added a 9.4 TODO that we might want to revisit this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
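A hypothetical psql session against a stock build with this commit applied, to make the new behavior concrete (the output comments assume the then-default shared_buffers of 128MB, and the default was revisited later in the thread):

    SHOW shared_buffers;         -- 128MB
    SHOW effective_cache_size;   -- expected: 512MB, i.e. 4 x shared_buffers
    -- an explicit setting still overrides the auto-tuned value per session:
    SET effective_cache_size = '8GB';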
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Kevin Hale Boyes
Date:
The patch contains a small typo in config.sgml. Probably just drop the "is" from "is can".
+ results if this database cluster is can utilize most of the memory
Kevin.
On 8 October 2013 10:13, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Sep 5, 2013 at 05:14:37PM -0400, Bruce Momjian wrote:
> > On Thu, Sep 5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
> > > > I have developed the attached patch which implements an auto-tuned
> > > > effective_cache_size which is 4x the size of shared buffers. I had to
> > > > set effective_cache_size to its old 128MB default so the EXPLAIN
> > > > regression tests would pass unchanged.
> > >
> > > That's not really autotuning though. ISTM that making the *default* 4
> > > x shared_buffers might make perfect sense, but do we really need to
> > > hijack the value of "-1" for that? That might be useful for some time
> > > when we have actual autotuning, that somehow inspects the system and
> > > tunes it from there.
> > >
> > > I also don't think it should be called autotuning, when it's just a
> > > "smarter default value".
> > >
> > > I like the feature, though, just not the packaging.
> >
> > That "auto-tuning" text came from the wal_buffers documentation, which
> > does exactly this based on shared_buffers: [...]
> >
> > I am fine with rewording and not using -1, but we should change the
> > wal_buffers default and documentation too then. I am not sure what
> > value other than -1 to use? 0? I figure if we ever get better
> > auto-tuning, we would just remove this functionality and make it better.
>
> Patch applied with a default of 4x shared buffers. I have added a 9.4
> TODO that we might want to revisit this.
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Tue, Oct 8, 2013 at 01:04:18PM -0600, Kevin Hale Boyes wrote: > The patch contains a small typo in config.sgml. Probably just drop the "is" > from "is can". > > + results if this database cluster is can utilize most of the memory > > Kevin. Thank you, fixed. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 8 October 2013 17:13, Bruce Momjian <bruce@momjian.us> wrote: > Patch applied with a default of 4x shared buffers. I have added a 9.4 > TODO that we might want to revisit this. I certainly want to revisit this patch and this setting. How can we possibly justify a default setting that could be more than physical RAM? The maximum known safe value is the setting of shared_buffers itself, without external knowledge. But how can we possibly set it even that high? Does anyone have any evidence at all on how to set this? How can we possibly autotune it? I prefer the idea of removing "effective_cache_size" completely, since it has so little effect on workloads and is very frequently misunderstood by users. It's just dangerous, without being useful. Why do we autotune the much more important synch scan threshold, yet allow tuning of e_c_s? Lets fix e_c_s at 25% of shared_buffers and remove the parameter completely, just as we do with so many other performance parameters. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-06 15:09:15 +0100, Simon Riggs wrote: > On 8 October 2013 17:13, Bruce Momjian <bruce@momjian.us> wrote: > > > Patch applied with a default of 4x shared buffers. I have added a 9.4 > > TODO that we might want to revisit this. > > I certainly want to revisit this patch and this setting. > > How can we possibly justify a default setting that could be more than > physical RAM? Because it doesn't hurt overly much if it's set too large? > The maximum known safe value is the setting of shared_buffers itself, > without external knowledge. But how can we possibly set it even that > high? > > Does anyone have any evidence at all on how to set this? How can we > possibly autotune it? It's just a different default setting? I think the new value will cause fewer problems than the old one, which frequently leads to index scans not being used even when they would be beneficial. > I prefer the idea of removing "effective_cache_size" completely, since > it has so little effect on workloads and is very frequently > misunderstood by users. It's just dangerous, without being useful. -many. > Lets fix e_c_s at 25% of shared_buffers and remove the parameter > completely, just as we do with so many other performance parameters. That'd cause *massive* regression for many installations. Without significantly overhauling costsize.c that's really not feasible. There's lots of installations that use relatively small s_b settings for good reasons. If we fix e_c_s to 25% of s_b many queries on those won't use indexes anymore. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
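A hypothetical demonstration of the effect Andres describes -- the table, sizes, and the point at which the plan flips are all invented for illustration, and the outcome depends on statistics and the other cost settings:

    -- a table with a poorly correlated indexed column
    CREATE TABLE t AS
      SELECT (random() * 1000000)::int AS id, repeat('x', 100) AS pad
      FROM generate_series(1, 1000000);
    CREATE INDEX t_id_idx ON t (id);
    ANALYZE t;

    SET effective_cache_size = '64MB';  -- repeated index page reads look expensive
    EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 50000;  -- may pick a seq scan

    SET effective_cache_size = '8GB';   -- index pages assumed to stay cached
    EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 50000;  -- favors the index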
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes: > Lets fix e_c_s at 25% of shared_buffers and remove the parameter > completely, just as we do with so many other performance parameters. Apparently, you don't even understand what this parameter is for. Setting it smaller than shared_buffers is insane. regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 15:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> Lets fix e_c_s at 25% of shared_buffers and remove the parameter >> completely, just as we do with so many other performance parameters. > > Apparently, you don't even understand what this parameter is for. > Setting it smaller than shared_buffers is insane. You know you can't justify that comment and so do I. What workload is so badly affected as to justify use of the word insane in this context? I can read code. But it appears nobody apart from me actually does, or at least understands the behaviour that results. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 15:17, Andres Freund <andres@2ndquadrant.com> wrote: >> Lets fix e_c_s at 25% of shared_buffers and remove the parameter >> completely, just as we do with so many other performance parameters. > > That'd cause *massive* regression for many installations. Without > significantly overhauling costsize.c that's really not feasible. There's > lots of installations that use relatively small s_b settings for good > reasons. If we fix e_c_s to 25% of s_b many queries on those won't use > indexes anymore. "many queries" can't be correct. All this changes is the cost of IndexScans that would use more than 25% of shared_buffers worth of data. Hopefully not many of those in your workload. Changing the cost doesn't necessarily prevent index scans either. And if there are many of those in your workload AND you run more than one at the same time, then the larger setting will work against you. So the benefit window for such a high setting is slim, at best. I specifically picked 25% of shared_buffers because that is the point at which sequential scans become more efficient and use the cache more efficiently. If our cost models are correct, then switching away from index scans shouldn't hurt at all. Assuming we can use large tranches of memory for single queries has a very bad effect on cache hit ratios. Encouraging such usage seems to fall into the category of insane, from my perspective. Having it as a default setting is bad. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes: > On 6 May 2014 15:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Simon Riggs <simon@2ndQuadrant.com> writes: >>> Lets fix e_c_s at 25% of shared_buffers and remove the parameter >>> completely, just as we do with so many other performance parameters. >> Apparently, you don't even understand what this parameter is for. >> Setting it smaller than shared_buffers is insane. > You know you can't justify that comment and so do I. What I meant is that your comments indicate a complete lack of understanding of the parameter. It's *supposed* to be larger than shared_buffers, and there is no "safety risk" involved in setting it too high. regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-06 17:43:45 +0100, Simon Riggs wrote: > On 6 May 2014 15:17, Andres Freund <andres@2ndquadrant.com> wrote: > > >> Lets fix e_c_s at 25% of shared_buffers and remove the parameter > >> completely, just as we do with so many other performance parameters. > > > > That'd cause *massive* regression for many installations. Without > > significantly overhauling costsize.c that's really not feasible. There's > > lots of installations that use relatively small s_b settings for good > > reasons. If we fix e_c_s to 25% of s_b many queries on those won't use > > indexes anymore. > > "many queries" can't be correct. It is. > All this changes is the cost of > IndexScans that would use more than 25% of shared_buffers worth of > data. Hopefully not many of those in your workload. Changing the cost > doesn't necessarily prevent index scans either. And if there are many > of those in your workload AND you run more than one at the same time, then > the larger setting will work against you. So the benefit window for > such a high setting is slim, at best. Why? There are many workloads where indexes are larger than shared buffers but fit into the operating system's cache. And that's precisely what effective_cache_size is about. Especially on bigger machines shared_buffers can't be set big enough to actually use all the machine's memory. It's not uncommon to have 4GB shared buffers on a machine with 512GB RAM... It'd be absolutely disastrous to set effective_cache_size to 1GB for an analytics workload. > I specifically picked 25% of shared_buffers because that is the point > at which sequential scans become more efficient and use the cache more > efficiently. If our cost models are correct, then switching away from > index scans shouldn't hurt at all. More often than not indexes are smaller than the table size, so this argument doesn't seem to make much sense. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
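Andres's big-machine example, restated as a configuration sketch; the 512GB/4GB/1GB figures are his, while the 400GB value is an illustrative assumption to make the contrast concrete:

    # analytics box from the example: 512GB RAM
    shared_buffers = 4GB
    # a 25%-of-shared_buffers rule would force:
    #   effective_cache_size = 1GB    -- "absolutely disastrous" here
    # when the OS cache actually holds hundreds of gigabytes, e.g.:
    effective_cache_size = 400GB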
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/06/2014 08:41 AM, Simon Riggs wrote: > On 6 May 2014 15:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Simon Riggs <simon@2ndQuadrant.com> writes: >>> Lets fix e_c_s at 25% of shared_buffers and remove the parameter >>> completely, just as we do with so many other performance parameters. >> >> Apparently, you don't even understand what this parameter is for. >> Setting it smaller than shared_buffers is insane. > > You know you can't justify that comment and so do I. What workload is > so badly affected as to justify use of the word insane in this > context? Most of them? Really? I have to tell you, your post sounds like you've missed out on the last 12 years of PostgreSQL query tuning. Which is a little shocking considering where you've spent that 12 years. > I can read code. But it appears nobody apart from me actually does, or > at least understands the behaviour that results. So, break it down for us: explain how we'll get desirable query plans out of the current code if: (1) Table & Index is larger than shared_buffers; (2) Table & Index is smaller than RAM; (3) Selectivity is 0.02 (4) ECS is set lower than shared_buffers I think the current cost math does a pretty good job of choosing the correct behavior if ECS is set correctly. But if it's not, no. If I'm wrong, then you've successfully found a bug in our costing math, so I'd love to see it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Tue, May 6, 2014 at 7:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Lets fix e_c_s at 25% of shared_buffers and remove the parameter
> > completely, just as we do with so many other performance parameters.
>
> Apparently, you don't even understand what this parameter is for.
> Setting it smaller than shared_buffers is insane.
The e_c_s is assumed to be usable for each backend trying to run queries sensitive to it. If you have dozens of such queries running simultaneously (not something I personally witness, but also not insane) and each of these queries has its own peculiar working set, then having e_c_s smaller than s_b makes sense.
I have a hard time believing that this is at all common, however. Certainly not common enough to justify cranking the setting all the way in the other direction and then removing the crank handle.
Cheers,
Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 18:08, Josh Berkus <josh@agliodbs.com> wrote: > On 05/06/2014 08:41 AM, Simon Riggs wrote: >> On 6 May 2014 15:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Simon Riggs <simon@2ndQuadrant.com> writes: >>>> Lets fix e_c_s at 25% of shared_buffers and remove the parameter >>>> completely, just as we do with so many other performance parameters. >>> >>> Apparently, you don't even understand what this parameter is for. >>> Setting it smaller than shared_buffers is insane. >> >> You know you can't justify that comment and so do I. What workload is >> so badly affected as to justify use of the word insane in this >> context? > > Most of them? Really? I didn't use the word "most" anywhere. So not really clear what you are saying. > I have to tell you, your post sounds like you've missed out on the last > 12 years of PostgreSQL query tuning. Which is a little shocking > considering where you've spent that 12 years. I read the code, think what to say and then say what I think, not rely on dogma. I tried to help years ago by changing the docs on e_c_s, but that's been mostly ignored down the years, as it is again here. >> I can read code. But it appears nobody apart from me actually does, or >> at least understands the behaviour that results. > > So, break it down for us: explain how we'll get desirable query plans > out of the current code if: > > (1) Table & Index is larger than shared_buffers; > (2) Table & Index is smaller than RAM; > (3) Selectivity is 0.02 > (4) ECS is set lower than shared_buffers Is that it? The above use case is the basis for a default setting?? It's a circular argument, since you're assuming we've all followed your advice of setting shared_buffers to 25% of RAM, which then presumes a large gap between (1) and (2). It also ignores that if ECS is set low then it increases the cost, but does not actually preclude index scans larger than that setting. It also ignores that if your database fits in RAM, your random_page_cost setting is wrong and lowering that appropriately will increase the incidence of index scans again. You should also include (5) You're only running one query at a time (which you know, how?) (6) You don't care if you flush your cache for later queries (7) You've got big tables yet are not partitioning them effectively > I think the current cost math does a pretty good job of choosing the > correct behavior if ECS is set correctly. But if it's not, no. > > If I'm wrong, then you've successfully found a bug in our costing math, > so I'd love to see it. Setting it high generates lovely EXPLAINs for a single query, but do we have any evidence that whole workloads are better off with higher settings? And that represents the general case? And it makes sense even if it makes it bigger than actual RAM?? If you assume that you can use all of that memory, you're badly wrong. Presumably you also set work_mem larger than shared_buffers, since that will induce exactly the same behaviour and have the same downsides. (Large memory usage for single query, but causes cache churn, plus problems if we try to overuse RAM because of concurrent usage). In the absence of performance measurements that show the genuine effect on workloads, I am attempting to make a principle-based argument. I suggested 25% of shared_buffers because we already use that as the point where other features cut in to minimise cache churn.
I'm asking: if *that* setting is the right one to control cache churn, why is it acceptable for index scans to churn up even bigger chunks of cache? In case it wasn't clear, I am only suggesting 25% of shared_buffers for large settings, not for micro-configurations. My proposal to remove the setting completely was a rhetorical question, asking why we have a setting for this parameter and yet no tunables for other things. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 20:41, Jeff Janes <jeff.janes@gmail.com> wrote: > The e_c_s is assumed to be usable for each backend trying to run queries > sensitive to it. If you have dozens of such queries running simultaneously > (not something I personally witness, but also not insane) and each of these > queries has its own peculiar working set, then having e_c_s smaller than s_b > makes sense. > > I have a hard time believe that this is at all common, however. If larger queries are frequent enough to care about, they will happen together. We should be acting conservatively with default settings. You can be as aggressive as you like with your own config. > Certainly > not common enough so to justify cranking the setting all the way the other > direction and then removing the crank handle. Yes, that part was mostly rhetorical, I wasn't arguing for complete removal, especially when autotuning is unclear. I am worried about people that set effective_cache_size but not shared_buffers, which is too common. If we link the two parameters it should work in both directions by default. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Tue, May 6, 2014 at 4:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > I read the code, think what to say and then say what I think, not > rely on dogma. > > I tried to help years ago by changing the docs on e_c_s, but that's > been mostly ignored down the years, as it is again here. Well, for what it's worth, I've encountered systems where setting effective_cache_size too low resulted in bad query plans, but I've never encountered the reverse situation. My personal sample size is pretty small, though. And, when I did a study of 100+ pgsql-performance reports for last year's PGCon talk, I didn't turn up any that seemed related to effective_cache_size. Here's the subset of those reports that appeared settings-related: https://sites.google.com/site/robertmhaas/query-performance/settings I think the basic problem with effective_cache_size is that it's a pretty weak knob. I don't think it's a secret that we more often seq-scan when we should have index-scanned than the other way around. So if I had to hard-code a value for effective_cache_size, I'd probably pick positive infinity. Yeah, that could be overkill - but I bet I'd be able to compensate by frobbing seq_page_cost and random_page_cost in a pinch. I basically think the auto-tuning we've installed for effective_cache_size is stupid. Most people are going to run with only a few GB of shared_buffers, so setting effective_cache_size to a small multiple of that isn't going to make many more people happy than just raising the value - say from the current default of 128MB to, oh, 4GB - especially because in my experience queries aren't very sensitive to the exact value; it just has to not be way too small. I bet the number of PostgreSQL users who would be made happy by a much higher hard-coded default is not too different from the number that will be made happy by the (completely unprincipled) auto-tuning. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
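A sketch of the compensation Robert has in mind, as a hypothetical per-session experiment -- the values are illustrative, and the table is the invented one from the sketch upthread:

    SET effective_cache_size = '1000GB';  -- stand-in for "positive infinity"
    -- if that proves too index-friendly for a workload, widen the gap
    -- between random and sequential page costs to push back:
    SET random_page_cost = 6.0;           -- default 4.0
    SET seq_page_cost = 1.0;              -- the default
    EXPLAIN SELECT * FROM t WHERE id BETWEEN 1 AND 50000;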
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andrew Dunstan
Date:
On 05/06/2014 05:54 PM, Robert Haas wrote: > On Tue, May 6, 2014 at 4:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I read the code, think what to say and then say what I think, not >> rely on dogma. >> >> I tried to help years ago by changing the docs on e_c_s, but that's >> been mostly ignored down the years, as it is again here. > Well, for what it's worth, I've encountered systems where setting > effective_cache_size too low resulted in bad query plans, but I've > never encountered the reverse situation. I have encountered both. Recently I discovered that a client's performance problems were solved pretty instantly by reducing a ridiculously high setting down to something more reasonable (in this case we set it to about 50% of physical RAM). cheers andrew
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 22:54, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, May 6, 2014 at 4:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I read the code, think what to say and then say what I think, not >> rely on dogma. >> >> I tried to help years ago by changing the docs on e_c_s, but that's >> been mostly ignored down the years, as it is again here. > > Well, for what it's worth, I've encountered systems where setting > effective_cache_size too low resulted in bad query plans, but I've > never encountered the reverse situation. I agree with that. Though that misses my point, which is that you can't know that all of that memory is truly available on a server with many concurrent users. Choosing settings that undercost memory-intensive plans is not the best choice for a default strategy in a mixed workload where cache may be better used elsewhere, even if such settings make sense for some individual users. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > I basically think the auto-tuning we've installed for > effective_cache_size is stupid. Most people are going to run with > only a few GB of shared_buffers, so setting effective_cache_size to a > small multiple of that isn't going to make many more people happy than > just raising the value - say from the current default of 128MB to, oh, > 4GB - especially because in my experience queries aren't very > sensitive to the exact value; it just has to not be way too small. I > bet the number of PostgreSQL users who would be made happy by a much > higher hard-coded default is not too different from the number that > will be made happy by the (completely unprincipled) auto-tuning. There is a lot to be said for that argument, especially considering that we're not even really happy with the auto-tuning mechanism, never mind the behavior it's trying to implement. regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/06/2014 01:38 PM, Simon Riggs wrote: >> Most of them? Really? > > I didn't use the word "most" anywhere. So not really clear what you are saying. Sorry, those were supposed to be periods, not question marks. As in "Most of them. Really." >> I have to tell you, your post sounds like you've missed out on the last >> 12 years of PostgreSQL query tuning. Which is a little shocking >> considering where you've spent that 12 years. > > I read the code, think what to say and then say what I think, not > rely on dogma. > > I tried to help years ago by changing the docs on e_c_s, but that's > been mostly ignored down the years, as it is again here. Well, if you're going to buck the conventional wisdom, you need to provide a factual and numerical basis for your arguments. So far, I haven't seen you do so, although I'll admit that I haven't read 100% of hackers traffic. So if you have previously presented benchmarking results or math, please post a link to the archives. >> (1) Table & Index is larger than shared_buffers; >> (2) Table & Index is smaller than RAM; >> (3) Selectivity is 0.02 >> (4) ECS is set lower than shared_buffers > > Is that it? The above use case is the basis for a default setting?? Are you just going to ask rhetorical questions? > It's a circular argument, since you're assuming we've all followed > your advice of setting shared_buffers to 25% of RAM, which then > presumes a large gap between (1) and (2). That 20% to 25% recommendation had a factual and numerical basis, based on extensive testing using DBT2 at OSDL. While we have reason to believe that advice may be somewhat dated, nobody has undertaken the benchmarking work to create a new advice basis. If you have done so, you have not shared the results. > Setting it high generates lovely EXPLAINs for a single query, but do > we have any evidence that whole workloads are better off with higher > settings? And that represents the general case? So? Create a benchmark. Prove that you're right. I'd love to see it, we're suffering from a serious lack of data here. > In the absence of performance measurements that show the genuine > effect on workloads, I am attempting to make a principle-based > argument. I suggested 25% of shared_buffers because we already use > that as the point where other features cut in to minimise cache churn. That makes no sense whatsoever. Again, show me the math. > In case it wasn't clear, I am only suggesting 25% of shared_buffers > for large settings, not for micro-configurations. My proposal to > remove the setting completely was a rhetorical question, asking why we > have a setting for this parameter and yet no tunables for other > things. Asking rhetorical questions based on extreme perspectives that you don't really believe in is *definitionally* trolling. If you're going to make an argument in favor of different tuning advice, then do it based on something in which you actually believe, based on hard evidence. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
Robert, Tom: On 05/06/2014 03:28 PM, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I basically think the auto-tuning we've installed for >> effective_cache_size is stupid. Most people are going to run with >> only a few GB of shared_buffers, so setting effective_cache_size to a >> small multiple of that isn't going to make many more people happy than >> just raising the value - say from the current default of 128MB to, oh, >> 4GB - especially because in my experience queries aren't very >> sensitive to the exact value; it just has to not be way too small. I >> bet the number of PostgreSQL users who would be made happy by a much >> higher hard-coded default is not too different from the number that >> will be made happy by the (completely unprincipled) auto-tuning. > > There is a lot to be said for that argument, especially considering > that we're not even really happy with the auto-tuning mechanism, > never mind the behavior it's trying to implement. Right, the decisive question with this patch is: does it improve things over what would have happened with most users anyway? Based on the users I deal with ... which skew rather strongly toward EC2 web applications and one-off data warehouses ... most users don't set effective_cache_size *at all* until they hire me or chat me up on IRC. This means that ECS is running with the default of 128MB, which means that Postgres is seriously underestimating the probability of cached data on most machines today. The users I deal with are a lot more likely to have set shared_buffers themselves, based on the 25% conventional wisdom (or based on some other advice). And, when they want to tinker with pushing index usage, they change random_page_cost instead, sometimes to silly values (like 0.5). So, based solely on the users I deal with, I would find automatically setting effective_cache_size to 3X or 4X shared_buffers to be a benefit compared to leaving it at its current fixed low default. Other people on this list deal with different kinds of users, so if people work with a class of users where a default of 4X shared_buffers would definitely be a worse idea than the current default, then please speak up. ECS is definitely a "weak knob", as Robert says, but it's *good* that it's a weak knob. This means that we can make large adjustments in it without introducing a lot of highly variable behavior ... unlike random_page_cost, which does. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
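The field pattern Josh describes, as a hypothetical snapshot of such a system:

    SHOW effective_cache_size;  -- 128MB: never touched, so cached data is badly underestimated
    SHOW random_page_cost;      -- 0.5: pushed below seq_page_cost to force index plans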
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 23:47, Josh Berkus <josh@agliodbs.com> wrote: > If you're going to make > an argument in favor of different tuning advice, then do it based on > something in which you actually believe, based on hard evidence. The proposed default setting of 4x shared_buffers is unprincipled *and* lacks hard evidence from you and everybody else. You've used the phrase "conventional wisdom" to describe things which you have spoken loudly about. I personally have not seen sufficient evidence to rely on that as wisdom. I note that my detailed comments as to why it is unsafe have been ignored, again. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I basically think the auto-tuning we've installed for >> effective_cache_size is stupid. Most people are going to run with >> only a few GB of shared_buffers, so setting effective_cache_size to a >> small multiple of that isn't going to make many more people happy than >> just raising the value - say from the current default of 128MB to, oh, >> 4GB - especially because in my experience queries aren't very >> sensitive to the exact value; it just has to not be way too small. I >> bet the number of PostgreSQL users who would be made happy by a much >> higher hard-coded default is not too different from the number that >> will be made happy by the (completely unprincipled) auto-tuning. > > There is a lot to be said for that argument, especially considering > that we're not even really happy with the auto-tuning mechanism, > never mind the behavior it's trying to implement. +1 -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Tue, May 6, 2014 at 10:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 6 May 2014 23:47, Josh Berkus <josh@agliodbs.com> wrote: > >> If you're going to make >> an argument in favor of different tuning advice, then do it based on >> something in which you actually believe, based on hard evidence. > > The proposed default setting of 4x shared_buffers is unprincipled > *and* lacks hard evidence from you and everybody else. +1. In my view, we probably should have set it to a much higher absolute default value. The main problem with setting it to any multiple of shared_buffers that I can see is that shared_buffers is a very poor proxy for what effective_cache_size is supposed to represent. In general, the folk wisdom around sizing shared_buffers has passed its sell-by date. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Mark Kirkwood
Date:
On 07/05/14 17:35, Peter Geoghegan wrote: > On Tue, May 6, 2014 at 10:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 6 May 2014 23:47, Josh Berkus <josh@agliodbs.com> wrote: >> >>> If you're going to make >>> an argument in favor of different tuning advice, then do it based on >>> something in which you actually believe, based on hard evidence. >> The proposed default setting of 4x shared_buffers is unprincipled >> *and* lacks hard evidence from you and everybody else. > +1. In my view, we probably should have set it to a much higher > absolute default value. The main problem with setting it to any > multiple of shared_buffers that I can see is that shared_buffers is a > very poor proxy for what effective_cache_size is supposed to > represent. In general, the folk wisdom around sizing shared_buffers > has passed its sell-by date. > +1. ISTM the only sensible approach to auto-tune this requires us to have a plugin to detect how much RAM the system has (and then setting it to, say, half of that). I wonder if it might be worthwhile writing plugins for the handful of popular platforms. For the remainder maybe we could leave it defaulting to the current (small) value, and encourage volunteers to code the missing ones if they want something better. Regards Mark
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 6 May 2014 17:55, Andres Freund <andres@2ndquadrant.com> wrote: >> All this changes is the cost of >> IndexScans that would use more than 25% of shared_buffers worth of >> data. Hopefully not many of those in your workload. Changing the cost >> doesn't necessarily prevent index scans either. And if there are many >> of those in your workload AND you run more than one at same time, then >> the larger setting will work against you. So the benefit window for >> such a high setting is slim, at best. > > Why? There's many workloads where indexes are larger than shared buffers > but fit into the operating system's cache. And that's precisely what > effective_cache_size is about. > Especially on bigger machines shared_buffers can't be set big enough to > actually use all the machine's memory. It's not uncommon to have 4GB > shared buffers on a machine with 512GB RAM... It'd be absolutely > disastrous to set effective_cache_size to 1GB for an analytics workload. In this case, a setting of effective_cache_size > (4 * shared_buffers) could be appropriate, as long as we are certain we have the memory. We don't have any stats on peak memory usage to be certain - although in that case it's pretty clear. If we had stats on how effective the indexscan was at multiple-hitting earlier read blocks, we'd be able to autotune, but I accept that without that we do still need the parameter. >> I specifically picked 25% of shared_buffers because that is the point >> at which sequential scans become more efficient and use the cache more >> efficiently. If our cost models are correct, then switching away from >> index scans shouldn't hurt at all. > > More often than not indexes are smaller than the table size, so this > argument doesn't seem to make much sense. If we believe that 25% of shared_buffers worth of heap blocks would flush the cache doing a SeqScan, why should we allow 400% of shared_buffers worth of index blocks? In your example, that would be 1GB of heap blocks, or 16GB of index blocks. If our table is 100GB with a 32GB index, then yes, that is 1% of the heap and 50% of the index. But that doesn't matter, since I am discussing the point at which we prevent the cache being churned. Given your example, we do not allow a SeqScan of a table larger than 1GB to flush cache, since we use BAS_BULKREAD. If we allow an indexscan plan that will touch 16GB of an index, that will very clearly flush out our 4GB of shared_buffers, increasing time for later queries even if they only have to read from OS buffers back into shared_buffers. That will still show itself as a CPU spike, which is what people say they are seeing. I think I'm arguing myself towards using a BufferAccessStrategy of BAS_BULKREAD for large IndexScans, BitMapIndexScans and BitMapHeapScans. Yes, we can make plans assuming we can use OS cache, but we shouldn't be churning shared_buffers when we execute those plans. "large" here meaning the same thing as it does for SeqScans, which is a scan that seems likely to touch more than 25% of shared buffers. I'll work up a patch. Perhaps it would also be useful to consider using a sequential scan of the index relation for less selective BitmapIndexScans, just as we do very effectively during VACUUM. Maybe that is a better idea than bitmap indexes. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
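For reference, the seqscan-side precedent Simon is leaning on already exists in initscan() in heapam.c, and the strategy machinery is public API. A sketch of the extension he proposes (the patch itself is hypothetical; wiring the strategy into index-scan buffer reads is the part that doesn't exist today):

    #include "storage/bufmgr.h"
    #include "utils/rel.h"

    /* "large" per the seqscan rule: likely to touch > 25% of shared_buffers */
    BufferAccessStrategy strategy = NULL;
    Buffer      buf;

    if (!RelationUsesLocalBuffers(rel) &&
        RelationGetNumberOfBlocks(rel) > NBuffers / 4)
        strategy = GetAccessStrategy(BAS_BULKREAD);

    /* ... then pass the strategy down so each read recycles a small ring
     * of buffers instead of churning the whole of shared_buffers ... */
    buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_NORMAL, strategy);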
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, May 7, 2014 at 3:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > If we believe that 25% of shared_buffers worth of heap blocks would > flush the cache doing a SeqScan, why should we allow 400% of > shared_buffers worth of index blocks? I think you're comparing apples and oranges. The 25% threshold is answering the question "How big does a sequential scan have to be before it's likely to flush so much unrelated data out of shared_buffers that it hurts the performance of other things running on the system?". So it's not really about whether or not things will *fit* in the cache, but rather a judgement about at what point caching that stuff is going to be of less value than continuing to cache other things. Also, it's specifically a judgement about shared_buffers, not system memory. But effective_cache_size is used to estimate the likelihood that an index scan which accesses the same heap or index block twice will still be in cache on the second hit, and thus need to be faulted in only once. So this *is* a judgment about what will fit - generally over a very short time scale. And, since bringing a page into shared_buffers from the OS cache is much less expensive than bringing a page into memory from disk, it's really about what will fit in overall system memory, not just shared_buffers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
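The concrete mechanism behind Robert's description is the Mackert-Lohman approximation in index_pages_fetched() in costsize.c. A simplified paraphrase (the real function also prorates effective_cache_size across all the tables used in the query, and rounds up):

    /*
     *  T  = pages in the table
     *  Ns = tuple fetches the index scan will make
     *  b  = effective_cache_size, expressed in pages
     */
    static double
    ml_pages_fetched(double Ns, double T, double b)
    {
        if (T <= b)
        {
            /* table fits in cache: revisits beyond the first are free */
            double      pages = (2.0 * T * Ns) / (2.0 * T + Ns);

            return (pages > T) ? T : pages;
        }
        else
        {
            /* table exceeds cache: past this knee, revisits cost real I/O */
            double      lim = (2.0 * T * b) / (2.0 * T - b);

            if (Ns <= lim)
                return (2.0 * T * Ns) / (2.0 * T + Ns);
            return b + (Ns - lim) * (T - b) / T;
        }
    }

Raising effective_cache_size raises b, which lowers the estimated page fetches and hence the estimated cost of large index scans.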
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 7 May 2014 13:31, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, May 7, 2014 at 3:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> If we believe that 25% of shared_buffers worth of heap blocks would >> flush the cache doing a SeqScan, why should we allow 400% of >> shared_buffers worth of index blocks? > > I think you're comparing apples and oranges. I understood the distinction, which is why I changed the direction of my thinking to say > Yes, we can make plans assuming we can use OS cache, > but we shouldn't be churning shared_buffers when we execute those > plans. and hence why I proposed > I think I'm arguing myself towards using a BufferAccessStrategy of > BAS_BULKREAD for large IndexScans, BitMapIndexScans and > BitMapHeapScans. which I hope will be effective in avoiding churn in shared_buffers even though we may use much larger memory from the OS. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes: > I think I'm arguing myself towards using a BufferAccessStrategy of > BAS_BULKREAD for large IndexScans, BitMapIndexScans and > BitMapHeapScans. As soon as you've got some hard evidence to present in favor of such changes, we can discuss it. I've got other things to do besides hypothesize. In the meantime, it seems like there is an emerging consensus that nobody much likes the existing auto-tuning behavior for effective_cache_size, and that we should revert that in favor of just increasing the fixed default value significantly. I see no problem with a value of say 4GB; that's very unlikely to be worse than the pre-9.4 default (128MB) on any modern machine. Votes for or against? regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Wed, May 7, 2014 at 9:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> I think I'm arguing myself towards using a BufferAccessStrategy of >> BAS_BULKREAD for large IndexScans, BitMapIndexScans and >> BitMapHeapScans. > > As soon as you've got some hard evidence to present in favor of such > changes, we can discuss it. I've got other things to do besides > hypothesize. Let me throw out one last point: It's pretty likely that s_b is going to be raised higher as a percentage of RAM. I never really bought into the conventional wisdom of 25% and have had to set it lower many times. Nevertheless, it was a documented suggestion. The core issues are: 1) There is no place to enter total system memory available to the database in postgresql.conf 2) Memory settings (except for the above) are given as absolute amounts, not percentages. It would be a lot easier to standardize configurations, particularly if there was a way to electronically support #1 with auto-detection. Then e_c_s, s_b, work_mem, and various other settings could be given using standard (and perhaps somewhat conservative) percentages based on the best and hopefully factually supported recommendations. I oversee dozens of servers in a virtualized environment (as most enterprise shops are these days). Everything is 'right sized', often on demand, and often nobody bothers to adjust the various settings. > In the meantime, it seems like there is an emerging consensus that nobody > much likes the existing auto-tuning behavior for effective_cache_size, > and that we should revert that in favor of just increasing the fixed > default value significantly. I see no problem with a value of say 4GB; > that's very unlikely to be worse than the pre-9.4 default (128MB) on any > modern machine. In lieu of something fancy like the above, adjusting the defaults seems a better way to go (so I vote to revert). merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-07 10:07:07 -0400, Tom Lane wrote: > In the meantime, it seems like there is an emerging consensus that nobody > much likes the existing auto-tuning behavior for effective_cache_size, > and that we should revert that in favor of just increasing the fixed > default value significantly. I see no problem with a value of say 4GB; > that's very unlikely to be worse than the pre-9.4 default (128MB) on any > modern machine. > > Votes for or against? +1 for increasing it to 4GB and remove the autotuning. I don't like the current integration into guc.c much and a new static default doesn't seem to be worse than the current autotuning. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
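For what it's worth, a fixed default is indeed a small edit to guc.c: one entry in the ConfigureNamesInt table, sketched approximately below (descriptive strings and hook slots abbreviated; 524288 pages of 8kB each = 4GB):

    {
        {"effective_cache_size", PGC_USERSET, QUERY_TUNING_COST,
            gettext_noop("Sets the planner's assumption about the size of the disk cache."),
            NULL,
            GUC_UNIT_BLOCKS,
        },
        &effective_cache_size,
        524288, 1, INT_MAX,     /* boot default, min, max, in 8kB pages */
        NULL, NULL, NULL        /* no check/assign/show hooks required */
    },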
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, May 7, 2014 at 10:12 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-05-07 10:07:07 -0400, Tom Lane wrote: >> In the meantime, it seems like there is an emerging consensus that nobody >> much likes the existing auto-tuning behavior for effective_cache_size, >> and that we should revert that in favor of just increasing the fixed >> default value significantly. I see no problem with a value of say 4GB; >> that's very unlikely to be worse than the pre-9.4 default (128MB) on any >> modern machine. >> >> Votes for or against? > > +1 for increasing it to 4GB and remove the autotuning. I don't like the > current integration into guc.c much and a new static default doesn't > seem to be worse than the current autotuning. It was my proposal originally, so I assume I'd be counted as in favor, but for the sake of clarity: +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Magnus Hagander
Date:
On Wed, May 7, 2014 at 4:12 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-05-07 10:07:07 -0400, Tom Lane wrote:
>> In the meantime, it seems like there is an emerging consensus that nobody
>> much likes the existing auto-tuning behavior for effective_cache_size,
>> and that we should revert that in favor of just increasing the fixed
>> default value significantly. I see no problem with a value of say 4GB;
>> that's very unlikely to be worse than the pre-9.4 default (128MB) on any
>> modern machine.
>>
>> Votes for or against?
>
> +1 for increasing it to 4GB and remove the autotuning. I don't like the
> current integration into guc.c much and a new static default doesn't
> seem to be worse than the current autotuning.

+1.

If we can't make the autotuning better than that, we're better off holding off on that one until we can actually figure out something better. (At which point perhaps we can reach the level where we can just remove it.. But that's all handwaving about the future of course).

Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, May 7, 2014 at 3:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> If we believe that 25% of shared_buffers worth of heap blocks would >> flush the cache doing a SeqScan, why should we allow 400% of >> shared_buffers worth of index blocks? > I think you're comparing apples and oranges. The 25% threshold is > answering the question "How big does a sequential scan have to be > before it's likely to flush so much unrelated data out of > shared_buffers that it hurts the performance of other things running > on the system?". So it's not really about whether or not things will > *fit* in the cache, but rather a judgement about at what point caching > that stuff is going to be less value than continuing to cache other > things. Also, it's specifically a judgement about shared_buffers, not > system memory. > But effective_cache_size is used to estimate the likelihood that an > index scan which accesses the same heap or index block twice will > still be in cache on the second hit, and thus need to be faulted in > only once. So this *is* a judgment about what will fit - generally > over a very short time scale. And, since bringing a page into > shared_buffers from the OS cache is much less expensive than bringing > a page into memory from disk, it's really about what will fit in > overall system memory, not just shared_buffers. Another point is that the 25% seqscan threshold actually controls some specific caching decisions, which effective_cache_size does not. Raising effective_cache_size "too high" is unlikely to result in cache thrashing; in fact I'd guess the opposite. What that would do is cause the planner to prefer indexscans over seqscans in more cases involving large tables. But if you've got a table+index that's bigger than RAM, seqscans are probably going to be worse for the OS cache than indexscans, because they're going to require bringing in more data. So I still think this whole argument is founded on shaky hypotheses with a complete lack of hard data showing that a smaller default for effective_cache_size would be better. The evidence we have points in the other direction. regards, tom lane
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andrew Dunstan
Date:
On 05/07/2014 10:12 AM, Andres Freund wrote: > On 2014-05-07 10:07:07 -0400, Tom Lane wrote: >> In the meantime, it seems like there is an emerging consensus that nobody >> much likes the existing auto-tuning behavior for effective_cache_size, >> and that we should revert that in favor of just increasing the fixed >> default value significantly. I see no problem with a value of say 4GB; >> that's very unlikely to be worse than the pre-9.4 default (128MB) on any >> modern machine. >> >> Votes for or against? > +1 for increasing it to 4GB and remove the autotuning. I don't like the > current integration into guc.c much and a new static default doesn't > seem to be worse than the current autotuning. > > +1. If we ever want to implement an auto-tuning heuristic it seems we're going to need some hard empirical evidence to support it, and that doesn't seem likely to appear any time soon. cheers andrew
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 7 May 2014 15:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> I think I'm arguing myself towards using a BufferAccessStrategy of >> BAS_BULKREAD for large IndexScans, BitMapIndexScans and >> BitMapHeapScans. > > As soon as you've got some hard evidence to present in favor of such > changes, we can discuss it. I've got other things to do besides > hypothesize. Now we have a theory to test, I'll write a patch and we can collect evidence for, or against. > In the meantime, it seems like there is an emerging consensus that nobody > much likes the existing auto-tuning behavior for effective_cache_size, > and that we should revert that in favor of just increasing the fixed > default value significantly. I see no problem with a value of say 4GB; > that's very unlikely to be worse than the pre-9.4 default (128MB) on any > modern machine. > > Votes for or against? +1 for fixed 4GB and remove the auto-tuning code. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Simon Riggs
Date:
On 7 May 2014 15:10, Merlin Moncure <mmoncure@gmail.com> wrote: > The core issues are: > 1) There is no place to enter total system memory available to the > database in postgresql.conf > 2) Memory settings (except for the above) are given as absolute > amounts, not percentages. Those sound useful starting points. The key issue for me is that effective_cache_size is a USERSET. It applies per-query, just like work_mem (though work_mem is per query node). If we had "total system memory" we wouldn't know how to divide it up amongst users since we have no functionality for "workload management". It would be very nice to be able to tell Postgres that "I have 64GB RAM, use it wisely". At present, any and all users can set effective_cache_size and work_mem to any value they please, any time they wish and thus overuse available memory. Which is why I've had to write plugins to manage the memory allocations better in userspace. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/07/2014 07:31 AM, Andrew Dunstan wrote: > +1. If we ever want to implement an auto-tuning heuristic it seems we're > going to need some hard empirical evidence to support it, and that > doesn't seem likely to appear any time soon. 4GB default it is, then. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/06/2014 10:35 PM, Peter Geoghegan wrote: > +1. In my view, we probably should have set it to a much higher > absolute default value. The main problem with setting it to any > multiple of shared_buffers that I can see is that shared_buffers is a > very poor proxy for what effective_cache_size is supposed to > represent. In general, the folk wisdom around sizing shared_buffers > has passed its sell-by date. Unfortunately nobody has the time/resources to do the kind of testing required for a new recommendation for shared_buffers. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 11:04 AM, Josh Berkus <josh@agliodbs.com> wrote: > Unfortunately nobody has the time/resources to do the kind of testing > required for a new recommendation for shared_buffers. I meant to suggest that the buffer manager could be improved to the point that the old advice becomes obsolete. Right now, it's much harder to analyze shared_buffers than it should be, presumably because of the problems with the buffer manager. I think that if we could formulate better *actionable* advice around what we have right now, that would have already happened. We ought to be realistic about the fact that the current recommendations around sizing shared_buffers are nothing more than folk wisdom. That's the best we have right now, but that seems quite unsatisfactory to me. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Tue, May 6, 2014 at 9:55 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-05-06 17:43:45 +0100, Simon Riggs wrote:
>> All this changes is the cost of
>> IndexScans that would use more than 25% of shared_buffers worth of
>> data. Hopefully not many of those in your workload. Changing the cost
>> doesn't necessarily prevent index scans either. And if there are many
>> of those in your workload AND you run more than one at same time, then
>> the larger setting will work against you. So the benefit window for
>> such a high setting is slim, at best.

Not only do you need to run more than one at a time, but they also must use mostly disjoint sets of data, in order for the larger estimate to be bad.

> Why? There's many workloads where indexes are larger than shared buffers
> but fit into the operating system's cache. And that's precisely what
> effective_cache_size is about.

It is more about the size of the table referenced by the index, rather than the size of the index. The point is that doing a large index scan might lead you to visit the same table blocks repeatedly within quick succession. (If a small index scan is on the inner side of a nested loop, then you might access the same index leaf blocks and the same table blocks repeatedly--that is why it is only mostly about the table size, rather than exclusively.)
Cheers,
Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Wed, May 7, 2014 at 1:13 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, May 7, 2014 at 11:04 AM, Josh Berkus <josh@agliodbs.com> wrote: >> Unfortunately nobody has the time/resources to do the kind of testing >> required for a new recommendation for shared_buffers. > > I meant to suggest that the buffer manager could be improved to the > point that the old advice becomes obsolete. Right now, it's much > harder to analyze shared_buffers than it should be, presumably because > of the problems with the buffer manager. I think that if we could > formulate better *actionable* advice around what we have right now, > that would have already happened. > > We ought to be realistic about the fact that the current > recommendations around sizing shared_buffers are nothing more than > folk wisdom. That's the best we have right now, but that seems quite > unsatisfactory to me. I think the stock advice is worse than nothing because it is A. based on obsolete assumptions and B. doesn't indicate what the tradeoffs are or what kinds of symptoms adjusting the setting could alleviate. The documentation should be reduced to things that are known, for example:

*) raising shared buffers does not 'give more memory to postgres for caching' -- it can only reduce it via double paging
*) hits in shared buffers are generally somewhat faster than faults to o/s buffers
*) s_b larger than the working dataset size can be a good configuration, especially for read-only loads
*) large settings have bad interplay with the o/s in some configurations
*) shared buffers can reduce write i/o in certain workloads
*) interplay with checkpoints
*) shared buffers have different mechanisms for managing contention than o/s buffers

merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote: > On Wed, May 7, 2014 at 1:13 PM, Peter Geoghegan <pg@heroku.com> wrote: > > On Wed, May 7, 2014 at 11:04 AM, Josh Berkus <josh@agliodbs.com> wrote: > >> Unfortunately nobody has the time/resources to do the kind of testing > >> required for a new recommendation for shared_buffers. > > > > I meant to suggest that the buffer manager could be improved to the > > point that the old advice becomes obsolete. Right now, it's much > > harder to analyze shared_buffers than it should be, presumably because > > of the problems with the buffer manager. I think that if we could > > formulate better *actionable* advice around what we have right now, > > that would have already happened. > > > > We ought to be realistic about the fact that the current > > recommendations around sizing shared_buffers are nothing more than > > folk wisdom. That's the best we have right now, but that seems quite > > unsatisfactory to me. > > I think the stock advice is worse then nothing because it is A. based > on obsolete assumptions and B. doesn't indicate what the tradeoffs are > or what kinds of symptoms adjusting the setting could alleviate. The > documentation should be reduced to things that are known, for example: > > *) raising shared buffers does not 'give more memory to postgres for > caching' -- it can only reduce it via double paging That's absolutely not a necessary consequence. If pages are in s_b for a while the OS will be perfectly happy to throw them away. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/07/2014 11:13 AM, Peter Geoghegan wrote: > We ought to be realistic about the fact that the current > recommendations around sizing shared_buffers are nothing more than > folk wisdom. That's the best we have right now, but that seems quite > unsatisfactory to me. So, as one of several people who put literally hundreds of hours into the original benchmarking which established the sizing recommendations for shared_buffers (and other settings), I find the phrase "folk wisdom" personally offensive. So, can we stop with this? Otherwise, I don't think I can usefully participate in this discussion. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> *) raising shared buffers does not 'give more memory to postgres for >> caching' -- it can only reduce it via double paging > > That's absolutely not a necessary consequence. If pages are in s_b for a > while the OS will be perfectly happy to throw them away. The biggest problem with double buffering is not that it wastes memory. Rather, it's that it wastes memory bandwidth. I think that lessening that problem will be the major benefit of making larger shared_buffers settings practical. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-07 11:45:04 -0700, Peter Geoghegan wrote: > On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com> wrote: > >> *) raising shared buffers does not 'give more memory to postgres for > >> caching' -- it can only reduce it via double paging > > > > That's absolutely not a necessary consequence. If pages are in s_b for a > > while the OS will be perfectly happy to throw them away. > > The biggest problem with double buffering is not that it wastes > memory. Rather, it's that it wastes memory bandwidth. Doesn't match my experience. Even with the current buffer manager there's usually enough locality to keep important pages in s_b for a meaningful time. I *have* seen workloads that should have fit into memory not fit because of double buffering. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, May 7, 2014 at 2:40 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 05/07/2014 11:13 AM, Peter Geoghegan wrote: >> We ought to be realistic about the fact that the current >> recommendations around sizing shared_buffers are nothing more than >> folk wisdom. That's the best we have right now, but that seems quite >> unsatisfactory to me. > > So, as one of several people who put literally hundreds of hours into > the original benchmarking which established the sizing recommendations > for shared_buffers (and other settings), I find the phrase "folk wisdom" > personally offensive. So, can we stop with this? > > Otherwise, I don't think I can usefully participate in this discussion. +1. I think it is quite accurate to say that we can't predict precisely what value of shared_buffers will perform best for a particular workload and on a particular system. There are people out there using very large values and very small ones, according to what they have found most effective. But that does not mean, as the phrase "folk wisdom" might be taken to imply, that we don't know anything at all about what actually works well in practice. Because we do know quite a bit about that. I and people I work with have been able to improve performance greatly on many systems by providing guidance based on what this community has been able to understand on this topic, and dismissing it as rubbish is wrong. Also, I seriously doubt that a one-size-fits-all guideline about setting shared_buffers will ever be right for every workload. Workloads, by their nature, are complex beasts. The size of the workload varies, and which portions of it are how hot vary, and the read-write mix varies, and those are not problems with PostgreSQL; those are problems with data. That is not to say that we can't do anything to make PostgreSQL work better across a wider range of settings for shared_buffers, but it is to say that no matter how much work we do on the code, setting this optimally for every workload will probably remain complex. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, May 7, 2014 at 2:49 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-05-07 11:45:04 -0700, Peter Geoghegan wrote: >> On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> >> *) raising shared buffers does not 'give more memory to postgres for >> >> caching' -- it can only reduce it via double paging >> > >> > That's absolutely not a necessary consequence. If pages are in s_b for a >> > while the OS will be perfectly happy to throw them away. >> >> The biggest problem with double buffering is not that it wastes >> memory. Rather, it's that it wastes memory bandwidth. > > Doesn't match my experience. Even with the current buffer manager > there's usually enough locality to keep important pages in s_b for a > meaningful time. I *have* seen workloads that should have fit into > memory not fit because of double buffering. Same here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 11:40 AM, Josh Berkus <josh@agliodbs.com> wrote: > So, as one of several people who put literally hundreds of hours into > the original benchmarking which established the sizing recommendations > for shared_buffers (and other settings), I find the phrase "folk wisdom" > personally offensive. So, can we stop with this? I have also put a lot of time into benchmarking. No personal offence was intended, and I'm glad that we have some advice to give to users, but the fact of the matter is that current *official* recommendations are very vague. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 11:50 AM, Robert Haas <robertmhaas@gmail.com> wrote: > But that does not mean, as the phrase "folk > wisdom" might be taken to imply, that we don't know anything at all > about what actually works well in practice. Folk wisdom doesn't imply that. It implies that we think this works, and we may well be right, but there isn't all that much rigor behind some of it. I'm not blaming anyone for this state of affairs. I've heard plenty of people repeat the "don't exceed 8GB" rule - I regularly repeated it myself. I cannot find any rigorous defense of this, though. If you're aware of one, please point it out to me. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/07/2014 11:52 AM, Peter Geoghegan wrote: > On Wed, May 7, 2014 at 11:40 AM, Josh Berkus <josh@agliodbs.com> wrote: >> So, as one of several people who put literally hundreds of hours into >> the original benchmarking which established the sizing recommendations >> for shared_buffers (and other settings), I find the phrase "folk wisdom" >> personally offensive. So, can we stop with this? > > I have also put a lot of time into benchmarking. No personal offence > was intended, and I'm glad that we have some advice to give to users, > but the fact of the matter is that current *official* recommendations > are very vague. Well, they should be vague; the only hard data we have is rather out-of-date (I think 8.2 was our last set of tests). If we gave users specific, detailed recommendations, we'd be misleading them. For that matter, our advice on shared_buffers ... and our design for it ... is going to need to change radically soon, since Linux is getting an ARC with a frequency cache as well as a recency cache, and FreeBSD and OpenSolaris already have them. FWIW, if someone could fund me for a month, I'd be happy to create a benchmarking setup where we could test these kinds of things; I have pretty clear ideas how to build one. I imagine some of our other consultants could make the same offer. However, it's too much work for anyone to get done "in their spare time". -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 11:50 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Doesn't match my experience. Even with the current buffer manager >> there's usually enough locality to keep important pages in s_b for a >> meaningful time. I *have* seen workloads that should have fit into >> memory not fit because of double buffering. > > Same here. I think that it depends on whether or not you're thinking about the worst case. Most people are not going to be in the category you describe here. Plenty of people in the Postgres community run with very large shared_buffers settings, on non i/o bound workloads, and report good results - often massive, quickly apparent improvements. I'm mostly concerned with obsoleting the 8GB hard ceiling rule here. It probably doesn't matter whether and by how much one factor is worse than the other, though. I found the section "5.2 Temporal Control: Buffering" in the following paper, that speaks about the subject quite interesting: http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Wed, May 7, 2014 at 2:58 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, May 7, 2014 at 11:50 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> But that does not mean, as the phrase "folk >> wisdom" might be taken to imply, that we don't know anything at all >> about what actually works well in practice. > > Folk wisdom doesn't imply that. It implies that we think this works, > and we may well be right, but there isn't all that much rigor behind > some of it. I'm not blaming anyone for this state of affairs. I've > heard plenty of people repeat the "don't exceed 8GB" rule - I > regularly repeated it myself. I cannot find any rigorous defense of > this, though. If you're aware of one, please point it out to me. I'm not sure the level of rigor you'd like to see is going to be available here. Complex systems have complex behavior; that's life. At any rate, I'm not aware of any rigorous defense of the "don't exceed 8GB" rule. But, #1, I'd never put it that simply. What I've found is more like this: If it's possible to size shared_buffers so that the working set fits entirely within shared_buffers, that configuration is worthy of strong consideration. Otherwise, you probably want to keep shared_buffers low in order to avoid checkpoint-related I/O spikes and minimize double buffering; try 25% of system memory up to 512MB on Windows or up to 2GB on 32-bit Linux or up to 8GB on 64-bit Linux for starters, and then tune based on your workload. And #2, I think the origin of the 8GB number on 64-bit non-Windows systems is that people found that checkpoint-related I/O spikes became intolerable when you went too much above that number. On some systems, the threshold is lower than that - for example, I believe Merlin and others have reported numbers more like 2GB than 8GB - and on other systems, the threshold is higher - indeed, some people go way higher and never hit it at all. I agree that it would be nice to better-characterize why different users hit it at different levels, but it's probably highly dependent on hardware, workload, and kernel version, so I tend to doubt it can be characterized very simply. If I had to guess, I'd bet that fixing Linux's abominable behavior around the fsync() call would probably go a long way toward making higher values of shared_buffers more practical. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
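Robert's starter rule, encoded literally just to make the arithmetic plain (the caps and the 25% figure are his rules of thumb, not anything PostgreSQL computes itself; the function and enum are invented for illustration):

    typedef enum { OS_WINDOWS, OS_LINUX32, OS_LINUX64 } OsClass;

    static long long
    starter_shared_buffers(long long total_ram_bytes, OsClass os)
    {
        long long   candidate = total_ram_bytes / 4;    /* 25% of RAM */
        long long   cap;

        switch (os)
        {
            case OS_WINDOWS: cap = 512LL << 20; break;  /* 512MB */
            case OS_LINUX32: cap = 2LL << 30; break;    /* 2GB */
            default: cap = 8LL << 30; break;            /* 8GB, 64-bit */
        }
        return (candidate < cap) ? candidate : cap;     /* then tune */
    }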
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Wed, May 7, 2014 at 11:04 AM, Josh Berkus <josh@agliodbs.com> wrote:
> On 05/06/2014 10:35 PM, Peter Geoghegan wrote:
>> +1. In my view, we probably should have set it to a much higher
>> absolute default value. The main problem with setting it to any
>> multiple of shared_buffers that I can see is that shared_buffers is a
>> very poor proxy for what effective_cache_size is supposed to
>> represent. In general, the folk wisdom around sizing shared_buffers
>> has passed its sell-by date.
>
> Unfortunately nobody has the time/resources to do the kind of testing
> required for a new recommendation for shared_buffers.
I think it is worse than that. I don't think we know what such testing would even look like. SSD? BBU? max_connections=20000 with 256 cores? pgbench -N? capture and replay of Amazon's workload?
If we could spell out/agree upon what kind of testing we would find convincing, that would probably go a long way to getting some people to work on carrying out the tests. Unless the conclusion was "please have 3TB of RAM and a 50 disk RAID", in which case there might be few takers.
Cheers,
Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Josh Berkus
Date:
On 05/07/2014 01:36 PM, Jeff Janes wrote: > On Wed, May 7, 2014 at 11:04 AM, Josh Berkus <josh@agliodbs.com> wrote: >> Unfortunately nobody has the time/resources to do the kind of testing >> required for a new recommendation for shared_buffers. > I think it is worse than that. I don't think we know what such testing > would even look like. SSD? BBU? max_connections=20000 with 256 cores? > pgbench -N? capture and replay of Amazon's workload? > > If we could spell out/agree upon what kind of testing we would find > convincing, that would probably go a long way to getting some people to > work on carrying out the tests. Unless the conclusion was "please have 3TB > or RAM and a 50 disk RAID", then there might be few takers. Well, step #1 would be writing some easy-to-run benchmarks which carry out selected workloads and measure response times. The minimum starting set would include one OLTP/Web benchmark, and one DW benchmark. I'm not talking about the software to run the workload; we have that, in several varieties. I'm talking about the actual database generator and queries to run. That's the hard work. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Jeff Janes
Date:
On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
>> *) raising shared buffers does not 'give more memory to postgres for
>> caching' -- it can only reduce it via double paging
>
> That's absolutely not a necessary consequence. If pages are in s_b for a
> while the OS will be perfectly happy to throw them away.

Is that an empirical observation? I've run some simulations a couple years ago, and also wrote some instrumentation to test that theory under favorably engineered (but still plausible) conditions, and couldn't get more than a small fraction of s_b to be so tightly bound in that the kernel could forget about them. Unless of course the entire workload or close to it fits in s_b.
Cheers,
Jeff
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-07 13:51:57 -0700, Jeff Janes wrote: > On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com>wrote: > > > On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote: > > > > > > *) raising shared buffers does not 'give more memory to postgres for > > > caching' -- it can only reduce it via double paging > > > > That's absolutely not a necessary consequence. If pages are in s_b for a > > while the OS will be perfectly happy to throw them away. > > > > Is that an empirical observation? Yes. > I've run some simulations a couple years > ago, and also wrote some instrumentation to test that theory under > favorably engineered (but still plausible) conditions, and couldn't get > more than a small fraction of s_b to be so tightly bound in that the kernel > could forget about them. Unless of course the entire workload or close to > it fits in s_b. I think it depends on your IO access patterns. If the whole working set fits into the kernel's page cache and there's no other demand for pages it will stay in. If you constantly rewrite most all your pages they'll also stay in the OS cache because they'll get written out. If the churn in shared_buffers is so high (because it's so small in comparison to the core hot data set) that there'll be dozens if not hundreds clock sweeps a second you'll also have no locality. It's also *hugely* kernel version specific :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Merlin Moncure
Date:
On Wed, May 7, 2014 at 4:15 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-05-07 13:51:57 -0700, Jeff Janes wrote: >> On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com>wrote: >> >> > On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote: >> > > >> > > *) raising shared buffers does not 'give more memory to postgres for >> > > caching' -- it can only reduce it via double paging >> > >> > That's absolutely not a necessary consequence. If pages are in s_b for a >> > while the OS will be perfectly happy to throw them away. >> > >> >> Is that an empirical observation? > > Yes. > >> I've run some simulations a couple years >> ago, and also wrote some instrumentation to test that theory under >> favorably engineered (but still plausible) conditions, and couldn't get >> more than a small fraction of s_b to be so tightly bound in that the kernel >> could forget about them. Unless of course the entire workload or close to >> it fits in s_b. > > I think it depends on your IO access patterns. If the whole working set > fits into the kernel's page cache and there's no other demand for pages > it will stay in. If you constantly rewrite most all your pages they'll > also stay in the OS cache because they'll get written out. If the churn > in shared_buffers is so high (because it's so small in comparison to the > core hot data set) that there'll be dozens if not hundreds clock sweeps > a second you'll also have no locality. > It's also *hugely* kernel version specific :( right. This is, IMNSHO, exactly the sort of language that belongs in the docs. merlin
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 2:24 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > right. This is, IMNSHO, exactly the sort of language that belongs in the docs. +1 -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Andres Freund
Date:
On 2014-05-07 16:24:53 -0500, Merlin Moncure wrote: > On Wed, May 7, 2014 at 4:15 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-05-07 13:51:57 -0700, Jeff Janes wrote: > >> On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres@2ndquadrant.com>wrote: > >> > >> > On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote: > >> > > > >> > > *) raising shared buffers does not 'give more memory to postgres for > >> > > caching' -- it can only reduce it via double paging > >> > > >> > That's absolutely not a necessary consequence. If pages are in s_b for a > >> > while the OS will be perfectly happy to throw them away. > >> > > >> > >> Is that an empirical observation? > > > > Yes. > > > >> I've run some simulations a couple years > >> ago, and also wrote some instrumentation to test that theory under > >> favorably engineered (but still plausible) conditions, and couldn't get > >> more than a small fraction of s_b to be so tightly bound in that the kernel > >> could forget about them. Unless of course the entire workload or close to > >> it fits in s_b. > > > > I think it depends on your IO access patterns. If the whole working set > > fits into the kernel's page cache and there's no other demand for pages > > it will stay in. If you constantly rewrite most all your pages they'll > > also stay in the OS cache because they'll get written out. If the churn > > in shared_buffers is so high (because it's so small in comparison to the > > core hot data set) that there'll be dozens if not hundreds clock sweeps > > a second you'll also have no locality. > > It's also *hugely* kernel version specific :( > > right. This is, IMNSHO, exactly the sort of language that belongs in the docs. Well, that's just the tip of the iceberg though. Whether you can accept small shared_buffers to counteract double buffering or not is also a hard to answer question... That again heavily depends on the usage patterns. If you have high concurrency and your working set has some locality it's very important to have a high s_b lest you fall afoul of the freelist lock. If you have high concurrency but 90%+ of your page lookups *aren't* going to be in the cache you need to be very careful with a large s_b because the clock sweeps to lower the usagecounts can enlarge the lock contention. Then there's both memory and cache efficiency questions around both the PrivateRefCount array and the lwlocks.... In short: I think it's pretty hard to transfer this into language that's a) agreed upon b) understandable to someone that hasn't discovered several of the facts for him/herself. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Peter Geoghegan
Date:
On Wed, May 7, 2014 at 12:06 PM, Josh Berkus <josh@agliodbs.com> wrote: > For that matter, our advice on shared_buffers ... and our design for it > ... is going to need to change radically soon, since Linux is getting an > ARC with a frequency cache as well as a recency cache, and FreeBSD and > OpenSolaris already have them. I knew about ZFS, but Linux is implementing ARC? There are good reasons to avoid ARC. CAR seems like a more plausible candidate, since it apparently acknowledges ARC's shortcomings and fixes them. -- Peter Geoghegan
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Tue, May 6, 2014 at 11:15:17PM +0100, Simon Riggs wrote: > > Well, for what it's worth, I've encountered systems where setting > > effective_cache_size too low resulted in bad query plans, but I've > > never encountered the reverse situation. > > I agree with that. > > Though that misses my point, which is that you can't know that all of > that memory is truly available on a server with many concurrent users. > Choosing settings that undercost memory intensive plans are not the > best choice for a default strategy in a mixed workload when cache may > be better used elsewhere, even if such settings make sense for some > individual users. This is the same problem we had with auto-tuning work_mem, in that we didn't know what other concurrent activity was happening. Seems we need concurrent activity detection before auto-tuning work_mem and effective_cache_size. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Amit Langote
Date:
On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian <bruce@momjian.us> wrote: > > This is the same problem we had with auto-tuning work_mem, in that we > didn't know what other concurrent activity was happening. Seems we need > concurrent activity detection before auto-tuning work_mem and > effective_cache_size. > Perhaps I am missing something obvious here, but would mmgr have any useful numbers on this? Like any book-keeping info maintained by mcxt.c/aset.c? Would extending that interface help? -- Amit
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote: > On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian <bruce@momjian.us> wrote: > > > > This is the same problem we had with auto-tuning work_mem, in that we > > didn't know what other concurrent activity was happening. Seems we need > > concurrent activity detection before auto-tuning work_mem and > > effective_cache_size. > > > > Perhaps I am missing something obvious here, but would mmgr have any > useful numbers on this? Like any book-keeping info maintained by > mcxt.c/aset.c? Would extending that interface help? No, all memory allocation is per-process, except for shared memory. We probably need a way to record our large local memory allocations in PGPROC that other backends can see; same for effective cache size assumptions we make. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Amit Langote
Date:
On Thu, May 15, 2014 at 11:24 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote: >> On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian <bruce@momjian.us> wrote: >> > >> > This is the same problem we had with auto-tuning work_mem, in that we >> > didn't know what other concurrent activity was happening. Seems we need >> > concurrent activity detection before auto-tuning work_mem and >> > effective_cache_size. >> > >> >> Perhaps I am missing something obvious here, but would mmgr have any >> useful numbers on this? Like any book-keeping info maintained by >> mcxt.c/aset.c? Would extending that interface help? > > No, all memory allocat is per-process, except for shared memory. We > probably need a way to record our large local memory allocations in > PGPROC that other backends can see; same for effective cache size > assumptions we make. > I see. I thought there would be some centralised way to traverse, say, a linked list of contexts that individual backends create or something like that. But, I suppose it would not be straightforward to make any of that work for what we are after here. -- Amit
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Bruce Momjian
Date:
On Thu, May 15, 2014 at 11:36:51PM +0900, Amit Langote wrote: > > No, all memory allocat is per-process, except for shared memory. We > > probably need a way to record our large local memory allocations in > > PGPROC that other backends can see; same for effective cache size > > assumptions we make. > > > > I see. I thought there would be some centralised way to traverse, say, > a linked list of contexts that individual backends create or something > like that. But, I suppose it would not be straightforward to make any > of that work for what we are after here. The problem is locking overhead between sessions. Right now we avoid all of that, and I think if we just put the value in PGPROC, it will be good enough with limited locking required. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
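A sketch of what Bruce describes; the largeLocalMem field is invented for illustration (PGPROC has no such member today), and the point is that each backend writes only its own slot, so readers get a lock-free if slightly stale view:

    #include "storage/proc.h"

    /* Imagined addition to struct PGPROC (hypothetical field):
     *     Size    largeLocalMem;   -- bytes of large local allocations
     */

    static void
    advertise_local_memory(Size nbytes)
    {
        MyProc->largeLocalMem = nbytes;     /* hypothetical field */
    }

    /* An auto-tuner for work_mem or effective_cache_size could then walk
     * the proc array, sum everyone's largeLocalMem, and subtract that
     * from the memory it assumes is available. */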
Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
From
Robert Haas
Date:
On Thu, May 15, 2014 at 8:06 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Tue, May 6, 2014 at 11:15:17PM +0100, Simon Riggs wrote: >> > Well, for what it's worth, I've encountered systems where setting >> > effective_cache_size too low resulted in bad query plans, but I've >> > never encountered the reverse situation. >> >> I agree with that. >> >> Though that misses my point, which is that you can't know that all of >> that memory is truly available on a server with many concurrent users. >> Choosing settings that undercost memory intensive plans are not the >> best choice for a default strategy in a mixed workload when cache may >> be better used elsewhere, even if such settings make sense for some >> individual users. > > This is the same problem we had with auto-tuning work_mem, in that we > didn't know what other concurrent activity was happening. Seems we need > concurrent activity detection before auto-tuning work_mem and > effective_cache_size. I think it's worse than that: we don't even know what else is happening *in the same query*. For example, look at this: http://www.postgresql.org/message-id/16161.1324414006@sss.pgh.pa.us That's pretty awful, and it's just one example of a broader class of problems that we haven't even tried to solve. We really need a way to limit memory usage on a per-query basis rather than a per-node basis. For example, consider a query plan that needs to do four sorts. If work_mem = 64MB, we'll happily use 256MB total, 64MB for each sort. Now, that might cause the system to swap: since there are four sorts, maybe we ought to have used only 16MB per sort, and switched to a heap sort if that wasn't enough. But it's even subtler than that: if we had known when building the query plan that we only had 16MB per sort rather than 64MB per sort, we would potentially have estimated higher costs for those sorts in the first place, which might have led to a different plan that needed fewer sorts to begin with. When you start to try to balance memory usage across multiple backends, things get even more complicated. If the first query that starts up is allowed to use all the available memory, and we respond to that by lowering the effective value of work_mem to something very small, a second query that shows up a bit later might choose a very inefficient plan as a result. That in turn might cause heavy I/O load on the system for a long time, making the first query run very slowly. We might have been better off just letting the first query finish, and then running the second one (with a much better plan) after it was done. Or, maybe we should have only let the first query take a certain fraction (half? 10%?) of the available memory, so that there was more left for the second guy. But that could be wrong too - it might cause the first plan to be unnecessarily inefficient when nobody was planning to run any other queries anyway. Plus, DBAs hate it when plans change on them unexpectedly, so anything that involves a feedback loop between current utilization and query plans will be unpopular with some people for that reason. These are hard problems. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
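To make the arithmetic in Robert's four-sort example concrete (numbers from his message; the per-query budgeting scheme itself is hypothetical):

    #include <stdio.h>

    int
    main(void)
    {
        const int   work_mem_mb = 64;   /* per-node limit, as today */
        const int   nsorts = 4;         /* sorts in the example plan */

        /* Per-node budgeting: every sort may claim the full allowance. */
        printf("per-node:  up to %d MB total\n", nsorts * work_mem_mb);

        /* Hypothetical per-query budget: one allowance split across nodes,
         * which would also feed back into the planner's cost estimates. */
        printf("per-query: %d MB per sort\n", work_mem_mb / nsorts);
        return 0;
    }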