Thread: bgwriter, regression tests, and default shared_buffers settings
After fixing the assorted breakage discussed yesterday, I still wasn't seeing any ForwardFsyncRequest requests coming from the bgwriter during a regression test run, which made me wonder if there was yet another bug.  What I find is that because of the recent increase in the out-of-the-box shared_buffers setting to 128MB, the regression database fits comfortably in shared buffers (its total footprint appears to be about 41MB at the moment).  This means that the shared-buffer freelist never becomes empty, so StrategyGetBuffer never advances the clock sweep pointer, so after one pass over the buffer pool BgBufferSync decides that it's lapped the clock sweep and never does anything more.

In short, then, the background writer process is entirely useless for any database that fits completely into shared buffers.  The only background writes that get generated in such a case are from the checkpoint sweep, and AFAICT the "backend buffer writes" that get counted by ForwardFsyncRequest are not true buffer writes but mdextend calls.  (Which likely explains why their number is so consistent over repeated regression test runs --- the variance is well under 1% for me.)

So that raises two independent sets of questions:

1. Do we like the fact that the bgwriter isn't doing anything in this situation?  It seems arguably OK for writes to happen only for checkpointing purposes if there is no memory pressure.  But having the bgwriter wake up 5 times a second to decide it has nothing to do seems a bit wasteful.  I'm inclined to think maybe it should go into the recently added "hibernation mode" anytime the buffer freelist isn't empty.  Or maybe you could argue that this scenario isn't worth any optimization effort, but with many-gig RAM becoming more and more common, I don't think I agree.

2. It's rather disturbing that a fairly large swath of functionality just stopped getting tested at all by the buildfarm.  Do we want to rethink the shared_buffers increase?  Or artificially bloat the regression database to make it larger than 128MB?  Or do something else to ensure we still exercise the DB-bigger-than-buffers case?

			regards, tom lane
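For concreteness, here is a much-simplified, self-contained toy model of the freelist / clock-sweep interaction described above.  This is not PostgreSQL source; all names, sizes, and helper functions are invented for illustration.  The point is only that while the working set fits in the pool, victim buffers come off the freelist, the sweep pointer never moves, and a background cleaner keyed to sweep movement concludes it has nothing to do:

    #include <stdio.h>
    #include <stdbool.h>

    #define NBUFFERS 16     /* stands in for shared_buffers */
    #define NPAGES   10     /* working set smaller than the buffer pool */

    static int  buffer_page[NBUFFERS];  /* page held by each buffer, -1 = free */
    static int  freelist_next = 0;      /* buffers 0..NBUFFERS-1 start out free */
    static int  next_victim = 0;        /* clock-sweep position */
    static long complete_passes = 0;    /* full sweeps completed */

    /* Hand out a victim buffer: freelist first, clock sweep only when empty. */
    static int
    get_victim_buffer(void)
    {
        if (freelist_next < NBUFFERS)
            return freelist_next++;     /* note: next_victim is NOT advanced */

        /* freelist exhausted: run the clock sweep */
        int buf = next_victim;
        if (++next_victim >= NBUFFERS)
        {
            next_victim = 0;
            complete_passes++;
        }
        return buf;
    }

    /* Fault a page into the pool if it isn't already there. */
    static void
    read_page(int page)
    {
        for (int i = 0; i < NBUFFERS; i++)
            if (buffer_page[i] == page)
                return;                 /* hit: nothing allocated or evicted */
        buffer_page[get_victim_buffer()] = page;
    }

    int
    main(void)
    {
        for (int i = 0; i < NBUFFERS; i++)
            buffer_page[i] = -1;

        long prev_passes = 0;
        int  prev_victim = 0;

        for (int cycle = 0; cycle < 5; cycle++)
        {
            for (int page = 0; page < NPAGES; page++)
                read_page(page);

            /* The background cleaner's test: did the sweep move since last wakeup? */
            bool moved = (complete_passes != prev_passes ||
                          next_victim != prev_victim);
            printf("cycle %d: sweep moved: %s -> cleaner %s\n",
                   cycle, moved ? "yes" : "no",
                   moved ? "has work" : "does nothing");
            prev_passes = complete_passes;
            prev_victim = next_victim;
        }
        return 0;
    }

Run it and every cycle reports "does nothing", including the first one: all allocations were satisfied from the freelist, the sweep pointer stayed put, and so the cleaner never finds work --- which is essentially the situation the 128MB default puts the regression run in.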
On 07/18/2012 03:59 PM, Tom Lane wrote:
> 2. It's rather disturbing that a fairly large swath of functionality
> just stopped getting tested at all by the buildfarm.  Do we want to
> rethink the shared_buffers increase?  Or artificially bloat the
> regression database to make it larger than 128MB?  Or do something else
> to ensure we still exercise the DB-bigger-than-buffers case?

A couple of other ideas:

The buildfarm does have the ability to set config data after initdb has run (which I just enhanced in the latest release).  So a buildfarm owner could add a config line for shared_buffers which would override what initdb had set.

Or we could provide an initdb flag which would set an upper bound on shared_buffers, and have make check (at least) use it.

I'd rather not bloat the regression database if we can reasonably avoid it.  Buildfarm members are often tight on space.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> The buildfarm does have the ability to set config data after initdb has
> run (which I just enhanced in the latest release). So a buildfarm owner
> could add a config line for shared_buffers which would override what
> initdb had set.

> Or we could provide an initdb flag which would set an upper bound on
> shared_buffers, and have make check (at least) use it.

> I'd rather not bloat the regression database if we can reasonably avoid
> it. Buildfarm members are often tight on space.

Agreed on not wanting to bloat the regression DB just for this reason.  We see enough "out of disk space" failures already in the buildfarm.

I like the idea of modifying make check only, because then a typical buildfarm run could exercise both DB-smaller-than-buffers (in the installcheck case) and DB-larger-than-buffers (in make check).

			regards, tom lane
On 07/18/2012 05:37 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> The buildfarm does have the ability to set config data after initdb has
>> run (which I just enhanced in the latest release). So a buildfarm owner
>> could add a config line for shared_buffers which would override what
>> initdb had set.
>> Or we could provide an initdb flag which would set an upper bound on
>> shared_buffers, and have make check (at least) use it.
>> I'd rather not bloat the regression database if we can reasonably avoid
>> it. Buildfarm members are often tight on space.
> Agreed on not wanting to bloat the regression DB just for this reason.
> We see enough "out of disk space" failures already in the buildfarm.
>
> I like the idea of modifying make check only, because then a typical
> buildfarm run could exercise both DB-smaller-than-buffers (in the
> installcheck case) and DB-larger-than-buffers (in make check).

Agreed.  Something like --max_shared_buffers=32MB for initdb, plus code to use it in pg_regress, should fit the bill.

cheers

andrew
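The --max_shared_buffers option is only a proposal at this point.  As a rough, self-contained sketch of the idea, assuming initdb keeps its existing strategy of probing a descending list of candidate shared_buffers settings until one works, a ceiling could simply cause the larger candidates to be skipped.  The candidate values, the server_starts_with() stub, and the function names below are all invented for illustration; this is not actual initdb code or a patch:

    #include <stdio.h>
    #include <stdbool.h>

    /* Candidate shared_buffers values (in 8kB pages); values are illustrative. */
    static const int trial_bufs[] =
        {16384, 8192, 4096, 3072, 2048, 1024, 512, 400, 300, 200, 100, 50};

    /* Stand-in for initdb's trial postmaster start; here we just pretend the
     * machine can cope with anything up to 4096 pages (32MB). */
    static bool
    server_starts_with(int nbuffers)
    {
        return nbuffers <= 4096;
    }

    /* Pick the largest workable setting no greater than the ceiling (if any). */
    static int
    choose_shared_buffers(int max_shared_buffers)
    {
        int ncand = (int) (sizeof(trial_bufs) / sizeof(trial_bufs[0]));
        int last = trial_bufs[0];

        for (int i = 0; i < ncand; i++)
        {
            last = trial_bufs[i];
            if (max_shared_buffers > 0 && last > max_shared_buffers)
                continue;               /* honor the ceiling, keep falling back */
            if (server_starts_with(last))
                return last;
        }
        return last;                    /* smallest candidate as a last resort */
    }

    int
    main(void)
    {
        printf("no ceiling  -> %d pages\n", choose_shared_buffers(-1));
        printf("ceiling 512 -> %d pages\n", choose_shared_buffers(512));
        return 0;
    }

The pg_regress side of the proposal would then be for the make check path to add the ceiling flag to the initdb command it already runs for the temporary installation, so that make check exercises the DB-bigger-than-buffers case while installcheck against a stock installation keeps exercising the big-buffers case.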
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org]
> On Behalf Of Tom Lane

> So that raises two independent sets of questions:

> 1. Do we like the fact that the bgwriter isn't doing anything in this
> situation?  It seems arguably OK for writes to happen only for
> checkpointing purposes if there is no memory pressure.  But having the
> bgwriter wake up 5 times a second to decide it has nothing to do seems
> a bit wasteful.  I'm inclined to think maybe it should go into the
> recently added "hibernation mode" anytime the buffer freelist isn't
> empty.  Or maybe you could argue that this scenario isn't worth any
> optimization effort, but with many-gig RAM becoming more and more
> common, I don't think I agree.

I also believe it should go into "hibernation mode" if the freelist is not exhausted, or perhaps even once the freelist is more than some threshold (say 50%) used.

I have one doubt about this approach.  As I understand it, after going into hibernation the bgwriter should ideally be woken up either at the timeout or when the freelist is exhausted.  However, the current code wakes it up before checking whether the buffer allocation can be satisfied from the freelist.

With Regards,
Amit Kapila.
On Wed, Jul 18, 2012 at 3:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In short, then, the background writer process is entirely useless for
> any database that fits completely into shared buffers.

Or to phrase that a bit more positively, there's no reason to do a bunch of unnecessary writes if we are lucky enough to encounter the happy situation where the database fits in shared buffers.  The background writer's reason for existence is to make buffer eviction faster by cleaning buffers that will soon be evicted, so if we're not going to evict any buffers then we needn't clean them either (except at checkpoint time).

> So that raises two independent sets of questions:
>
> 1. Do we like the fact that the bgwriter isn't doing anything in this
> situation?  It seems arguably OK for writes to happen only for
> checkpointing purposes if there is no memory pressure.  But having the
> bgwriter wake up 5 times a second to decide it has nothing to do seems
> a bit wasteful.  I'm inclined to think maybe it should go into the
> recently added "hibernation mode" anytime the buffer freelist isn't
> empty.  Or maybe you could argue that this scenario isn't worth any
> optimization effort, but with many-gig RAM becoming more and more
> common, I don't think I agree.

I feel like the hibernation behavior ought to be tied to buffer eviction, not the freelist.  When there's no buffer eviction happening, the background writer should hibernate, because there's no need to clean buffers in preparation for future eviction in that case.  It is true that when the freelist is non-empty, there's no buffer eviction occurring, but that will typically only happen at start-up.  It's not uncommon to have a database that is larger than shared_buffers but whose active portion is smaller than shared_buffers.  In that case you expect the freelist to converge to empty (since the only things that put buffers back on the freelist after startup are relation or database drops), yet you probably don't need the background writer working.

Another consideration is that we might actually want to arrange things so that the free-list remains non-empty on an ongoing basis.  Right now buffer eviction is a major scalability bottleneck.  Maybe we'll find some other way to fix that, but then again maybe we won't.

> 2. It's rather disturbing that a fairly large swath of functionality
> just stopped getting tested at all by the buildfarm.  Do we want to
> rethink the shared_buffers increase?  Or artificially bloat the
> regression database to make it larger than 128MB?  Or do something else
> to ensure we still exercise the DB-bigger-than-buffers case?

It seems like it could be useful to test with a variety of shared_buffers settings.  Maybe we should even have one or two buildfarm animals that run with a REALLY small shared_buffers setting, like 1MB, just to see if that breaks anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
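As a toy, self-contained sketch of the policy being suggested here --- hibernate when no buffer allocations/evictions have happened recently, rather than when the freelist happens to be non-empty --- something like the following.  The counter, threshold, and names are all invented for illustration; this is not a patch against bufmgr.c, and the real bgwriter's bookkeeping is more involved:

    #include <stdio.h>
    #include <stdbool.h>

    #define IDLE_CYCLES_BEFORE_HIBERNATE 10   /* threshold is invented */

    /* Decide whether to hibernate based on recent allocation activity. */
    static bool
    should_hibernate(int recent_alloc, int *idle_cycles)
    {
        if (recent_alloc > 0)
        {
            *idle_cycles = 0;       /* buffers being handed out: stay awake */
            return false;
        }
        if (++*idle_cycles >= IDLE_CYCLES_BEFORE_HIBERNATE)
            return true;            /* nothing happening for a while: sleep longer */
        return false;
    }

    int
    main(void)
    {
        /* pretend per-cycle allocation counts reported by the strategy layer */
        int activity[] = {5, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0};
        int idle = 0;

        for (int i = 0; i < (int) (sizeof(activity) / sizeof(activity[0])); i++)
            printf("cycle %2d: recent_alloc=%d hibernate=%s\n",
                   i, activity[i],
                   should_hibernate(activity[i], &idle) ? "yes" : "no");
        return 0;
    }

Under a rule like this, the bgwriter in the all-in-shared-buffers scenario would also go quiet after start-up, but so would the larger-than-shared_buffers case with a small active portion, which the freelist test alone would not catch.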
On Wed, Jul 18, 2012 at 5:30 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Or we could provide an initdb flag which would set an upper bound on
> shared_buffers, and have make check (at least) use it.

How about a flag that sets the exact value for shared_buffers, rather than a maximum?  I think a lot of users would like initdb --shared-buffers=8GB or whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jul 18, 2012 at 5:30 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Or we could provide an initdb flag which would set an upper bound on
>> shared_buffers, and have make check (at least) use it.

> How about a flag that sets the exact value for shared_buffers, rather
> than a maximum?  I think a lot of users would like initdb
> --shared-buffers=8GB or whatever.

That would be significantly harder to deploy in the buildfarm context.  We don't know that all the animals are capable of coping with 16MB (or whatever target we settle on for make check) today.

			regards, tom lane
On 07/19/2012 10:12 AM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jul 18, 2012 at 5:30 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Or we could provide an initdb flag which would set an upper bound on
>>> shared_buffers, and have make check (at least) use it.
>> How about a flag that sets the exact value for shared_buffers, rather
>> than a maximum?  I think a lot of users would like initdb
>> --shared-buffers=8GB or whatever.
> That would be significantly harder to deploy in the buildfarm context.
> We don't know that all the animals are capable of coping with 16MB
> (or whatever target we settle on for make check) today.

Yeah - unless we allow some fallback things could get ugly.  I do like the idea of allowing a settable ceiling on shared_buffers instead of having it completely hardcoded as now.

cheers

andrew
On 07/19/2012 10:32 AM, Andrew Dunstan wrote:
>
> On 07/19/2012 10:12 AM, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Wed, Jul 18, 2012 at 5:30 PM, Andrew Dunstan
>>> <andrew@dunslane.net> wrote:
>>>> Or we could provide an initdb flag which would set an upper bound on
>>>> shared_buffers, and have make check (at least) use it.
>>> How about a flag that sets the exact value for shared_buffers, rather
>>> than a maximum?  I think a lot of users would like initdb
>>> --shared-buffers=8GB or whatever.
>> That would be significantly harder to deploy in the buildfarm context.
>> We don't know that all the animals are capable of coping with 16MB
>> (or whatever target we settle on for make check) today.
>
> Yeah - unless we allow some fallback things could get ugly.  I do like
> the idea of allowing a settable ceiling on shared_buffers instead of
> having it completely hardcoded as now.

Here's a draft patch.

cheers

andrew