Re: pg_prewarm - Mailing list pgsql-hackers

From Hans-Jürgen Schönig
Subject Re: pg_prewarm
Date
Msg-id AF9AF9CA-AE1F-409F-B569-E5DA715D7286@cybertec.at
Whole thread Raw
In response to Re: pg_prewarm  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mar 9, 2012, at 2:34 PM, Robert Haas wrote:

> On Fri, Mar 9, 2012 at 5:42 AM, Hans-Jürgen Schönig
> <postgres@cybertec.at> wrote:
>> we had some different idea here in the past: what if we had a procedure / method to allow people to save the list of
currentbuffers / cached blocks to be written to disk (sorted). we could then reload this "cache profile" on startup in
thebackground or people could load a certain cache content at runtime (maybe to test or whatever). 
>> writing those block ids in sorted order would help us to avoid some random I/O on reload.
>
> I don't think that's a bad idea at all, and someone actually did write
> a patch for it at one point, though it didn't get committed, partly I
> believe because of technical issues and partly because Greg Smith was
> uncertain how much good it did to restore shared_buffers without
> thinking about the OS cache.  Personally, I don't buy into the latter
> objection: a lot of people are running with data sets that fit inside
> shared_buffers, and those people would benefit tremendously.
>
> However, this just provides mechanism, not policy, and is therefore
> more general.  You could use pg_buffercache to save the cache contents
> at shutdown and pg_prewarm to load those blocks back in at startup, if
> you were so inclined.  Or if you just want to load up your main
> relation, and its indexes, you can do that, too.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



i also think that it can be beneficial.
once in a while people ask how to "bring a database up to speed" after a restart. i have seen more than one case when a
DBwas close to death after a restart because random I/O was simply killing it during cache warmup. it seems the problem
isgetting worse as we see machines with more and more RAM in the field. 
technically i would see a rather brute force approach: if we just spill out of the list of blocks we got in shared
bufferatm (not content of course, just physical location sorted by file / position in file) it would be good enough. if
ablock physically does not exist on reload any more it would not even be an issue and allow people basically to
"snapshot"their cache status. we could allow named cache profiles or so and make a GUC to indicate of one of them
shouldbe preloaded on startup (background or beforehand - i see usecases for both approaches). 

yes, somehow linking to pg_buffercache makes a lot of sense. maybe just extending it with some extra functions is
alreadyenough for most cases. 
hans


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



pgsql-hackers by date:

Previous
From: Marti Raudsepp
Date:
Subject: Refactoring simplify_function (was: Caching constant stable expressions)
Next
From: Stefan Keller
Date:
Subject: Re: pg_prewarm