Thread: preloading indexes
I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of stuff@opensourceonline.com
Sent: 03 November 2004 17:31
To: pgsql-performance@postgresql.org
Subject: [PERFORM] preloading indexes
I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?
That’s correct – I’d like to be able to keep particular indexes in RAM available all the time
The best way to get all the stuff needed by a query into RAM is to run the query. Is it more that you want to 'pin' the data in RAM so it doesn't get overwritten by other queries?
I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?
On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com wrote: > That's correct - I'd like to be able to keep particular indexes in RAM > available all the time If these are queries that run frequently, then the relevant cache will probably remain populated[1]. If they _don't_ run frequently, why do you want to force the memory to be used to optimise something that is uncommon? But in any case, there's no mechanism to do this. A [1] there are in fact limits on the caching: if your data set is larger than memory, for instance, there's no way it will all stay cached. Also, VACUUM does nasty things to the cache. It is hoped that nastiness is fixed in 8.0. -- Andrew Sullivan | ajs@crankycanuck.ca The plural of anecdote is not data. --Roger Brinner
-- uh, you can always load a table in cache by doing a seq scan on it... like select count(1) from table or something... this doesn't work for indexes of course, but you can always look in the system catalogs, find the filename for the index, then just open() it from an external program and read it without caring for the data... it'll save you the seeks in the index... of course you'll have problems with file permissions etc, not mentioning security, locking, etc, etc, etc, is that worth the trouble ? On Wed, 3 Nov 2004 14:35:28 -0500, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com > wrote: >> That's correct - I'd like to be able to keep particular indexes in RAM >> available all the time > > If these are queries that run frequently, then the relevant cache > will probably remain populated[1]. If they _don't_ run frequently, why > do you want to force the memory to be used to optimise something that > is uncommon? But in any case, there's no mechanism to do this. > > A > > [1] there are in fact limits on the caching: if your data set is > larger than memory, for instance, there's no way it will all stay > cached. Also, VACUUM does nasty things to the cache. It is hoped > that nastiness is fixed in 8.0. >
<stuff@opensourceonline.com> writes: > I am working with some pretty convoluted queries that work very slowly the > first time they're called but perform fine on the second call. I am fairly > certain that these differences are due to the caching. Can someone point me > in a direction that would allow me to pre-cache the critical indexes? Buy more RAM. Also check your shared_buffers setting (but realize that more is not necessarily better). regards, tom lane
The caching appears to disappear overnight. The environment is not in production yet so I'm the only one on it. Is there a time limit on the length of time in cache? I believe there is sufficient RAM, but maybe I need to look again. s -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Andrew Sullivan Sent: Wednesday, November 03, 2004 12:35 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] preloading indexes On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com wrote: > That's correct - I'd like to be able to keep particular indexes in RAM > available all the time If these are queries that run frequently, then the relevant cache will probably remain populated[1]. If they _don't_ run frequently, why do you want to force the memory to be used to optimise something that is uncommon? But in any case, there's no mechanism to do this. A [1] there are in fact limits on the caching: if your data set is larger than memory, for instance, there's no way it will all stay cached. Also, VACUUM does nasty things to the cache. It is hoped that nastiness is fixed in 8.0. -- Andrew Sullivan | ajs@crankycanuck.ca The plural of anecdote is not data. --Roger Brinner ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend
<stuff@opensourceonline.com> writes: > The caching appears to disappear overnight. You've probably got cron jobs that run late at night and blow out your kernel disk cache by accessing a whole lot of non-Postgres stuff. (A nightly disk backup is one obvious candidate.) The most likely solution is to run some cron job a little later to exercise your database and thereby repopulate the cache with Postgres files before you get to work ;-) regards, tom lane
On Wed, Nov 03, 2004 at 01:19:43PM -0700, stuff@opensourceonline.com wrote: > The caching appears to disappear overnight. The environment is not in > production yet so I'm the only one on it. Are you vacuuming at night? It grovels through the entire database, and may bust your query out of the cache. Also, we'd need some more info about how you've tuned this thing. Maybe check out the archives first for some tuning pointers to help you. A -- Andrew Sullivan | ajs@crankycanuck.ca In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland
Thanks - this is what I was afraid of, but I may have to do this Is there a good way to monitor what's in the cache? j <stuff@opensourceonline.com> writes: > The caching appears to disappear overnight. You've probably got cron jobs that run late at night and blow out your kernel disk cache by accessing a whole lot of non-Postgres stuff. (A nightly disk backup is one obvious candidate.) The most likely solution is to run some cron job a little later to exercise your database and thereby repopulate the cache with Postgres files before you get to work ;-) regards, tom lane
If your running Linux, and kernel 2.6.x, you can try playing with the: /proc/sys/vm/swappiness setting. My understanding is that: echo "0" > /proc/sys/vm/swappiness Will try to keep all in-use application memory from being swapped out when other processes query the disk a lot. Although, since PostgreSQL utilizes the disk cache quite a bit, this may not help you. On Wed, 2004-11-03 at 15:53 -0500, Tom Lane wrote: > <stuff@opensourceonline.com> writes: > > The caching appears to disappear overnight. > > You've probably got cron jobs that run late at night and blow out your > kernel disk cache by accessing a whole lot of non-Postgres stuff. > (A nightly disk backup is one obvious candidate.) The most likely > solution is to run some cron job a little later to exercise your > database and thereby repopulate the cache with Postgres files before > you get to work ;-) > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match -- Mike Benoit <ipso@snappymail.ca>
Attachment
On Wed, Nov 03, 2004 at 03:53:16PM -0500, Andrew Sullivan wrote: > and may bust your query out of the cache. Also, we'd need some more Uh, the data you're querying, of course. Queries themselves aren't cached. A -- Andrew Sullivan | ajs@crankycanuck.ca I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin