Thread: preloading indexes

preloading indexes

From
Date:

I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?

Re: preloading indexes

From
"Matt Clark"
Date:
The best way to get all the stuff needed by a query into RAM is to run the query.  Is it more that you want to 'pin' the data in RAM so it doesn't get overwritten by other queries?
 
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of stuff@opensourceonline.com
Sent: 03 November 2004 17:31
To: pgsql-performance@postgresql.org
Subject: [PERFORM] preloading indexes

I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?

Re: preloading indexes

From
Date:

That’s correct – I’d like to be able to keep particular indexes in RAM available all the time

 

The best way to get all the stuff needed by a query into RAM is to run the query.  Is it more that you want to 'pin' the data in RAM so it doesn't get overwritten by other queries?

 

I am working with some pretty convoluted queries that work very slowly the first time they’re called but perform fine on the second call. I am fairly certain that these differences are due to the caching. Can someone point me in a direction that would allow me to pre-cache the critical indexes?

Re: preloading indexes

From
Andrew Sullivan
Date:
On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com wrote:
> That's correct - I'd like to be able to keep particular indexes in RAM
> available all the time

If these are queries that run frequently, then the relevant cache
will probably remain populated[1].  If they _don't_ run frequently, why
do you want to force the memory to be used to optimise something that
is uncommon?  But in any case, there's no mechanism to do this.

A

[1] there are in fact limits on the caching: if your data set is
larger than memory, for instance, there's no way it will all stay
cached.  Also, VACUUM does nasty things to the cache.  It is hoped
that nastiness is fixed in 8.0.

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner

Re: preloading indexes

From
Pierre-Frédéric Caillaud
Date:
    --

    uh, you can always load a table in cache by doing a seq scan on it...
like select count(1) from table or something... this doesn't work for
indexes of course, but you can always look in the system catalogs, find
the filename for the index, then just open() it from an external program
and read it without caring for the data... it'll save you the seeks in the
index... of course you'll have problems with file permissions etc, not
mentioning security, locking, etc, etc, etc, is that worth the trouble ?

On Wed, 3 Nov 2004 14:35:28 -0500, Andrew Sullivan <ajs@crankycanuck.ca>
wrote:

> On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com
> wrote:
>> That's correct - I'd like to be able to keep particular indexes in RAM
>> available all the time
>
> If these are queries that run frequently, then the relevant cache
> will probably remain populated[1].  If they _don't_ run frequently, why
> do you want to force the memory to be used to optimise something that
> is uncommon?  But in any case, there's no mechanism to do this.
>
> A
>
> [1] there are in fact limits on the caching: if your data set is
> larger than memory, for instance, there's no way it will all stay
> cached.  Also, VACUUM does nasty things to the cache.  It is hoped
> that nastiness is fixed in 8.0.
>



Re: preloading indexes

From
Tom Lane
Date:
<stuff@opensourceonline.com> writes:
> I am working with some pretty convoluted queries that work very slowly the
> first time they're called but perform fine on the second call. I am fairly
> certain that these differences are due to the caching. Can someone point me
> in a direction that would allow me to pre-cache the critical indexes?

Buy more RAM.  Also check your shared_buffers setting (but realize that
more is not necessarily better).

            regards, tom lane

Re: preloading indexes

From
Date:
The caching appears to disappear overnight. The environment is not in
production yet so I'm the only one on it.

Is there a time limit on the length of time in cache? I believe there is
sufficient RAM, but maybe I need to look again.

s

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Andrew Sullivan
Sent: Wednesday, November 03, 2004 12:35 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] preloading indexes

On Wed, Nov 03, 2004 at 12:12:43PM -0700, stuff@opensourceonline.com wrote:
> That's correct - I'd like to be able to keep particular indexes in RAM
> available all the time

If these are queries that run frequently, then the relevant cache
will probably remain populated[1].  If they _don't_ run frequently, why
do you want to force the memory to be used to optimise something that
is uncommon?  But in any case, there's no mechanism to do this.

A

[1] there are in fact limits on the caching: if your data set is
larger than memory, for instance, there's no way it will all stay
cached.  Also, VACUUM does nasty things to the cache.  It is hoped
that nastiness is fixed in 8.0.

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend



Re: preloading indexes

From
Tom Lane
Date:
<stuff@opensourceonline.com> writes:
> The caching appears to disappear overnight.

You've probably got cron jobs that run late at night and blow out your
kernel disk cache by accessing a whole lot of non-Postgres stuff.
(A nightly disk backup is one obvious candidate.)  The most likely
solution is to run some cron job a little later to exercise your
database and thereby repopulate the cache with Postgres files before
you get to work ;-)

            regards, tom lane

Re: preloading indexes

From
Andrew Sullivan
Date:
On Wed, Nov 03, 2004 at 01:19:43PM -0700, stuff@opensourceonline.com wrote:
> The caching appears to disappear overnight. The environment is not in
> production yet so I'm the only one on it.

Are you vacuuming at night?  It grovels through the entire database,
and may bust your query out of the cache.  Also, we'd need some more
info about how you've tuned this thing.  Maybe check out the archives
first for some tuning pointers to help you.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
                --Brad Holland

Re: preloading indexes

From
Date:
Thanks - this is what I was afraid of, but I may have to do this

Is there a good way to monitor what's in the cache?

j

<stuff@opensourceonline.com> writes:
> The caching appears to disappear overnight.

You've probably got cron jobs that run late at night and blow out your
kernel disk cache by accessing a whole lot of non-Postgres stuff.
(A nightly disk backup is one obvious candidate.)  The most likely
solution is to run some cron job a little later to exercise your
database and thereby repopulate the cache with Postgres files before
you get to work ;-)

            regards, tom lane



Re: preloading indexes

From
Mike Benoit
Date:
If your running Linux, and kernel 2.6.x, you can try playing with the:

/proc/sys/vm/swappiness

setting.

My understanding is that:

echo "0" > /proc/sys/vm/swappiness

Will try to keep all in-use application memory from being swapped out
when other processes query the disk a lot.

Although, since PostgreSQL utilizes the disk cache quite a bit, this may
not help you.


On Wed, 2004-11-03 at 15:53 -0500, Tom Lane wrote:
> <stuff@opensourceonline.com> writes:
> > The caching appears to disappear overnight.
>
> You've probably got cron jobs that run late at night and blow out your
> kernel disk cache by accessing a whole lot of non-Postgres stuff.
> (A nightly disk backup is one obvious candidate.)  The most likely
> solution is to run some cron job a little later to exercise your
> database and thereby repopulate the cache with Postgres files before
> you get to work ;-)
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
--
Mike Benoit <ipso@snappymail.ca>

Attachment

Re: preloading indexes

From
Andrew Sullivan
Date:
On Wed, Nov 03, 2004 at 03:53:16PM -0500, Andrew Sullivan wrote:
> and may bust your query out of the cache.  Also, we'd need some more

Uh, the data you're querying, of course.  Queries themselves aren't
cached.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to.  That actually seems sort of quaint now.
        --J.D. Baldwin