Re: Database Caching - Mailing list pgsql-hackers

From Marc G. Fournier
Subject Re: Database Caching
Msg-id 20020301110357.H49236-100000@mail1.hub.org
In response to Re: Database Caching  (Jan Wieck <janwieck@yahoo.com>)
Responses Re: Database Caching  (Jan Wieck <janwieck@yahoo.com>)
Re: Database Caching  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-hackers
On Fri, 1 Mar 2002, Jan Wieck wrote:

> Tom Lane wrote:
> > "Greg Sabino Mullane" <greg@turnstep.com> writes:
> > > III. Relation caching
> >
> > > The final cache is the relation itself, and simply involves putting the entire
> > > relation into memory. This cache has a field for the name of the relation,
> > > the table info itself, the type (indexes should ideally be cached more than
> > > tables, for example), the access time, and the access number. Loading could
> > > be done automatically, but most likely should be done according to a flag
> > > on the table itself or as an explicit command by the user.
> >
> > This would be a complete waste of time; the buffer cache (both Postgres'
> > own, and the kernel's disk cache) serves the purpose already.
> >
> > As I've commented before, I have deep misgivings about the idea of a
> > query-result cache, too.
>
>     I wonder how this sort of query result caching could work in
>     our MVCC and visibility world at all. Multiple concurrent
>     running transactions see different snapshots of the table,
>     hence different result sets for exactly one and the same
>     querystring at the same time ... er ... yeah, one cache set
>     per query/snapshot combo, great!
>
>     To really gain some speed with this sort of query cache, we'd
>     have to adopt the #1 MySQL design rule "speed over precision"
>     and ignore MVCC for query-cached relations, or what?

Actually, I think you, like everyone else, are missing the 'semi-static'
database ... you know?  The one where a script dumps data into it every
5 minutes, but between dumps it sees hundreds of queries per second (or
per minute), and they are all the same query repeated each time ...

As soon as there is *any* change to the data set, the query cache should
be marked dirty and reloaded ... mark it dirty on any update, delete or
insert ...

So, if I have 1000 *pure* SELECTs, the cache is fine ... as soon as one
UPDATE/INSERT/DELETE pops up, it's invalidated ...
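
A minimal sketch of that invalidate-on-any-write idea, written in Python
against SQLite rather than inside the PostgreSQL backend. The class name
InvalidateOnWriteCache and its attributes are hypothetical, made up for
illustration. It assumes a single connection, so it does not address the
MVCC/snapshot concern raised above, and it invalidates the whole cache
instead of tracking which tables a query touched.

```python
import sqlite3


class InvalidateOnWriteCache:
    """Hypothetical wrapper: caches SELECT results, drops them all on any write."""

    def __init__(self, conn):
        self.conn = conn
        self.results = {}  # (query text, params) -> list of rows
        self.hits = 0
        self.misses = 0

    def execute(self, sql, params=()):
        # Crude classification: anything that is not a plain SELECT is
        # treated as a write. This is conservative (e.g. WITH ... SELECT
        # also invalidates), which costs speed but never correctness.
        is_select = sql.lstrip().upper().startswith("SELECT")
        if not is_select:
            # Any UPDATE/INSERT/DELETE (or DDL) marks the cache dirty.
            self.results.clear()
            cur = self.conn.execute(sql, params)
            self.conn.commit()
            return cur.rowcount

        key = (sql, tuple(params))
        if key in self.results:
            self.hits += 1
            return self.results[key]

        # Cache miss: run the query and remember the result set.
        self.misses += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.results[key] = rows
        return rows
```

With this sketch, 1000 identical SELECTs between two dumps cost one real
query and 999 cache lookups; the first INSERT of the next dump empties
the cache, and the following SELECT runs against the table again.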




