Thanks, Chris.
> > What is it about the buffer cache that makes it so unhappy being able to
> > hold everything? I don't want to be seen as a cache hit fascist, but isn't
> > it just better if the data is just *there*, available in the postmaster's
> > address space ready for each backend process to access it, rather than
> > expecting the Linux cache mechanism, optimised as it may be, to have to do
> > the caching?
>
> Because the PostgreSQL buffer management algorithms are pitiful compared
> to Linux's. In 7.5, it's improved with the new ARC algorithm, but still
> - the Linux disk buffer cache will be very fast.
>
I've had that reply elsewhere too. Initially, I was afraid that there was a
memory copy involved if the OS buffer cache supplied a block of data to PG,
but I've learned a lot more about the Linux buffer cache, so it now makes
more sense to me why it's not a terrible thing to let the OS manage the
lion's share of the caching on a high-RAM system.
On another thread (not on this mailing list), someone mentioned that there
is a class of databases which, rather than caching bits of database file
(be it in the OS buffer cache or the postmaster workspace), construct a
well-indexed memory representation of the entire data set in the postmaster
workspace (or its equivalent). Because this representation stays resident,
the DB can service backend queries far quicker than if the postmaster were
working on the assumption that most of the data is on disk (even if, in
practice, large amounts or perhaps even all of it resides in OS cache).
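As a toy illustration of what I understand by that (a sketch in Python under
my own assumptions, not how PG or any real product does it; the class and
method names are made up), the whole data set is held as native in-memory
structures with indexes built directly over it, rather than as cached file
blocks:

```python
import bisect


class InMemoryTable:
    """Hypothetical toy table: every row lives in RAM with two indexes over it."""

    def __init__(self, rows):
        # Hash index: key -> value, for point lookups.
        self.by_key = dict(rows)
        # Ordered index: sorted list of keys, for range scans.
        self.sorted_keys = sorted(self.by_key)

    def find(self, key):
        """Point lookup straight from the hash index; no block fetch involved."""
        return self.by_key.get(key)

    def range(self, lo, hi):
        """Return (key, value) pairs with lo <= key <= hi, in key order."""
        start = bisect.bisect_left(self.sorted_keys, lo)
        stop = bisect.bisect_right(self.sorted_keys, hi)
        return [(k, self.by_key[k]) for k in self.sorted_keys[start:stop]]


table = InMemoryTable([(3, "c"), (1, "a"), (2, "b")])
print(table.find(2))
print(table.range(1, 2))
```

The point of the sketch is only that lookups go directly to in-memory
structures, with no notion of pages or blocks in between.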
Though I'm no stranger to data management in general, I'm still on a steep
learning curve for databases, and for PG in particular, so I just wondered
how big a subject this is in the PG development group at the moment?
After all, we're now seeing the first wave of 'reasonably priced' 64-bit
servers supported by a proper 64-bit OS (e.g. Linux). HP are selling a
4-Opteron server which can take 256GB of RAM, and that starts at $10,000 (ok -
they don't give you that much RAM for that price - not yet, anyway!)
This is the future, isn't it? Each year, a higher percentage of DB
applications will be able to fit entirely in RAM, and that percentage is
going to be quite significant in just a few years. The disk system gets
relegated to preloading the data on startup and servicing the writes as the
server does its stuff.
Regards,
Andy