Re: Database Caching - Mailing list pgsql-hackers

From Rod Taylor
Subject Re: Database Caching
Date
Msg-id 03c201c1c46d$f5a141f0$b002000a@jester
In response to Re: Database Caching  (F Harvell <fharvell@fts.net>)
Responses Re: Database Caching  (F Harvell <fharvell@fts.net>)
List pgsql-hackers
> Certainly, as has been pointed out several times, the application can
> do the caching, however, utilizing todays "generic" tools such as
> Apache and PHP, it is relatively difficult.  By moving caching to the
> database server, there is little that needs to be done by the
> application to realize the benefits.

PHP has an amazingly easy-to-use shared memory resource.

Create a semaphore, and an index hash in position 0 which points to
the rest of the resources.

On each query, lock the memory and look the key up in the index; pull
the cached values if they exist, otherwise run the db query and stuff
the values in.
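A minimal sketch of that lookup, assuming the sysvsem/sysvshm
extensions; the IPC key, segment size, and slot layout here are my own
choices, not anything standard:

```php
<?php
// Hypothetical IPC key and segment size.
define('CACHE_SHM_KEY', 0xCAFE);
define('CACHE_SHM_SIZE', 1 << 20);  // 1 MB segment

// Slot 0 holds the index hash mapping query keys to variable slots.
function cache_fetch(string $sql, callable $run_query)
{
    $sem = sem_get(CACHE_SHM_KEY);          // one semaphore guards the segment
    $shm = shm_attach(CACHE_SHM_KEY, CACHE_SHM_SIZE);

    sem_acquire($sem);                      // lock memory
    $index = shm_has_var($shm, 0) ? shm_get_var($shm, 0) : array();
    $key = md5($sql);

    if (isset($index[$key])) {              // hit: pull the cached values
        $result = shm_get_var($shm, $index[$key]);
    } else {                                // miss: run the query, stuff values in
        $result = $run_query($sql);
        $slot = count($index) + 2;          // slots 0 and 1 are reserved
        shm_put_var($shm, $slot, $result);
        $index[$key] = $slot;
        shm_put_var($shm, 0, $index);       // publish the updated index
    }
    sem_release($sem);
    return $result;
}
```

Slot 1 is left free for the expiry timestamp mentioned below.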

The tricky part is expiring the cache.  I store a time in position 1
of shared memory, and a matching time in a table in the database.  To
see if the cache should be expired, I select against the 'timer' table
in the db; if it's expired, I clear the cache and let it be rebuilt.
A trigger updates the time on every change.

Doesn't work well at all for frequent updates -- but that's not what
it's for.
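The expiry check can be sketched like this, assuming a one-row 'timer'
table with a 'changed' timestamp column (all names here are mine), kept
current by a trigger on the cached tables:

```php
<?php
// The trigger side (assumed names) would be something like:
//
//   CREATE TRIGGER touch_timer AFTER INSERT OR UPDATE OR DELETE ON mytable
//       FOR EACH STATEMENT EXECUTE PROCEDURE update_timer();
//
// where update_timer() sets timer.changed to now().

// One cheap select against the timer table decides staleness.
function cache_is_stale($shm, $db): bool
{
    $built = shm_has_var($shm, 1) ? shm_get_var($shm, 1) : 0;
    $res = pg_query($db, "SELECT extract(epoch FROM changed) FROM timer");
    $changed = (float) pg_fetch_result($res, 0, 0);
    return $changed > $built;  // data changed since the cache was built
}

// On staleness, drop the index and reset the clock; slots rebuild lazily.
function cache_clear($shm): void
{
    shm_put_var($shm, 0, array());  // empty index
    shm_put_var($shm, 1, time());   // new build time
}
```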


Took about 15 minutes to write the caching mechanism in a generic
fashion for all my applications.  Step 2 of my process was to cache
the generated HTML page itself so I generate it once.  I store it gzip
compressed, and serve it directly to browsers which support gzip
compression (most).  Pages are stored in shared memory using the above
mechanism.  This took about 40 minutes (PHP output buffering) and is
keyed on the uniqueness of the pages requested.  It's transparent to
the application too (prepend and append code elements).  (Zend Cache
does this too, I think.)
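The prepend/append pair can be sketched roughly as below.  All names
are my own, and a plain array stands in for the shared-memory store
described above to keep the sketch self-contained:

```php
<?php
// Prepend: serve the page from cache if we have it, else start buffering.
function page_cache_start(array $cache, string $uri): bool
{
    if (isset($cache[$uri])) {
        $accept = isset($_SERVER['HTTP_ACCEPT_ENCODING'])
            ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
        if (strpos($accept, 'gzip') !== false) {
            header('Content-Encoding: gzip');
            echo $cache[$uri];            // serve the gzip bytes directly
        } else {
            echo gzdecode($cache[$uri]);  // decompress for old browsers
        }
        return true;                      // served from cache, skip generation
    }
    ob_start();                           // buffer the page being generated
    return false;
}

// Append: capture the buffered page, store it compressed, send it out.
function page_cache_end(array &$cache, string $uri): void
{
    $html = ob_get_clean();
    $cache[$uri] = gzencode($html, 9);    // generate once, store gzipped
    echo $html;
}
```

Hooked in via auto_prepend_file/auto_append_file, this never touches
the application code itself.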

Anyway, I don't feel like rewriting it as non-private code, but the
concept is simple enough.  Perhaps Zend or PEAR could write the above
data lookup wrappers to shared memory.  You don't even have to worry
about structures or layout, just the ID that represents the location
of your object in memory.

It can serve a couple million requests per hour using this on an E220
with lots of RAM -- bandwidth always runs out first.


Anyway, my first suggestion is to buy Zend Cache (if available) to
cache entire HTML pages, not to cache db queries.  Even the great
Slashdot (which everyone appears to be comparing this to) uses a page
cache to reduce load and serves primarily static HTML pages which were
pregenerated.

I agree with the solution, but I don't agree with the proposed
location, as caching DB queries only solves about a quarter of the
whole problem.  The other parts are the network (millions of queries
across 100mbit isn't fast either) and generation of the non-static
page from the static (cached) query results.  Build a few large
(10k row) tables to see what I mean about where the slowdown really is.


