> Briefly, my problem:
> ------------------
> I expect the database I'm working on to reach something in the order of
> 12-16 Gigabytes, and I am interested in understanding as much as I can
> about how I can make this go as fast as possible on a linux system. I
> haven't run such a large database before. The nature of the database is
> such that successive queries are very likely to lead to poor cache
> performance.
>
> I have lots of experience with disks, operating systems, caching, virtual
> memory, RAM etc. - just none running gargantuan databases very quickly!
Two things come to mind, although I don't have hands-on experience with
either of them ;-))
1) Use a kind of "silicon disk". Usually this is battery-backed DDR RAM
   with an IDE interface, ready to attach to the IDE bus. This is not only
   fast in terms of transfer rates but also BLAZING fast in terms of seek
   times.
2) Put the database entirely on a RAM disk and replicate it live to a
   disk-based DB server - if the server crashes you still have the "copy"
   on disk on another server. I don't know how much overhead such
   replication imposes for Postgres, but with commercial DBs you can
   usually get away with roughly a 5% penalty. This depends strongly on
   the usage pattern of the DB, i.e. OLTP usage imposes more overhead than
   a read-only DB (MIS type). A rough sketch of the idea follows below.
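To make 2) a bit more concrete, here is a minimal Python sketch of the idea
at the application level (a crude dual-write instead of real DB-level
replication). Everything in it is an assumption for illustration: the
psycopg2 driver, the connection strings and the "events" table are not from
the original posting, and a real setup would rather use a replication tool
so the application stays untouched.

# Crude application-level stand-in for "live replication": every write is
# applied to both the RAM-disk-backed instance and the disk-backed copy.
# Assumptions (hypothetical): psycopg2 is installed, both servers are
# reachable via the DSNs below, and an events(id integer, payload text)
# table exists on both.

import psycopg2

RAMDISK_DSN = "host=ram-db dbname=bigdb user=app"   # data directory on a RAM disk
DISK_DSN    = "host=disk-db dbname=bigdb user=app"  # conventional disk-based server

def dual_write(sql, params):
    """Apply the same statement to both servers; roll back both on failure."""
    primary = psycopg2.connect(RAMDISK_DSN)
    replica = psycopg2.connect(DISK_DSN)
    try:
        for conn in (primary, replica):
            with conn.cursor() as cur:
                cur.execute(sql, params)
        primary.commit()
        replica.commit()
    except Exception:
        primary.rollback()
        replica.rollback()
        raise
    finally:
        primary.close()
        replica.close()

if __name__ == "__main__":
    dual_write("INSERT INTO events (id, payload) VALUES (%s, %s)", (1, "hello"))

Note that this naive double commit is not atomic across the two servers (a
crash between the two commits can leave them out of sync), which is exactly
why replication done by the database itself is the nicer solution.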
Some years ago I put a lot of work into tuning a commercial database
product for 10 GB+ databases. The recurring problem was that tuning at the
database level always needed some help from the application developers to
really work well. And whenever the database changed (schemas, indexes and
some costly operations), bigger changes meant re-tuning all over again -
always requiring lots of testing, approval etc. etc.
As RAM has gotten cheaper and cheaper over the last few years (even for
the big and evil big-iron servers <g>), I see an increasing tendency to
buy more RAM for caches, RAM disks etc., simply because this almost always
helps, works quickly and is not that expensive nowadays. Although I
personally think that tuning is much more fun than putting some RAM sticks
into the mobo, the latter is becoming more effective in terms of "bang for
the buck".
This explains why my two suggestions have nothing really Postgres-specific
about them but take a more general approach. If you decide at some point to
switch from Linux to Solaris (or the other way round), you will likely have
a lot of work ahead of you if you commit now to a platform-specific
solution.
Best regards,
Soeren Gerlach