Steve Wampler <swampler@noao.edu> writes:
> Hervé Piedvache wrote:
>
> > No ... as I have said ... how I'll manage a database getting a table of may
> > be 250 000 000 records ? I'll need incredible servers ... to get quick access
> > or index reading ... no ?
>
> Probably by carefully partitioning their data. I can't imagine anything
> being fast on a single table in 250,000,000 tuple range.
Why are you all so psyched out by the size of the table? That's what indexes
are for.
The size of the table really isn't relevant here. The important thing is the
size of the working set, i.e., how many of those records are needed to answer
your queries.
As long as you tune your application so that every query can be satisfied by
reading a (very) limited number of those records, and you have indexes to
speed access to them, you can get quick response times even with terabytes of
raw data.
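
For example, in Postgres that might look something like this (the table and
column names are hypothetical, just to illustrate the shape of it):

    -- say 250,000,000 orders, but any one customer only has a handful
    CREATE INDEX orders_customer_id_idx ON orders (customer_id);

    -- this only touches the few index pages and heap tuples for that
    -- customer, no matter how big the table is
    SELECT * FROM orders WHERE customer_id = 12345;
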
I would start by looking at the plans for the queries you're running and
checking whether any of them read more than a hundred records or so. If so,
you'll have to optimize them or rethink the application design. You may need
to restructure your data so that no query has to scan too many records.
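
In Postgres that just means running EXPLAIN ANALYZE on each query and looking
at the actual row counts. For the hypothetical query above you'd hope to see
an index scan touching only a handful of rows, roughly:

    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 12345;

    Index Scan using orders_customer_id_idx on orders
      (actual time=0.05..0.20 rows=8 loops=1)

If instead you see a Seq Scan over the big table, or actual rows in the
millions, that's the query to fix.
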
No clustering system is going to help you if your application requires reading
through too much data. If every query is designed so that it never has to read
more than a hundred or so records, there's no reason you can't get sub-100ms
response times even with terabytes of raw data.
If the problem is just that each individual query is fast but there are too
many of them arriving for a single server to handle, then something like Slony
is all you need. It will spread the read load over multiple machines. If you
spread that load in an intelligent way you can even concentrate each server on
certain subsets of the data. But that shouldn't really be necessary; it's just
a nice improvement.
--
greg