On Thu, 2007-03-29 at 21:30 -0700, Benjamin Arai wrote:
> Rebuilding an index can't be the PostgreSQL solution for all
> cases. I am dealing with databases in the hundreds of gigs
> range and I am adding about 10 gigs of data a week. At
> some point it's going to take longer than a week to rebuild
> all of the indexes in the database.
>
> On the other hand, if I am to partition the data into
> several tables then it might not be such a big deal since
> I am only adding and never deleting... This makes it a
> little more of a pain in the ass.
I am leaning toward a policy of always partitioning large
tables. I haven't found the time to do it properly yet; I'm
still thinking about it and hoping that someone who's done it
will chime in with their rules of thumb.
Like Benjamin, I have a database that is close to 600GB
for 2.25 years of data, and if I were to add the other
4 years of data that we have archived away, it would easily
go into the terabyte range. There are a few individual tables
which approach 100GB all by themselves.
As it is, I can't afford to reindex or even back up
(with pg_dump or any other method) or run other administrative
tasks on those tables, since the processes take too long.
There are workarounds: I could back up single tables at slack
times, which would let me do a complete backup (though not
self-consistent as a set) over the course of a week or so.
So I'm leaning toward partitioning, perhaps selecting
partition rules so that no table will be larger than
around 5GB, at which point reindexing or admin procedures
that take exclusive locks become only minor
inconveniences rather than showstoppers.
How do people take consistent backups of very large
databases on Linux/FreeBSD? I'm aware of PITR, but I
might not be able to set aside a box with enough
drives for it. An LVM snapshot? What about the
performance issues with LVM, etc.?
tiger
--
Gerald Timothy Quimpo gerald.quimpo@qualservcentral.com
Business Systems Development, KFC/Mr Donut/Ramcar
There is nothing more dreadful than imagination without taste.
-- Johann Wolfgang von Goethe