Thread: How to avoid large tables - WAS RE: Fast Inserts and Hardware Questions

Subject: How to avoid large tables - WAS RE: Fast Inserts and Hardware Questions
From: "Creager, Robert S"
Date: Wed, 14 Mar 2001
Alex (or anyone else),

How would you suggest avoiding large tables? I'm attempting to develop an
application which might see as many as 6M records inserted a day during peak
times. Thankfully, this is a volunteer effort, so there is no real pressure on me
to solve this problem quickly.

Later,
Rob

Robert Creager
Senior Software Engineer
Client Server Library
303.673.2365 V
303.661.5379 F
888.912.4458 P
StorageTek
INFORMATION made POWERFUL

> -----Original Message-----
> From: adb [mailto:adb@Beast.COM]
> Sent: Wednesday, March 14, 2001 5:15 PM
> To: Orion Henry
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Fast Inserts and Hardware Questions
>
> What I would be most concerned about is a table that grows by a few million
> rows a week; that to me seems like a liability in itself, since
> maintenance on that table will get pretty slow after a few months.
>
> Alex.
Well, it depends on what you are building, but in general the ways to avoid
large tables are either to partition them or to aggregate the data.

In a data warehouse you typically do lots of aggregation of your data and,
over time, purge the most granular data or take it offline (a rough sketch of
that is at the end of this message).

In an application like a website with millions of users you typically
partition the data. This can be done at a number of levels: you can partition
the entire database, so that a given set of users lives on a given database
server, or you can partition a table within a single database, splitting the
customer table into a bunch of smaller customer tables. This of course makes
certain queries painful, since the database won't automatically combine all
those tables for you; if you want a count(*) on customer you have to write
more SQL (second sketch below).

Some of the pricier databases, like Oracle, offer this partitioning built into
the database engine: you tell it to partition a given table based on ranges of
some key, and then you can do things like rebuild an index on just one
partition, or reload the data in one partition, without taking the entire
table out of service (third sketch below).

More often than not, DBAs end up with ridiculously large tables because of
inadequate requirements definition up front. Too often people want to store
all kinds of stuff in a database that doesn't really need to be in there, and
then they fail to define any purge strategy.

Alex.

On Wed, 14 Mar 2001, Creager, Robert S wrote:

> Alex (or anyone else),
>
> How would you suggest avoiding large tables? I'm attempting to develop an
> application which might see as many as 6M records inserted a day during peak
> times. Thankfully, this is a volunteer effort, so there is no real pressure on me
> to solve this problem quickly.
>
> Later,
> Rob
>
> Robert Creager
> Senior Software Engineer
> Client Server Library
> 303.673.2365 V
> 303.661.5379 F
> 888.912.4458 P
> StorageTek
> INFORMATION made POWERFUL
>
> > -----Original Message-----
> > From: adb [mailto:adb@Beast.COM]
> > Sent: Wednesday, March 14, 2001 5:15 PM
> > To: Orion Henry
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] Fast Inserts and Hardware Questions
> >
> > What I would be most concerned about is a table that grows by a few million
> > rows a week; that to me seems like a liability in itself, since
> > maintenance on that table will get pretty slow after a few months.
> >
> > Alex.
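Sketch 1 - aggregate and purge. A rough illustration only, using a made-up
page_hits detail table and a page_hits_daily summary table; the table and
column names are just placeholders for whatever your granular data is:

  -- Roll yesterday's detail rows up into a daily summary table.
  INSERT INTO page_hits_daily (hit_day, url, hit_count)
  SELECT date_trunc('day', hit_time), url, count(*)
  FROM page_hits
  WHERE hit_time >= date_trunc('day', now() - interval '1 day')
    AND hit_time <  date_trunc('day', now())
  GROUP BY date_trunc('day', hit_time), url;

  -- Purge detail older than 90 days; only the summary survives.
  DELETE FROM page_hits
  WHERE hit_time < now() - interval '90 days';

  VACUUM ANALYZE page_hits;

Run something like this nightly from cron and the detail table stops growing
without bound, at the cost of losing row-level history past the purge window.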
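Sketch 2 - partitioning a table by hand. Again just a sketch, assuming
hypothetical customer_0 .. customer_3 tables split on customer_id (the
application decides which table to hit, e.g. by customer_id % 4), and a
PostgreSQL version that allows UNION ALL in a view:

  CREATE TABLE customer_0 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_1 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_2 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_3 (customer_id integer PRIMARY KEY, name text);

  -- A view glues the pieces back together for the few queries that
  -- really do need to see every customer.
  CREATE VIEW customer AS
    SELECT * FROM customer_0
    UNION ALL SELECT * FROM customer_1
    UNION ALL SELECT * FROM customer_2
    UNION ALL SELECT * FROM customer_3;

  -- This is the "extra SQL" pain: a simple count(*) now scans all
  -- four partitions through the view.
  SELECT count(*) FROM customer;

Inserts and single-customer lookups go straight to the right small table; only
the cross-partition queries pay the price.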
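Sketch 3 - built-in range partitioning, Oracle-style, for comparison. Table,
index, and partition names are made up; treat this as an outline of the
feature rather than something to paste in:

  CREATE TABLE orders (
    order_id   NUMBER,
    order_date DATE,
    amount     NUMBER
  )
  PARTITION BY RANGE (order_date) (
    PARTITION p2001q1 VALUES LESS THAN (TO_DATE('2001-04-01', 'YYYY-MM-DD')),
    PARTITION p2001q2 VALUES LESS THAN (TO_DATE('2001-07-01', 'YYYY-MM-DD'))
  );

  CREATE INDEX orders_date_idx ON orders (order_date) LOCAL;

  -- Maintenance can target a single partition while the rest of the
  -- table stays in service.
  ALTER TABLE orders TRUNCATE PARTITION p2001q1;
  ALTER INDEX orders_date_idx REBUILD PARTITION p2001q1;

The engine routes rows to the right partition for you, so you get the
small-table maintenance benefits without rewriting your queries.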