Thread: How to avoid large tables - WAS RE: Fast Inserts and Hardware Questions

Subject: How to avoid large tables - WAS RE: Fast Inserts and Hardware Questions
From: "Creager, Robert S"
Date: Wed, 14 Mar 2001
Alex (or anyone else),

How would you suggest avoiding large tables? I'm attempting to develop an
application which might see as many as 6M records inserted a day during peak
times. Thankfully, this is a volunteer effort, so there is no real pressure on me
to solve this problem quickly.

Later,
Rob

Robert Creager
Senior Software Engineer
Client Server Library
303.673.2365 V
303.661.5379 F
888.912.4458 P
StorageTek
INFORMATION made POWERFUL

> -----Original Message-----
> From: adb [mailto:adb@Beast.COM]
> Sent: Wednesday, March 14, 2001 5:15 PM
> To: Orion Henry
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Fast Inserts and Hardware Questions
>
> What I would be most concerned about is a table that grows by a few million
> rows a week; that to me seems like a liability in itself, since
> maintenance on that table will get pretty slow after a few months.
>
> Alex.
Well, it depends on what you are building, but in general the ways to avoid
large tables are either to partition them or to aggregate the data.

In a data warehouse you typically do lots of aggregation of your data and,
over time, purge the most granular data or take it offline (a rough sketch of
that is at the end of this message).

In an application like a website with millions of users you typically
partition the data. This can be done at a number of levels: you can partition
the entire database, so that a given set of users lives on a given database
server, or you can partition a table within a single database, splitting the
customer table into a bunch of smaller customer tables. This of course makes
certain queries painful, since the database won't automatically combine all
those tables for you; if you want a count(*) on customer you have to write
more SQL (second sketch below).

Some of the pricier databases, like Oracle, offer this partitioning built into
the database engine: you tell it to partition a given table based on ranges of
some key, and then you can do things like rebuild an index on just one
partition, or reload the data in one partition, without taking the entire
table out of service (third sketch below).

More often than not, DBAs end up with ridiculously large tables because of
inadequate requirements definition up front. Too often people want to store
all kinds of stuff in a database that doesn't really need to be in there, and
then they fail to define any purge strategy.

Alex.

On Wed, 14 Mar 2001, Creager, Robert S wrote:

> Alex (or anyone else),
>
> How would you suggest avoiding large tables? I'm attempting to develop an
> application which might see as many as 6M records inserted a day during peak
> times. Thankfully, this is a volunteer effort, so there is no real pressure on me
> to solve this problem quickly.
>
> Later,
> Rob
>
> Robert Creager
> Senior Software Engineer
> Client Server Library
> 303.673.2365 V
> 303.661.5379 F
> 888.912.4458 P
> StorageTek
> INFORMATION made POWERFUL
>
> > -----Original Message-----
> > From: adb [mailto:adb@Beast.COM]
> > Sent: Wednesday, March 14, 2001 5:15 PM
> > To: Orion Henry
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] Fast Inserts and Hardware Questions
> >
> > What I would be most concerned about is a table that grows by a few million
> > rows a week; that to me seems like a liability in itself, since
> > maintenance on that table will get pretty slow after a few months.
> >
> > Alex.
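Sketch 1 - aggregate and purge. A rough illustration only, using a made-up
page_hits detail table and a page_hits_daily summary table; the table and
column names are just placeholders for whatever your granular data is:

  -- Roll yesterday's detail rows up into a daily summary table.
  INSERT INTO page_hits_daily (hit_day, url, hit_count)
  SELECT date_trunc('day', hit_time), url, count(*)
  FROM page_hits
  WHERE hit_time >= date_trunc('day', now() - interval '1 day')
    AND hit_time <  date_trunc('day', now())
  GROUP BY date_trunc('day', hit_time), url;

  -- Purge detail older than 90 days; only the summary survives.
  DELETE FROM page_hits
  WHERE hit_time < now() - interval '90 days';

  VACUUM ANALYZE page_hits;

Run something like this nightly from cron and the detail table stops growing
without bound, at the cost of losing row-level history past the purge window.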
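Sketch 2 - partitioning a table by hand. Again just a sketch, assuming
hypothetical customer_0 .. customer_3 tables split on customer_id (the
application decides which table to hit, e.g. by customer_id % 4), and a
PostgreSQL version that allows UNION ALL in a view:

  CREATE TABLE customer_0 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_1 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_2 (customer_id integer PRIMARY KEY, name text);
  CREATE TABLE customer_3 (customer_id integer PRIMARY KEY, name text);

  -- A view glues the pieces back together for the few queries that
  -- really do need to see every customer.
  CREATE VIEW customer AS
    SELECT * FROM customer_0
    UNION ALL SELECT * FROM customer_1
    UNION ALL SELECT * FROM customer_2
    UNION ALL SELECT * FROM customer_3;

  -- This is the "extra SQL" pain: a simple count(*) now scans all
  -- four partitions through the view.
  SELECT count(*) FROM customer;

Inserts and single-customer lookups go straight to the right small table; only
the cross-partition queries pay the price.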
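Sketch 3 - built-in range partitioning, Oracle-style, for comparison. Table,
index, and partition names are made up; treat this as an outline of the
feature rather than something to paste in:

  CREATE TABLE orders (
    order_id   NUMBER,
    order_date DATE,
    amount     NUMBER
  )
  PARTITION BY RANGE (order_date) (
    PARTITION p2001q1 VALUES LESS THAN (TO_DATE('2001-04-01', 'YYYY-MM-DD')),
    PARTITION p2001q2 VALUES LESS THAN (TO_DATE('2001-07-01', 'YYYY-MM-DD'))
  );

  CREATE INDEX orders_date_idx ON orders (order_date) LOCAL;

  -- Maintenance can target a single partition while the rest of the
  -- table stays in service.
  ALTER TABLE orders TRUNCATE PARTITION p2001q1;
  ALTER INDEX orders_date_idx REBUILD PARTITION p2001q1;

The engine routes rows to the right partition for you, so you get the
small-table maintenance benefits without rewriting your queries.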