Thread: very large table

From
"Praveen Raja"
Date:

Hi

I’m trying to move an existing solution from MySQL to PostgreSQL. As it stands, the solution has 4 tables into which data is inserted by an application. At regular intervals (every 10 minutes) data from these tables is consolidated and moved into another table for reporting purposes. There are many instances of these reporting tables, about 200 at the moment with the data split among them, and in total they are expected to hold about 500 million rows. When a request comes in, all of these tables are searched. While moving to PostgreSQL, is it a good idea to switch from multiple tables to a single table for that many rows?

Re: very large table

From
Tom Lane
Date:
"Praveen Raja" <praveen.raja@netlight.se> writes:
> I'm trying to move an existing solution from MySQL to PostgreSQL. As it
> is now the solution has 4 tables where data is inserted by an
> application. At regular intervals (10min) data from these tables is
> consolidated and moved to another table for reporting purposes. There
> exist many instances of these reporting tables and in total they are
> expected to hold about 500 million rows. There are about 200 of these
> reporting tables at the moment with data split among them. When a
> request comes in all these tables are searched. While moving to
> PostgreSQL is it a good idea to move from using multiple tables to one
> table for so many rows?

If the multiple tables represent a partitioning scheme that makes sense
to your application (ie, you can tell a priori which tables to look in
for a given query) then it's probably worth keeping.  But it sounds like
they don't make that much sense, since you mention searching all the tables.
In that case you should think about consolidating.
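To make the trade-off concrete, here is a minimal sketch (all table and column names are hypothetical): with ~200 reporting tables, serving a request means searching every table, typically through a UNION ALL over all of them, whereas a single consolidated table can answer the same request from one index scan:

```sql
-- Today: every request must search all ~200 reporting tables,
-- e.g. through a view that unions them together.
CREATE VIEW report_all AS
    SELECT * FROM report_001
    UNION ALL
    SELECT * FROM report_002;
    -- ...and so on for each of the ~200 tables

-- After consolidating: one table, one index, one scan per request.
-- (The "logged" timestamp column is an assumed example.)
CREATE TABLE report AS SELECT * FROM report_all;
CREATE INDEX report_logged_idx ON report (logged);
```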

There is lots of stuff in the recent list archives about partitioned
tables; might be worth reading, even though much of it is talking about
features we don't yet have.  It would point out the issues you need
to think about --- for example, do you periodically discard some of the
data, and if so do the tables correspond to the discard units?  DROP
TABLE is a lot quicker than trying to DELETE and then VACUUM a portion
of a very large table.
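The difference can be sketched as follows, assuming (hypothetically) one reporting table per month so that the tables line up with the discard units:

```sql
-- If each table corresponds to a discard unit, retiring a month
-- of data is a near-instant catalog operation that frees space
-- immediately:
DROP TABLE report_2005_01;

-- On one big consolidated table, the same discard must scan and
-- delete a portion of 500M rows, and the space only becomes
-- reusable after a subsequent VACUUM:
DELETE FROM report WHERE logged < '2005-02-01';
VACUUM report;
```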

            regards, tom lane

Re: very large table

From
Simon Riggs
Date:
On Tue, 2005-05-31 at 11:37 +0200, Praveen Raja wrote:
> I’m trying to move an existing solution from MySQL to PostgreSQL. As
> it is now the solution has 4 tables where data is inserted by an
> application. At regular intervals (10min) data from these tables is
> consolidated and moved to another table for reporting purposes. There
> exist many instances of these reporting tables and in total they are
> expected to hold about 500 million rows. There are about 200 of these
> reporting tables at the moment with data split among them. When a
> request comes in all these tables are searched.

> While moving to PostgreSQL is it a good idea to move from using
> multiple tables to one table for so many rows?

No. All of the same reasoning applies.

Try to keep each table small enough to fit easily in RAM.

Make sure you specify WITHOUT OIDS on the main data tables.
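As a sketch of that advice (column names are hypothetical): in PostgreSQL releases of this era, tables carried a hidden OID column by default, costing 4 bytes per row, which adds up over hundreds of millions of rows:

```sql
-- WITHOUT OIDS drops the hidden per-row OID column.
-- Columns here are assumed for illustration only.
CREATE TABLE report_001 (
    logged  timestamptz NOT NULL,
    metric  integer     NOT NULL
) WITHOUT OIDS;
```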

Best Regards, Simon Riggs