Re: select count() out of memory - Mailing list pgsql-general

From Tom Lane
Subject Re: select count() out of memory
Date
Msg-id 18094.1193331308@sss.pgh.pa.us
In response to Re: select count() out of memory  (tfinneid@student.matnat.uio.no)
Responses Re: select count() out of memory  (Thomas Finneid <tfinneid@student.matnat.uio.no>)
List pgsql-general
tfinneid@student.matnat.uio.no writes:
>> In other words, you really should have only one table; they aren't
>> independent.  What you need to do is dial down your ideas of how many
>> partitions are reasonable to have.

> Yes, but no. Each partition represents a chunk of information on a
> discrete timeline. So there is no point in grouping it all into a single
> table, because the access pattern is to only access data from a specific
> point in time, i.e. a single partition, usually the latest. Since the
> amount of data is so big, approx 3MB per second, and each partition needs
> to be indexed before the clients start reading the data (in the same
> second), I find it's better to use partitions, even though I am not
> actually using it.

You are making a common beginner error, which is to suppose that N
little tables are better than one big one.  They are not.  What you're
effectively doing is replacing the upper levels of a big table's indexes
with lookups in the system catalogs, which in point of fact is a
terrible tradeoff from a performance standpoint.
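
To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch in Python. The fanout, row count, and partition count are all assumed figures chosen for illustration, not measurements from this thread, and btree_levels is a hypothetical helper rather than anything PostgreSQL provides.

```python
def btree_levels(rows: int, fanout: int) -> int:
    """Approximate number of B-tree levels needed to index `rows` entries,
    assuming every page holds `fanout` entries."""
    levels = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels

# All three figures below are assumptions for illustration only.
FANOUT = 256                   # assumed index entries per page
TOTAL_ROWS = 10_000_000_000    # assumed total rows across all the data
PARTITIONS = 10_000            # assumed number of little tables

big = btree_levels(TOTAL_ROWS, FANOUT)                   # one big table
small = btree_levels(TOTAL_ROWS // PARTITIONS, FANOUT)   # one of N little tables

print(f"one big table:    {big} index levels")
print(f"one little table: {small} index levels")
print(f"levels saved per lookup: {big - small}")
```

Under these assumptions, splitting the data 10,000 ways removes only a couple of index levels per lookup, and in exchange every query has to resolve its table among 10,000 catalog entries first.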

From a database-theory standpoint, if all this data is alike then you
should have it all in one big table.  There are certain practical cases
where it's worth partitioning, but not at the level of granularity that
you are proposing.  This is why nobody, not even Oracle, tries to
support tens of thousands of partitions.
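
A minimal sketch of the one-big-table layout follows. It uses Python's built-in sqlite3 module purely so the example is self-contained; SQLite is a stand-in here, not PostgreSQL, and the table, column, and index names (samples, chunk_ts, payload, samples_chunk_ts_idx) are hypothetical. The point in time that would have selected a partition becomes an ordinary indexed column.

```python
import sqlite3

# In-memory SQLite database, used only as a self-contained stand-in.
conn = sqlite3.connect(":memory:")

# One table for all the data; chunk_ts is the hypothetical point-in-time key.
conn.execute(
    "CREATE TABLE samples (chunk_ts INTEGER NOT NULL, payload TEXT NOT NULL)"
)
conn.execute("CREATE INDEX samples_chunk_ts_idx ON samples (chunk_ts)")

# Load 1000 time chunks of 5 rows each (made-up data).
rows = [(ts, f"reading-{ts}-{i}") for ts in range(1000) for i in range(5)]
conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)

# Clients read a single chunk, usually the latest one.
latest_ts = conn.execute("SELECT max(chunk_ts) FROM samples").fetchone()[0]
latest = conn.execute(
    "SELECT payload FROM samples WHERE chunk_ts = ? ORDER BY payload",
    (latest_ts,),
).fetchall()

# Ask the planner how it would find that chunk.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT payload FROM samples WHERE chunk_ts = ?",
    (latest_ts,),
).fetchall()

print(latest_ts, len(latest))
for step in plan:
    print(step)
```

The access pattern described above (read one chunk, usually the newest) is served by a single index lookup on chunk_ts, with no per-chunk table to create or look up.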

            regards, tom lane
