Re: select count() out of memory - Mailing list pgsql-general

From Erik Jones
Subject Re: select count() out of memory
Date
Msg-id 86DC8709-CFBD-4E63-9605-2A9A567E3679@myemma.com
Whole thread Raw
In response to Re: select count() out of memory  (tfinneid@student.matnat.uio.no)
Responses Re: select count() out of memory  (tfinneid@student.matnat.uio.no)
List pgsql-general
On Oct 25, 2007, at 9:36 AM, tfinneid@student.matnat.uio.no wrote:

>> tfinneid@student.matnat.uio.no wrote:
>>>> tfinneid@student.matnat.uio.no wrote:
>>>>
>>>>>> are a dump of Postgres's current memory allocations and could be
>>>>> useful in
>>>>>> showing if there's a memory leak causing this.
>>>>>
>>>>> The file is 20M, these are the last lines: (the first line
>>>>> continues
>>>>> until ff_26000)
>>>>>
>>>>>
>>>>> idx_attributes_g1_seq_1_ff_4_value7: 1024 total in 1 blocks;
>>>>> 392 free
>>> (0
>>>>> chunks); 632 used
>>>>
>>>> You have 26000 partitions???
>>>
>>> At the moment the db has 55000 partitions, and that's only a fifth
>>> of the
>>> complete volume the system will have in production. The reason I
>>> chose
>>> this solution is that a partition will be loaded with new data every
>>> 3-30
>>> seconds, and all that will be read by up to 15 readers every time
>>> new
>>> data
>>> is available. The data will be approx 2-4TB in production in
>>> total. So
>>> it
>>> will be too slow if I put it in a single table with permanent
>>> indexes.
>>>
>>> I did a test previously, where I created 1 million partitions
>>> (without
>>> data) and I checked the limits of pg, so I think it should be ok.
>>
>> Clearly it's not.
>
> It does not mean my problem has anything to do with the number of
> partitions. It might, or it might not, and that's the problem: the
> cause has not been located yet.
>
> According to the documented limits of pg,
>> The difference could be the memory usage and wastage
>> for all those relcache entries and other stuff.  I would reduce the
>> number of partitions to a more reasonable value (within the tens,
>> most
>> likely)
>
> The db worked fine until it reached perhaps 30-40 thousand partitions.

It depends on how you have the partitions set up and how you're
accessing them.  Are all of these partitions under the same parent
table?  If so, then trying to run a SELECT COUNT(*) against the parent
table is simply insane.  Think about it: you're asking one query to
scan 55000 tables.  What you need to do is partition based on your
access patterns, not what you *think* will help with performance down
the road.  Look into constraint exclusion, whether or not you can
just access child tables directly, and whether you really need all of
these under one logical table.  Also, no matter how you do the
partitioning, once you get up to that many relations in your system,
dumps and restores take a lot longer.
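
As an illustration of what I mean by constraint exclusion and direct
child-table access, here's a minimal sketch using 8.x-style inheritance
partitioning (all table and column names are hypothetical, not the
original poster's schema):

```sql
-- Hypothetical parent table plus one child partition; the CHECK
-- constraint is what lets the planner skip non-matching children.
CREATE TABLE readings (seq integer, payload text);

CREATE TABLE readings_seq_100 (
    CHECK (seq = 100)
) INHERITS (readings);

-- With constraint exclusion enabled, a query through the parent
-- only scans children whose CHECK constraints can match:
SET constraint_exclusion = on;
SELECT count(*) FROM readings WHERE seq = 100;

-- Or, if the application already knows which partition it wants,
-- bypass the parent and its 55000-child planning overhead entirely:
SELECT count(*) FROM readings_seq_100;
```

The point being: the planner still has to consider every child of the
parent table unless you either give it constraints it can exclude on,
or query the child directly.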

Erik Jones
Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com


