Thread: Out of memory error on select

Out of memory error on select

From
Werner Bohl
Date:
I have a fairly large table (21M) records. One field of type varchar(16)
has some duplicate values, which I'm trying to identify.
Executing select dup_field from dup_table group by dup_field having
count(*) > 1 errs with Out of Memory error. Server has 4GB memory, the
backend-process errs after 3.7GB consumed. Is there any work-around that
I may use to get this duplicates?

Explain output:
"HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
"  Filter: (count(*) > 1)"
"  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668
width=20)"

Why is the hash eating so much memory? A fast calc of the memory
occupied by this data is less than 512MB.


TIA,


--
Werner Bohl <WernerBohl@infutor.com>
IDS de Costa Rica S.A.


Re: Out of memory error on select

From
Scott Marlowe
Date:
On Tue, 2005-04-05 at 16:04, Werner Bohl wrote:
> I have a fairly large table (21M) records. One field of type varchar(16)
> has some duplicate values, which I'm trying to identify.
> Executing select dup_field from dup_table group by dup_field having
> count(*) > 1 errs with Out of Memory error. Server has 4GB memory, the
> backend-process errs after 3.7GB consumed. Is there any work-around that
> I may use to get this duplicates?
>
> Explain output:
> "HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
> "  Filter: (count(*) > 1)"
> "  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668
> width=20)"
>
> Why is the hash eating so much memory? A fast calc of the memory
> occupied by this data is less than 512MB.

Have you run analyze across this table?  It looks like either you
haven't or the query planner is making a mistake about how many rows it
expects to get from this.

HashAggregate chews through memory pretty fast, and it best used for
smaller sets, so it's usually a mistake when the planner picks it for
large ones.

Re: Out of memory error on select

From
Greg Stark
Date:
Werner Bohl <WernerBohl@infutor.com> writes:

> Explain output:
> "HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
> "  Filter: (count(*) > 1)"
> "  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668
> width=20)"

If this is just a one-time query just do

set enable_hashagg = off

then run your query.

--
greg