Thread: Out of memory error on select
I have a fairly large table (21M records). One field of type varchar(16)
has some duplicate values, which I'm trying to identify.

Executing

    select dup_field from dup_table group by dup_field having count(*) > 1

fails with an out-of-memory error. The server has 4GB of memory, and the
backend process dies after consuming 3.7GB. Is there any work-around I can
use to find these duplicates?

Explain output:

"HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
"  Filter: (count(*) > 1)"
"  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668 width=20)"

Why is the hash eating so much memory? A quick calculation of the memory
occupied by this data gives less than 512MB.

TIA,

--
Werner Bohl <WernerBohl@infutor.com>
IDS de Costa Rica S.A.
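For reference, a rough version of that back-of-envelope calculation, taking
the width=20 bytes per row shown in the plan and the 21M row count stated
above (raw data only, ignoring any per-entry hash-table overhead):

    21,000,000 rows x 20 bytes/row = ~420 MB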
On Tue, 2005-04-05 at 16:04, Werner Bohl wrote:
> I have a fairly large table (21M records). One field of type varchar(16)
> has some duplicate values, which I'm trying to identify.
> Executing select dup_field from dup_table group by dup_field having
> count(*) > 1 fails with an out-of-memory error. The server has 4GB of
> memory, and the backend process dies after consuming 3.7GB. Is there any
> work-around I can use to find these duplicates?
>
> Explain output:
> "HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
> "  Filter: (count(*) > 1)"
> "  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668
> width=20)"
>
> Why is the hash eating so much memory? A quick calculation of the memory
> occupied by this data gives less than 512MB.

Have you run ANALYZE on this table? It looks like either you haven't, or
the query planner is making a mistake about how many rows to expect from
it. HashAggregate chews through memory pretty fast and is best used for
smaller sets, so it's usually a mistake when the planner picks it for
large ones.
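A minimal sketch of that check in plain SQL (using the dup_table/dup_field
names from the original query; the EXPLAIN output suggests the real table
is called lssi_base):

    -- refresh the planner's statistics for this table
    ANALYZE dup_table;

    -- re-check the plan: the Seq Scan row estimate should now be close
    -- to the real 21M, and with a realistic estimate of the number of
    -- groups the planner may stop choosing HashAggregate altogether
    EXPLAIN
    SELECT dup_field
    FROM dup_table
    GROUP BY dup_field
    HAVING count(*) > 1;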
Werner Bohl <WernerBohl@infutor.com> writes:

> Explain output:
> "HashAggregate  (cost=881509.02..881510.02 rows=200 width=20)"
> "  Filter: (count(*) > 1)"
> "  ->  Seq Scan on lssi_base  (cost=0.00..872950.68 rows=1711668
> width=20)"

If this is just a one-time query, just do

    set enable_hashagg = off;

then run your query.

--
greg
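For completeness, a minimal sketch of that work-around as a self-contained
session (again assuming the dup_table/dup_field names from the original
query):

    -- disable hash aggregation for this session only
    SET enable_hashagg = off;

    -- the planner should now fall back to a sort-based GroupAggregate,
    -- which can spill to disk instead of growing an in-memory hash
    SELECT dup_field
    FROM dup_table
    GROUP BY dup_field
    HAVING count(*) > 1;

    -- restore the default once the duplicates have been found
    RESET enable_hashagg;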