memory usage of group by select - Mailing list pgsql-general

From Anthony
Subject memory usage of group by select
Date
Msg-id 71cd4dd90912291241p57d8ef98w9fd4feda573eb677@mail.gmail.com
Responses Re: memory usage of group by select  (Anthony <osm@inbox.org>)
List pgsql-general
Hi all,

I'm running a GROUP BY query on a table with over a billion rows, and its memory usage seems to grow without bound.  Eventually it exceeds my physical memory and the machine starts swapping.  Here is what I believe to be the relevant information:

My machine has 768 MB of RAM.

shared_buffers = 128MB
work_mem = 8MB # originally higher; I lowered it to try to fix the problem, but that hasn't helped
maintenance_work_mem = 256MB
fsync = off
checkpoint_segments = 30
effective_cache_size = 256MB # originally 512MB; I recently lowered it and, as I expected, that didn't change anything
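
To put those settings next to the 768 MB of RAM, here is a rough sketch of the arithmetic. It assumes that a single SELECT ... GROUP BY should only draw on shared_buffers plus one work_mem allotment, since maintenance_work_mem applies to maintenance commands such as VACUUM and CREATE INDEX, and effective_cache_size is a planner hint that allocates nothing. The function name is made up for illustration.

```python
# Values quoted in the settings above, in MB.
PHYSICAL_RAM_MB = 768
settings_mb = {
    "shared_buffers": 128,
    "work_mem": 8,
    "maintenance_work_mem": 256,   # used by VACUUM, CREATE INDEX, etc.
    "effective_cache_size": 256,   # planner hint only, allocates nothing
}


def expected_query_budget_mb(settings):
    """Assumed budget for one GROUP BY query: shared_buffers + one work_mem.

    This is a simplification for illustration, not how the server
    actually accounts for memory.
    """
    return settings["shared_buffers"] + settings["work_mem"]


budget = expected_query_budget_mb(settings_mb)
headroom = PHYSICAL_RAM_MB - budget
print(f"expected budget: {budget} MB, headroom: {headroom} MB")
```

Under that assumption the configured budget sits well below physical memory, which is why the unbounded growth is surprising.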

data=# explain select pid, min(oid) into nd_min from nd group by pid;
                               QUERY PLAN
------------------------------------------------------------------------
 HashAggregate  (cost=28173891.00..28174955.26 rows=85141 width=8)
   ->  Seq Scan on nd  (cost=0.00..21270355.00 rows=1380707200 width=8)
(2 rows)
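
One possible reading of this plan, offered as a guess rather than a diagnosis: the HashAggregate node keeps one hash-table entry per distinct pid, and the planner expects only 85141 of them. If the real number of groups is much larger, the table grows with it. The sketch below shows how the size scales. The 64 bytes per entry and the two larger group counts are hypothetical figures chosen for illustration, not measurements.

```python
ESTIMATED_GROUPS = 85_141   # rows= on the HashAggregate node in the plan
BYTES_PER_ENTRY = 64        # hypothetical per-group overhead, not measured
WORK_MEM_MB = 8             # work_mem from the settings above


def hash_table_mb(groups, bytes_per_entry=BYTES_PER_ENTRY):
    """Approximate hash table size in MB for a given number of groups."""
    return groups * bytes_per_entry / (1024 * 1024)


# The first count is the planner's estimate; the others are made-up
# "what if the estimate is far too low" scenarios.
for groups in (ESTIMATED_GROUPS, 10_000_000, 100_000_000):
    size = hash_table_mb(groups)
    verdict = "fits in" if size <= WORK_MEM_MB else "exceeds"
    print(f"{groups:>12,} groups -> {size:10.1f} MB ({verdict} work_mem)")
```

With the planner's estimate the table fits inside work_mem under these assumptions, which would explain why a hash plan was chosen; a much larger actual group count would not fit.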

data=# \d+ nd
            Table "fullplanet091207osm.nd"
 Column |  Type   | Modifiers | Storage | Description
--------+---------+-----------+---------+-------------
 oid    | integer | not null  | plain   |
 pid    | integer | not null  | plain   |
 ref    | integer |           | plain   |
Indexes:
    "nd_pkey" PRIMARY KEY, btree (pid, oid)
Has OIDs: no

VERSION = 'PostgreSQL 8.4.1 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.1-3ubuntu3) 4.4.1, 64-bit'
