Re: Out of memory error during large hashagg - Mailing list pgsql-bugs

From Casey Duncan
Subject Re: Out of memory error during large hashagg
Date
Msg-id B1C34EEB-6102-489F-B4F1-884059F61251@pandora.com
Whole thread Raw
In response to Re: Out of memory error during large hashagg  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Out of memory error during large hashagg
List pgsql-bugs
I posted that in a subsequent mail, but here it is again:

> I'm interested in collecting info on the distribution of data.
> Can you post:
>
> select tablename, attname, n_distinct from pg_stats
> where attname = 'st_id';
>

   tablename   | attname | n_distinct
--------------+---------+------------
st            | st_id   |         -1
seed          | st_id   |     164656
feed          | st_id   |      14250
book          | st_id   |      14856
legacy_st     | st_id   |         -1
(5 rows)

I ran analyze after this, but the results were roughly the same.


> select count(distinct st_id) from seed;
>

   count
----------
40418083
(1 row)

Looks a tad bit different than the above ;^)

> and also the table definition, including the PK
>

Table "public.seed"
     Column    |            Type             |   Modifiers
--------------+-----------------------------+---------------
seed_id      | bigint                      | not null
mc_id        | character varying(20)       |
st_id        | bigint                      |
date_created | timestamp without time zone | default now()
Indexes:
     "seed_pkey" PRIMARY KEY, btree (seed_id)
     "seed_st_mc_id_idx" UNIQUE, btree (mc_id, st_id)
     "seed_mc_id_idx" btree (mc_id)
     "seed_st_id" btree (st_id)
Foreign-key constraints:
     "seed_st_id_fkey" FOREIGN KEY (st_id) REFERENCES st(st_id) ON
DELETE RESTRICT

I imagine this means I need to crank up the statistics on that column.

Thanks

-Casey


On Sep 23, 2006, at 3:17 PM, Tom Lane wrote:

> Casey Duncan <casey@pandora.com> writes:
>> select st_id, min(seed_id) as "initial_seed_id", count(*) as
>> "seed_count" from seed group by st_id;
>
>> The query plan and table stats are:
>
>>                                QUERY PLAN
>> ---------------------------------------------------------------------
>> --
>> HashAggregate  (cost=1362694.83..1365164.68 rows=164656 width=16)
>>     ->  Seq Scan on seed  (cost=0.00..964065.62 rows=53150562
>> width=16)
>
> How many distinct st_id values are there really?  The planner's
> evidently expecting 164656 but I suppose that's wrong?  What's
> in pg_stats for st_id?
>
>             regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Out of memory error during large hashagg
Next
From: Tom Lane
Date:
Subject: Re: Out of memory error during large hashagg