Brian Cox <brian.cox@ca.com> writes:
> Tom Lane [tgl@sss.pgh.pa.us] wrote:
>> Um, are you sure that is the query that PID 7397 is running? It doesn't
>> match your previous pg_stat_activity printout, nor do I see anything
>> about partitioning by PKs.
> Umm, indeed. I had to construct the query by hand and left out the
> partition part. Here's the full query. Also, I took the liberty of
> reducing the completely expanded column list (shown in part in the
> pg_stat_activity printout) in the actual query to "*".
Okay ... I think the problem is right here:
> Nested Loop IN Join (cost=82370.45..128489.59 rows=1 width=779)
> Join Filter: (b.ts_id = a.ts_id)
> -> Index Scan using ts_stats_transetgroup_user_daily_pkey on
> ts_stats_transetgroup_user_daily a (cost=0.00..8.22 rows=1 width=779)
> Index Cond: ((ts_id > 0) AND (ts_id < 100000))
> -> Hash Join (cost=82370.45..127026.87 rows=232721 width=8)
It's choosing this plan shape because it thinks that the indexscan on
ts_stats_transetgroup_user_daily will return only one row, which I bet
is off by something close to 100000x. The memory usage pulsation
corresponds to re-executing the inner hash join, from scratch (including
rebuilding its hash table) for each outer row. Ouch.
This seems like kind of a stupid plan anyway (which PG version was this
exactly?) but certainly the big issue is the catastrophically bad
rowcount estimate for the indexscan. Do you have ANALYZE stats for
ts_stats_transetgroup_user_daily at all (look in pg_stats)?
regards, tom lane