BUG #4264: Optimizer fails to use hash_aggregate when appropriate. - Mailing list pgsql-bugs
From | Scott Carey |
---|---|
Subject | BUG #4264: Optimizer fails to use hash_aggregate when appropriate. |
Date | |
Msg-id | 200806251800.m5PI0XbB085962@wwwmaster.postgresql.org Whole thread Raw |
Responses |
Re: BUG #4264: Optimizer fails to use hash_aggregate when appropriate.
|
List | pgsql-bugs |
The following bug has been logged online: Bug reference: 4264 Logged by: Scott Carey Email address: scott@richrelevance.com PostgreSQL version: 8.3.3 Operating system: Linux (CentOS) Description: Optimizer fails to use hash_aggregate when appropriate. Details: The query optimizer fails to use a hash aggregate most of the time. This is an inconsistent behavior. On one particular table this is especially painful. This table has 24 million rows, and when aggregating on a column that the optimizer expects only a few unique values, it chooses a full sort of those 24 million rows before a group aggregate, rather than using a hash aggregate that would be 2 to 3 orders of magnitude faster and use less memory. The simple statement of this bug is the following EXPLAIN output and corresponding output from the statistics tables. The actual query used has a more complicated GROUP BY and aggregation (and joins, etc). The condition will occur for any column used to group by. Even one that has only two unique values in a 25 million row table. rr=# explain SELECT count(distinct v_guid) as view_count, p_type FROM p_log.creative_display_logs_012_2008_06_15 GROUP BY p_type; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------- GroupAggregate (cost=5201495.80..5395385.38 rows=7 width=47) -> Sort (cost=5201495.80..5266125.63 rows=25851932 width=47) Sort Key: p_type -> Seq Scan on creative_display_logs_012_2008_06_15 (cost=0.00..1223383.32 rows=25851932 width=47) rr=# select attname, null_frac, avg_width,n_distinct ,correlation from pg_stats where tablename='creative_display_logs_012_2008_06_15' and attname in ('g_id', 'p_type', 'strat', 'datetime', 'ext_s_id', 't_id'); attname | null_frac | avg_width | n_distinct | correlation ----------------+-----------+-----------+------------+-------------- g_id | 0 | 8 | 14 | 0.221548 p_type | 0 | 4 | 7 | 0.350718 datetime | 0 | 8 | 12584 | 0.977156 ext_s_id | 0.001 | 38 | 11444 | -0.000842848 strat | 0 | 13 | 11 | 0.147418 t_id | 0 | 8 | 2 | 0.998711 (5 rows) I have dumped, dropped, and restored this table twice recently. Both times followed by a full vacuum analyze. And in both cases the query optimizer behaves differently. Additionally, this table is a partition table, and using the inheritance facade instead of the table produces consistently worse plans -- at first the direct-to-table query used the hash aggregate but not the one through inheritance and I thought this was a partitioning bug. But it definitely occurs in general and its reproducibility is affected by partitioning but not dependent on it. The database is tuned with the default optimizer settings for 8.3.3 + constraint exclusion for the partition tables. Yes, hash_agg is ON. It happens sometimes on some tables. The configuration has ample RAM and all the memory tuning parameters are generous (shared_mem 7g, temp space 200m, sort/agg space 500m -- I've tried various settings here with no effect on the plan, just the execution of it w.r.t. disk based sort or mem based sort). The table definition is: Column | Type | Modifiers --------------------+-----------------------------+----------- v_guid | character varying(255) | site_id | bigint | c_id | bigint | item_id | bigint | creative_id | bigint | camp_id | bigint | p_type | integer | datetime | timestamp without time zone | date | date | ext_u_id | character varying(50) | ext_s_id | character varying(50) | u_guid | character varying(50) | strat | character varying(50) | sub_p_type | character varying(32) | exp_id | bigint | t_id | bigint | htmlpi_id | bigint | p_score | double precision | Of course DB hints would solve this. So would some sort of tuning parameter that lets you dial up or down the tendency to do a hash aggregate rather than a full sort followed by a group aggregate. This is broken rather severely, especially in combination with partitions (where it is about 3x as likely to fail to use a hash_aggregate where appropriate in limited experiments so far -- there are a few thousand partition tables). All I want is it to stop being brain-dead and deciding to sort large tables to produce aggregates. In fact, given the rarity in which a sort is preferred over a hash_agg with large tables -- i'd turn off the group aggregate if possible! I have yet to look for a work-around by changing the statistics targets for the table, but I consider this a bug because even with the default sample size, the statistics given should PLAINLY lead to use of a hash_aggregate rather than a full sort followed by group aggregate. The optimizer clearly expects a small number of unique buckets after aggregation, and the number of unique items in the result would have to approach the number of rows in the table for a full sort to make any sense whatsoever. Thanks! -Scott
pgsql-bugs by date: