Thread: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Ow Mun Heng
Date:
Query2 is way faster mainly because the plan does not choose a seq scan on a table w/ >20million rows. The only difference between Query 1 and query 2 is that the audio_key_dtime is chosen from a table rather than provided on the query. I'm not sure why this is the case and why it chooses such plans. (should I be posting to pg-performance?) Query1 ===== select foo, bar, foobar from A, join B on a.id = b.id join C on c.id = b.id and c.start_dtime = b.start_dtime where audit_key_dtime >= (select last_refreshed from denorm_log where tablename = 'zon') and audit_key_dtime < (select last_refreshed + refresh_interval from denorm_log where tablename = 'zon') Query2 ===== select foo, bar, foobar from A, join B on a.id = b.id join C on c.id = b.id and c.start_dtime = b.start_dtime where audit_key_dtime >= '2007-08-08 18:00:00' and audit_key_dtime < '2007-08-08 18:01:00' structure of denorm_log (very small table, max 10 rows) ======================= Table "xmms.denorm_log" Column | Type | Modifiers -------------------------+-----------------------------+--------------- table_name | text | not null denorm_table_name | text | not null last_refreshed | timestamp without time zone | refresh_interval | interval | record_update_date_time | timestamp without time zone | default now() sql_delete | text | sql_insert | text | Indexes: "denorm_log_pkey" PRIMARY KEY, btree (table_name) Plan 1 ====== HashAggregate (cost=478784.63..478784.92 rows=1 width=139) (actual time=60593.583..60601.532 rows=82 loops=1) InitPlan -> Index Scan using denorm_log_pkey on denorm_log (cost=0.00..8.27 rows=1 width=8) (actual time=0.040..0.045 rows=1loops=1) Index Cond: (table_name = 'zon'::text) -> Index Scan using denorm_log_pkey on denorm_log (cost=0.00..8.27 rows=1 width=24) (actual time=0.015..0.016 rows=1loops=1) Index Cond: (table_name = 'zon'::text) -> Nested Loop (cost=387334.78..478768.01 rows=1 width=139) (actual time=46024.153..59479.035 rows=1545 loops=1) -> Hash Join (cost=387334.78..478759.22 rows=1 width=109) (actual time=45999.776..58651.991 rows=1545 loops=1) Hash Cond: (((trz.id)::text = (ts.id)::text) AND (trz.start_dtime = ts.start_dtime)) -> Index Scan using idx_trz_uptime on zon trz (cost=0.00..3252.87 rows=65008 width=54) (actual time=25.905..240.211rows=1545 loops=1) Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1)) Filter: (pber_err_rate_hd_zn_2 <> 0::numeric) -> Hash (cost=242589.91..242589.91 rows=5518191 width=63) (actual time=45873.643..45873.643 rows=5518191loops=1) -> Seq Scan on ts (cost=0.00..242589.91 rows=5518191 width=63) (actual time=355.084..28490.120 rows=5518191loops=1) -> Index Scan using d_pkey on drive d (cost=0.00..8.70 rows=1 width=62) (actual time=0.506..0.510 rows=1 loops=1545) Index Cond: ((d.id)::text = (trz.id)::text) Total runtime: 60607.891 ms Plan 2 - No Seq Scans ====== HashAggregate (cost=28.16..28.46 rows=1 width=139) (actual time=2240.899..2248.964 rows=82 loops=1) -> Nested Loop (cost=0.00..28.08 rows=1 width=139) (actual time=485.178..1175.938 rows=1545 loops=1) -> Nested Loop (cost=0.00..19.29 rows=1 width=109) (actual time=485.032..1125.322 rows=1545 loops=1) -> Index Scan using idx_trz_uptime on zon trz (cost=0.00..9.70 rows=1 width=54) (actual time=24.481..33.667rows=1545 loops=1) Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time zone) AND (audit_key_dtime< '2007-08-08 18:01:00'::timestamp without time zone)) Filter: (web_id <> 0::numeric) -> Index Scan using idx_ts_sn_sdate_ttype on ts (cost=0.00..9.58 rows=1 width=63) (actual time=0.696..0.698rows=1 loops=1545) Index Cond: (((ts.id)::text = (trz.id)::text) AND (ts.start_dtime = trz.start_dtime)) -> Index Scan using d_pkey on d (cost=0.00..8.70 rows=1 width=62) (actual time=0.012..0.015 rows=1 loops=1545) Index Cond: ((d.id)::text = (trz.id)::text) Total runtime: 2250.075 ms
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Richard Huxton
Date:
Ow Mun Heng wrote: > Query2 is way faster mainly because the plan does not choose a seq scan > on a table w/ >20million rows. > The only difference between Query 1 and query 2 is that the > audio_key_dtime is chosen from a table rather than provided on the > query. > > I'm not sure why this is the case and why it chooses such plans. > (should I be posting to pg-performance?) Your query plans don't seem to match your queries. That makes it difficult to provide meaningful advice. -- Richard Huxton Archonet Ltd
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Ow Mun Heng
Date:
On Wed, 2007-10-17 at 20:37 +0100, Richard Huxton wrote: > Ow Mun Heng wrote: > > Query2 is way faster mainly because the plan does not choose a seq scan > > on a table w/ >20million rows. > > The only difference between Query 1 and query 2 is that the > > audio_key_dtime is chosen from a table rather than provided on the > > query. > > > > I'm not sure why this is the case and why it chooses such plans. > > (should I be posting to pg-performance?) > > Your query plans don't seem to match your queries. That makes it > difficult to provide meaningful advice. > > Well, then that makes both you and me(both) stumped. because the 2 queries are exactly the same except for the data part. Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1)) Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time zone) AND (audit_key_dtime < '2007-08-08 18:01:00'::timestamp without time zone)) This is _the_ only difference between the 2 queries where on one, the dates are provided, and the other is selected from a table. I have no idea why the plans are so different between the two.
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Alvaro Herrera
Date:
Ow Mun Heng wrote: > Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1)) > > > Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time zone) > AND (audit_key_dtime < '2007-08-08 18:01:00'::timestamp without time zone)) > > This is _the_ only difference between the 2 queries where on one, the > dates are provided, and the other is selected from a table. > > I have no idea why the plans are so different between the two. The difference is that it has to consider the worst possibility in the second case, whereas the other one knows the interval is just one minute. I didn't read the entire thread but I've seen similar things go much better when you grab the constants beforehand and interpolate them into the query by yourself. Yes, it's really ugly. -- Alvaro Herrera http://www.flickr.com/photos/alvherre/ "Crear es tan difícil como ser libre" (Elsa Triolet)
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Ow Mun Heng
Date:
On Wed, 2007-10-17 at 22:47 -0300, Alvaro Herrera wrote: > Ow Mun Heng wrote: > > > Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1)) > > > > > > Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time zone) > > AND (audit_key_dtime < '2007-08-08 18:01:00'::timestamp without time zone)) > > > > This is _the_ only difference between the 2 queries where on one, the > > dates are provided, and the other is selected from a table. > > > > I have no idea why the plans are so different between the two. > > The difference is that it has to consider the worst possibility in the > second case, whereas the other one knows the interval is just one minute. Actually the dates are just 1 min apart in both cases. > > I didn't read the entire thread but I've seen similar things go much > better when you grab the constants beforehand and interpolate them into > the query by yourself. Yes, it's really ugly. > Yeah.. In the end, I just made a hack of it and did sort of like a dymanic sql. (it is a dynamic sql anyway) where I selected the dates into a variable (the whole query is encapsulated as a function ran at x intervals) and then use query_string := replace(query_string,'fromdate',date_inputed_into_variable) query_string := replace(query_string,'todate',date_inputed_into_variable+refresh_interval) and that got me much better performance.
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Richard Huxton
Date:
Ow Mun Heng wrote: > On Wed, 2007-10-17 at 20:37 +0100, Richard Huxton wrote: >> Your query plans don't seem to match your queries. That makes it >> difficult to provide meaningful advice. > > Well, then that makes both you and me(both) stumped. because the 2 > queries are exactly the same except for the data part. Your queries were like this: > select > foo, > bar, > foobar > from > A, > join B > on a.id = b.id > join C > on c.id = b.id > and c.start_dtime = b.start_dtime > where audit_key_dtime >= '2007-08-08 18:00:00' > and audit_key_dtime < '2007-08-08 18:01:00' Your plans contained these: > Filter: (pber_err_rate_hd_zn_2 <> 0::numeric) > Filter: (web_id <> 0::numeric) I don't see the <> 0 condition anywhere in your SQL - presumably there's a view involved somehwere? -- Richard Huxton Archonet Ltd
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Ow Mun Heng
Date:
On Thu, 2007-10-18 at 07:28 +0100, Richard Huxton wrote: > Ow Mun Heng wrote: > > On Wed, 2007-10-17 at 20:37 +0100, Richard Huxton wrote: > >> Your query plans don't seem to match your queries. That makes it > >> difficult to provide meaningful advice. > > > > Well, then that makes both you and me(both) stumped. because the 2 > > queries are exactly the same except for the data part. > > Your queries were like this: > > > select > > foo, > > bar, > > foobar > > from > > A, > > join B > > on a.id = b.id > > join C > > on c.id = b.id > > and c.start_dtime = b.start_dtime > > where audit_key_dtime >= '2007-08-08 18:00:00' > > and audit_key_dtime < '2007-08-08 18:01:00' and web_id <> 0 > > > Your plans contained these: > > > Filter: (pber_err_rate_hd_zn_2 <> 0::numeric) > > > Filter: (web_id <> 0::numeric) > > I don't see the <> 0 condition anywhere in your SQL - presumably there's > a view involved somehwere? My apologies, in an effort to disguise the column_names, I missed out the 1st one. There is no View it's just a filtering condition. Having said that, the only difference between the 2 queries are just the dates. (or the way the dates are provided to the query..)
Re: Poor Plan selected w/ not provided a date/time but selecting date/time from a table
From
Alvaro Herrera
Date:
Ow Mun Heng wrote: > > On Wed, 2007-10-17 at 22:47 -0300, Alvaro Herrera wrote: > > Ow Mun Heng wrote: > > > > > Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1)) > > > > > > > > > Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time zone) > > > AND (audit_key_dtime < '2007-08-08 18:01:00'::timestamp without time zone)) > > > > > > This is _the_ only difference between the 2 queries where on one, the > > > dates are provided, and the other is selected from a table. > > > > > > I have no idea why the plans are so different between the two. > > > > The difference is that it has to consider the worst possibility in the > > second case, whereas the other one knows the interval is just one minute. > > Actually the dates are just 1 min apart in both cases. Of course. My point is that the planner doesn't know that in the first case. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.