Re: horrendous query challenge :-) - Mailing list pgsql-general

From Tom Lane
Subject Re: horrendous query challenge :-)
Date
Msg-id 26423.1022711722@sss.pgh.pa.us
Whole thread Raw
In response to horrendous query challenge :-)  (Fran Fabrizio <ffabrizio@mmrd.com>)
List pgsql-general
Fran Fabrizio <ffabrizio@mmrd.com> writes:
> I'm posting this sort of as a learning exercise to see how others
> approach a slow query and what steps they take to optimize it.

You haven't really given us enough information.  For starters, do the
row count estimates shown in EXPLAIN have any relationship to reality?
(Posting EXPLAIN ANALYZE results, if you are on 7.2, would help answer
that question.)  Possibly even more relevant is what the functions used
in the view definition do --- it's not unlikely that the function
evaluations are where much of the time goes.

> monitor=# explain select
> cs.entity_id,e.type,e.name,w.interface_label,cs.remote_ts,s.name,r.name
> from current_status_test cs, entity e, site s, region r,watch w where
> cs.entity_id in (select entity_id from entity where active=true) and
> cs.current = false and cs.msgtype=w.msg_type and e.entity_id =
> cs.entity_id and s.region_id = r.region_id and cs.site_id = s.site_id
> order by cs.remote_ts desc, r.name, s.name;

IN is almost always bad news.  Is entity.entity_id a unique key?  If so
I'd think you could rewrite this into a join ... but wait, you're
*already* joining to entity.  Isn't the
    cs.entity_id in (select entity_id from entity where active=true)
clause replaceable by just
    e.active = true
given that you have e.entity_id = cs.entity_id in there already?

> View definition: SELECT findsite(e.entity_id) AS site_id, e.entity_id,
> get_status(e.entity_id, e.watch_id) AS status, e.watch_id, e.msg_type AS
> msgtype, cat.name AS msgcat, 'Ok' AS message, now() AS local_ts,
> e.last_contact AS remote_ts, e.current FROM entity_watch_map e,
> classifications class, categories cat WHERE (((e.last_contact >
> "timestamp"((date(now()) - 180))) AND (e.msg_type = class.msg_type)) AND
> (class.category_id = cat.category_id));

The e.last_contact > "timestamp"((date(now()) - 180)) clause will not be
indexable without some rework (at least not in PG 7.2 and earlier ---
this will be a nonissue once 7.3 comes out).  I am not sure how much
that matters; the clause may not be selective enough to justify trying
to indexscan on last_contact anyhow.  Again it's tough to say much in
advance of seeing EXPLAIN ANALYZE results.

            regards, tom lane

pgsql-general by date:

Previous
From: Manuel Sugawara
Date:
Subject: Re: Changing ownership of objects
Next
From: Richard Poole
Date:
Subject: Re: Query plan w/ like clause question