Re: help with query - Mailing list pgsql-performance
From | Brad Bulger |
---|---|
Subject | Re: help with query |
Date | |
Msg-id | 4124AD68.8000208@madfish.com Whole thread Raw |
In response to | help with query (Dave Cramer <pg@fastcrypt.com>) |
Responses |
Re: help with query
Re: help with query |
List | pgsql-performance |
You're doing a join except not, is the trouble, looks like. The query is really "FROM Tickets main, Links", but when Tickets.id is 17417, you've got no join to the Links table. So you end up getting every row in Links for each row in Tickets with id = 17417. I'd think this wants to be two queries or a union: SELECT distinct main.oid,main.* FROM Tickets main WHERE (main.EffectiveId = main.id) AND (main.Status != 'deleted') AND ( (main.Type = 'ticket') OR (main.Type = 'subticket') ) AND ( (main.Queue = '9') ) AND ( (main.id = '17417')) union SELECT distinct main.oid,main.* FROM Tickets main, Links WHERE (main.EffectiveId = main.id) AND (main.Status != 'deleted') AND ( (main.Type = 'ticket') OR (main.Type = 'subticket') ) AND ( (main.Queue = '9') ) AND ( (Links.Type = 'MemberOf') ) AND ( (Links.LocalTarget = '17417') ) AND ( (main.id = Links.LocalBase) ) OR (main.id = Links.LocalTarget) ) ; or else, yah, a subquery: [...] AND ( main.id = '17417' or exists( select true from Links where Type = 'MemberOf' and LocalTarget = '17417' and (LocalBase = main.id or LocalTarget = main.id) ) ) Those are the only things I can think of to make it work, anyways. Dave Cramer wrote: > RT uses a query like: > > SELECT distinct main.oid,main.* FROM Tickets main > WHERE > (main.EffectiveId = main.id) > AND > (main.Status != 'deleted') > AND > ( (main.Type = 'ticket') OR (main.Type = 'subticket') ) > AND > ( (main.Queue = '9') ) > AND (( > ( (Links.Type = 'MemberOf') AND (Links.LocalTarget = '17417') AND (main.id = Links.LocalBase) ) > OR > ( (Links.Type = 'MemberOf') AND (Links.LocalBase = '17417') AND (main.id = Links.LocalTarget) ) > or > (main.id = '17417') > ) > ); > > > which produces a query plan: > > Nested Loop (cost=0.00..813.88 rows=1 width=169) > Join Filter: (((("inner"."type")::text = 'MemberOf'::text) OR ("outer".id = 17417)) AND (("inner".localbase = 17417)OR (("inner"."type")::text = 'MemberOf'::text) OR ("outer".id > = 17417)) AND (("outer".id = "inner".localtarget) OR (("inner"."type")::text = 'MemberOf'::text) OR ("outer".id = 17417))AND ((("inner"."type")::text = 'MemberOf'::text) OR ("inner" > .localtarget = 17417) OR ("outer".id = 17417)) AND (("inner".localbase = 17417) OR ("inner".localtarget = 17417) OR ("outer".id= 17417)) AND (("outer".id = "inner".localtarget) OR ( > "inner".localtarget = 17417) OR ("outer".id = 17417)) AND ((("inner"."type")::text = 'MemberOf'::text) OR ("outer".id ="inner".localbase) OR ("outer".id = 17417)) AND (("inner".loca > lbase = 17417) OR ("outer".id = "inner".localbase) OR ("outer".id = 17417)) AND (("outer".id = "inner".localtarget) OR("outer".id = "inner".localbase) OR ("outer".id = 17417))) > -> Index Scan using tickets1 on tickets main (cost=0.00..657.61 rows=1 width=169) > Index Cond: (queue = 9) > Filter: ((effectiveid = id) AND ((status)::text <> 'deleted'::text) AND ((("type")::text = 'ticket'::text) OR(("type")::text = 'subticket'::text))) > -> Seq Scan on links (cost=0.00..46.62 rows=1462 width=20) > > If I rewrite the query as: > > SELECT main.* FROM Tickets main > WHERE > (main.EffectiveId = main.id) > AND > (main.Status != 'deleted') > AND > ( (main.Type = 'ticket') OR (main.Type = 'subticket') ) > AND > ( (main.Queue = '9') ) > AND ( > 17417 in (select links.localtarget from links where links.type='MemberOf' and main.id=links.localbase) > or > 17417 in ( select links.localbase from links where links.type='MemberOf' and main.id=links.localtarget) > or > main.id = '17417' > ) > ; > > The time for the query goes from 1500ms to 15ms. The two OR clauses > > ( (Links.Type = 'MemberOf') AND (Links.LocalTarget = '17417') AND (main.id = Links.LocalBase) ) > OR > ( (Links.Type = 'MemberOf') AND (Links.LocalBase = '17417') AND (main.id = Links.LocalTarget) ) > > don't contribute to the result set in this particular dataset, which is why the speed increases so dramatically. > > Is there a way to rewrite the top query to get the same results? I have already talked to Best Practical, > and subqueries are not easily embraced. > > Dave
pgsql-performance by date: