Re: Query optimizing - Mailing list pgsql-sql
From | Helio Campos Mello de Andrade |
---|---|
Subject | Re: Query optimizing |
Date | |
Msg-id | 29e3942f0811100348q542b0ceck5a6af51296a4f45a@mail.gmail.com Whole thread Raw |
In response to | Query optimizing ("Sebastian Ritter" <ritter.sebastian@gmail.com>) |
Responses |
Re: Query optimizing
|
List | pgsql-sql |
Hi Sebastian,<br /><br /> - First of all i think there is an open-parenthesis missing in the query V2.<br />Maybe in theV2 version you cold restrict the results in the INNER query a bit more if you use a restriction clause like "WHERE n_issue= <a href="http://i.id/" target="_blank">i.id</a>" in that. It will certainly lower the number of rows returned byit to only 1 result. <br /><br />Regards<br /><br />--<br />Helio Campos Mello de Andrade<br /> <br /><br /><div class="gmail_quote">OnMon, Nov 10, 2008 at 8:43 AM, Sebastian Ritter <span dir="ltr"><<a href="mailto:ritter.sebastian@gmail.com">ritter.sebastian@gmail.com</a>></span>wrote:<br /><blockquote class="gmail_quote"style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi all,<br/><br />I was hoping to receive some advise on a slow running query in our business' Issue Tracking System. To shedsome light on the below mentioned queries, here is a brief summary of how users interact with the system. The two maincomponents in the system are a Issues and Followups. An Issue is created by our technical support staff when some formof action needs to be taken in order to resolve a problem. FollowUps are entries entered against an issue depicting thesequence of events taken to resolve the issue. There are about 15,000 Issues in the system at present and about 95,000FollowUps in the system. As we need the system to be very responsive, each query should ideally run in under 1 second.<br/><br />A lot of the reports our technical officers submit to us include a listing of all actioned issues for agiven day along with the last modified followup of each said issue. With the number of rows in our database increasing ata high rate, these queries are starting to run too slowly. <br /><br />Here is a condensed version of the two tables:<br/><br />Issues:<br />=================================<br />id - integer <br />dt_created - timestamp<br/>dt_modified - timestamp<br />t_title - varchar<br />t_description - varchar<br /><br />FollowUps:<br /> =================================<br/> id - integer <br /> dt_created - timestamp<br /> dt_modified - timestamp<br/> t_description - varchar<br />n_issue - foregin key to issues<br /><br />We have identified that the slownessin our queries is trying to return the lastest followup for each actioned issue that day. Without further ado hereare two variations I have tried within the system (neither of which are making the cut):<br /><br />V1 (correlated subquery- Very bad performance)<br /><br /> (SELECT<br /> fu.*<br /> FROM<br /> manage_followup fu,<br /> manage_issue i<br /> WHERE<br /> <a href="http://i.id" target="_blank">i.id</a> = fu.n_issue<br /> AND<br /> <a href="http://fu.id" target="_blank">fu.id</a> = (SELECT<br /> id<br /> FROM<br /> manage_followup<br /> WHERE<br /> n_issue = <a href="http://i.id" target="_blank">i.id</a><br /> ORDER BY<br/> dt_modified DESC<br /> LIMIT 1)) AS latestfu,<br /><br />V2 (Using GroupBy, "max" aggregate function and distinct- better performance, but still bad because of distinct) <br /><br /><br />SELECTDISTINCT ON (fu.n_issue)<br /> <a href="http://fu.id" target="_blank">fu.id</a>,<br/> fu.dt_created,<br /> fu.dt_modified,<br/> fu.t_description,<br /> fu.n_issue asissue_id<br /> FROM<br /> manage_followup fu,<br /> (SELECT<br /> n_issue,<br /> max(dt_modified) as dt_modified<br /> FROM<br /> manage_followup<br /> GROUP BY<br /> n_issue) as max_modified<br /> WHERE<br /> max_modified.n_issue = fu.n_issue<br /> AND<br /> fu.dt_modified = max_modified.dt_modified) AS latestfu ON (latestfu.issue_id = <a href="http://i.id"target="_blank">i.id</a>),<br /><br />We must use distinct here as we sometimes use batch scripts to enterfollowups, which will give them all similar, if not equal, modification dates. We also can't use followup ids as anindicator of the latest followup because users of the system can retrospectively go back and change older followups.<br/><br />I was hoping some one could provide a solution that does not require a corrolated subquery or make useof the distinct keyword. Any help would be much appreciated. <br /><br />Kind regards,<br /><font color="#888888">Sebastian<br /><br /><br /><br /><br /><br /></font></blockquote></div>