Query optimizing - Mailing list pgsql-sql
From | Sebastian Ritter |
---|---|
Subject | Query optimizing |
Date | |
Msg-id | 99b656cb0811100243v499884f9ra457233c5a168521@mail.gmail.com Whole thread Raw |
Responses |
Re: Query optimizing
Re: Query optimizing |
List | pgsql-sql |
Hi all,<br /><br />I was hoping to receive some advise on a slow running query in our business' Issue Tracking System. Toshed some light on the below mentioned queries, here is a brief summary of how users interact with the system. The twomain components in the system are a Issues and Followups. An Issue is created by our technical support staff when someform of action needs to be taken in order to resolve a problem. FollowUps are entries entered against an issue depictingthe sequence of events taken to resolve the issue. There are about 15,000 Issues in the system at present and about95,000 FollowUps in the system. As we need the system to be very responsive, each query should ideally run in under1 second.<br /><br />A lot of the reports our technical officers submit to us include a listing of all actioned issuesfor a given day along with the last modified followup of each said issue. With the number of rows in our database increasingat a high rate, these queries are starting to run too slowly. <br /><br />Here is a condensed version of the twotables:<br /><br />Issues:<br />=================================<br />id - integer <br />dt_created- timestamp<br />dt_modified - timestamp<br />t_title - varchar<br />t_description - varchar<br /><br />FollowUps:<br/> =================================<br /> id - integer <br /> dt_created - timestamp<br/> dt_modified - timestamp<br /> t_description - varchar<br />n_issue - foregin key to issues<br /><br />We haveidentified that the slowness in our queries is trying to return the lastest followup for each actioned issue that day.Without further ado here are two variations I have tried within the system (neither of which are making the cut):<br/><br />V1 (correlated subquery - Very bad performance)<br /><br /> (SELECT<br /> fu.*<br /> FROM<br /> manage_followup fu,<br /> manage_issue i<br /> WHERE<br /> <a href="http://i.id">i.id</a> = fu.n_issue<br/> AND<br /> <a href="http://fu.id">fu.id</a> = (SELECT<br /> id<br/> FROM<br /> manage_followup<br /> WHERE<br /> n_issue = <a href="http://i.id">i.id</a><br /> ORDER BY<br /> dt_modified DESC<br /> LIMIT 1)) AS latestfu,<br /><br />V2 (Using Group By,"max" aggregate function and distinct- better performance, but still bad because of distinct) <br /><br /><br />SELECTDISTINCT ON (fu.n_issue)<br /> <a href="http://fu.id">fu.id</a>,<br /> fu.dt_created,<br /> fu.dt_modified,<br /> fu.t_description,<br /> fu.n_issue as issue_id<br /> FROM<br /> manage_followup fu,<br /> (SELECT<br /> n_issue,<br /> max(dt_modified) as dt_modified<br /> FROM<br /> manage_followup<br /> GROUP BY<br /> n_issue) as max_modified<br /> WHERE<br /> max_modified.n_issue = fu.n_issue<br /> AND<br /> fu.dt_modified = max_modified.dt_modified) AS latestfu ON (latestfu.issue_id = <a href="http://i.id">i.id</a>),<br/><br />We must use distinct here as we sometimes use batch scripts to enter followups, whichwill give them all similar, if not equal, modification dates. We also can't use followup ids as an indicator of thelatest followup because users of the system can retrospectively go back and change older followups.<br /><br />I was hopingsome one could provide a solution that does not require a corrolated subquery or make use of the distinct keyword.Any help would be much appreciated. <br /><br />Kind regards,<br />Sebastian <br /><br /><br /><br /><br /><br />