Re: Help analyzing 7.2.4 EXPLAIN - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Help analyzing 7.2.4 EXPLAIN |
Date | |
Msg-id | 11860.1049935883@sss.pgh.pa.us Whole thread Raw |
In response to | Help analyzing 7.2.4 EXPLAIN (Josh Berkus <josh@agliodbs.com>) |
Responses |
Re: Help analyzing 7.2.4 EXPLAIN
Re: Help analyzing 7.2.4 EXPLAIN |
List | pgsql-performance |
Josh Berkus <josh@agliodbs.com> writes: > Becuase it's 7.2.4, it's a little hard to tell exactly which part of the > query is taking up 90% of the processing time. Keep in mind that in the subqueries, the "actual time" shown is the time per iteration --- you should multiply by the "loops" value to get an accurate idea of where the time is going. With that in mind, it's real clear that the first subplan is eating the bulk of the time. I think you are probably right that execution of the if_addendee_conflict() function is the main cost. But given this subquery that's not too surprising: > (SELECT if_addendee_conflict(users.user_id, 3272, '2003-04-15 10:00', '1 days'::INTERVAL, > events.event_id, events.event_date, events.duration, event_cats.status, '30 minutes') as cflt > FROM events, event_types, event_cats, event_days > WHERE events.event_id = event_days.event_id > and events.etype_id = event_types.etype_id > AND event_types.ecat_id = event_cats.ecat_id > AND event_days.event_day > BETWEEN '2003-04-15' AND '2003-04-16 10:00' > ORDER BY cflt LIMIT 1) AS conflict What you have here is a subquery that will execute if_addendee_conflict() for *each* row of the events table; then throw away all but one of the results. And then do that over again for each user row. It looks to me like if_addendee_conflict() is being called nearly 4000 times in this query. No wonder it's slow. The first thing that pops to mind is whether you really need the *first* conflict, or would it be enough to find any old conflict? If you could dispense with the ORDER BY then at least some evaluations of if_addendee_conflict() could be saved. Realistically, though, I think you're going to have to refactor the work to make this perform reasonably. How much of what if_addendee_conflict() does is actually dependent on the user_id? Could you separate out tests that depend only on the event, and do that in a separate pass that is done only once per event, instead once per event*user? If you could reduce the number of events that need to be examined for any given user, you could get somewhere. Also, I don't see where this query checks to see if the user is actually interested in attending the event. Is that one of the things if_addendee_conflict checks? If so, you should pull it out and make it a join condition. You're essentially forcing the stupidest possible join algorithm by burying that condition inside a user-defined function. It would win to check that sooner instead of later, since presumably the set of interesting events for any one user is a lot smaller than the set of all events. regards, tom lane
pgsql-performance by date: