Re: improving performance of UNION and ORDER BY - Mailing list pgsql-general
From | Chris Gamache |
---|---|
Subject | Re: improving performance of UNION and ORDER BY |
Date | |
Msg-id | 20020307135628.59688.qmail@web13804.mail.yahoo.com Whole thread Raw |
In response to | improving performance of UNION and ORDER BY (Chris Gamache <cgg007@yahoo.com>) |
List | pgsql-general |
I think I've found an acceptable compromise. The optimizer will use the index if I limit each of the subqueries to no more than 6200 rows. I think a composite total ~12000 is enough data, understanding that the users of this query would be silly to use it for paging through more than a week's data. I think it'd be silly to go beyond 24 hours... Anyway, if they're looking for something more specific, I take out the limits to allow it search through the entire database, and unless they're not using specific enough search criteria the query will still move rather quickly. A decent compromise. Thanks for helping me nail this one down! CG --- Stephan Szabo <sszabo@megazone23.bigpanda.com> wrote: > On Wed, 6 Mar 2002, Chris Gamache wrote: > > > UNION ALL was an excellent idea! It didn't cut much time off, but at least > no > > resources are devoted to eliminating the nonexistant duplicate rows. > > > > I've had days to think about this. It seems as though the ORDER BY part of > the > > first query is the culprit. When I run this query by itself, I can see that > it > > would comprise the bulk of the UNION query time. > > > > select > > a.username as "User", > > a.trans_date as "Date", > > tl.longtype as "Type", > > a.trans_data as "Query Data", > > a.trans_charge as "Charged", > > a.user_reference_id as "Reference ID" > > from a_trans_log a > > join addtypelong tl on a.trans_type = tl.shorttype > > where a.trans_date >= '12/31/01'::TIMESTAMP > > order by a.trans_date desc, a.trans_data limit 20; > > > > By removing the ORDER BY a.trans_data, it cut the query down to the "almost > > instant" level... EXPLAIN shows me that it uses the indeces! I guess I need > to > > drop that part of the ORDER BY, or make an index for it to use... Bah. > > Yeah, although since you're doing different kinds of scans on the two > columns (desc and asc), I'm not sure what'd be necessary to get the index > used. Normally you'd just make an index on trans_date,trans_data, but > that won't work here. I think Tom Lane may have discussed a way to get > the index scan for that case sometime in the past couple of months on the > mailing list. > > > Alas... > > Unless someone knows different, I don't believe that I can use the LIMIT > > statement in each of the subqueries. The app needs to page through the > > UNIONized table. As the app pages through LIMIT 20,20 LIMIT 20,40 LIMIT > 20,60, > > etc. It needs to look at the whole sorted UNIONized table. (am I making ANY > > sense?) > > That makes sense. You could cheat a little if you knew how large the > results sets for the parts were if you're using union all since I believe > it will do them in order and just append the sets, but that doesn't seem > like a good idea in general. > > > Even by dropping the order on column 4, it still takes 6 seconds to > assemble > > the data, sort it and limit it... Could I create a cross-table index > > specifically for this query? I doubt its a) possible, b) (even if possible) > a > > good idea. > > I don't think so. > > One question is whether some of this could be done by keeping another > table around that you do your limits through. This would at least cut > down some of the cost (or for that matter if you could make a table with > this using triggers and such). > > __________________________________________________ Do You Yahoo!? Try FREE Yahoo! Mail - the world's greatest free email! http://mail.yahoo.com/
pgsql-general by date: