Re: Another sub-select problem... - Mailing list pgsql-sql
From | Stephan Szabo |
---|---|
Subject | Re: Another sub-select problem... |
Date | |
Msg-id | 20030522184331.O51260-100000@megazone23.bigpanda.com Whole thread Raw |
In response to | Another sub-select problem... (Kevin Ready <kevin@meridianis.com>) |
List | pgsql-sql |
On 20 May 2003, Kevin Ready wrote: > (Originally addressed to Tom Lane) > I have spent the last 3 months developing an application using > PostgreSQL. This application analyzes raw electronic design/timing data > which can be in the range of 200 million fields + in size. > > I am currently using PostgreSQL 7.3.2. > > Performance on a 4 proc RISC AIX 6GB RAM machine is fair. Performance on > a 2 proc 2.0 GHZ Xeon RedHat 9.0 3 GB RAM machine is better.PG_SETTINGS > are all defaults except for logging. > > One query type is absolutely not working, so I am at a stopping point > and would like to ask for some advice-- I have been through the > PostgreSQL mailing list archives and Google search results many times > over the past 6 weeks looking for an answer. > > I have a required query that looks like this: > > ---BAD QUERY--- > LOG: query: SELECT * FROM row WHERE meta<>1 AND ((slew_rise < 10 AND > slew_rise <> -999999) OR (slew_fall < 10 AND slew_fall <> -999999)) > AND blockID IN (SELECT blockid FROM block WHERE parent_component_classid > IN(8,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,39,40,41,42,43,44,45, > 46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70, > 71,72,73,75,76,77,78,79,80,81,82,83,84,85,86,88,117,133,143,145,146,178,188, > 193,197)) ORDER BY slew_rise LIMIT 500 OFFSET 0 > LOG: duration: 3439.646507 sec > > ---GOOD QUERY FOR COMPARISON--- > LOG: query: SELECT * FROM row WHERE meta<>1 AND ((slew_rise < 10 AND > slew_rise <> -999999) OR (slew_fall < 10 AND slew_fall <> -999999)) > ORDER BY slew_rise LIMIT 500 OFFSET 0 > LOG: duration: 2.269856 sec > > The "IN(id_list)" portion is derived from a PHP-based recursive query > and is passed into postgres based on user input--always a different list > of ID's. This seems to rule out a "JOIN" as an alternative query type > which has been the general suggestion for avoiding the infamous "IN" Actually, it doesn't entirely afaics (you could get rid of the blockid in (subselect) by joining the subselect as a subselect in from to the main query. But without explain analyze we can't tell if that's really the bottleneck. > Question 1) Will this type of query be handled well in 7.4? Possibly, see above. > Question 2) If so,is there a dev version of 7.4 I can work with? Yes, I think anoncvs is back, so you can get the CVS copy. > Question 3) Can you suggest an alternative strategy for the query? (I > have not included my table structures as I do not want to bother you too > much.) The first question would be whether select * from row, (select distinct blockid from block ... ) as a wheremeta<>1 ... slew_fall<>-999999)) and row.blockid=a.blockidorder byslew_rise limit 500 offset 0 gives the same results and is faster.