Re: Another sub-select problem... - Mailing list pgsql-sql

From Stephan Szabo
Subject Re: Another sub-select problem...
Date
Msg-id 20030522184331.O51260-100000@megazone23.bigpanda.com
Whole thread Raw
In response to Another sub-select problem...  (Kevin Ready <kevin@meridianis.com>)
List pgsql-sql
On 20 May 2003, Kevin Ready wrote:

> (Originally addressed to Tom Lane)
> I have spent the last 3 months developing an application using
> PostgreSQL. This application analyzes raw electronic design/timing data
> which can be in the range of 200 million fields + in size.
>
> I am currently using PostgreSQL 7.3.2.
>
> Performance on a 4 proc RISC AIX 6GB RAM machine is fair. Performance on
> a 2 proc 2.0 GHZ Xeon RedHat 9.0 3 GB RAM machine is better.PG_SETTINGS
> are all defaults except for logging.
>
> One query type is absolutely not working, so I am at a stopping point
> and would like to ask for some advice-- I have been through the
> PostgreSQL mailing list archives and Google search results many times
> over the past 6 weeks looking for an answer.
>
> I have a required query that looks like this:
>
> ---BAD QUERY---
> LOG:  query: SELECT * FROM row WHERE meta<>1 AND ((slew_rise < 10 AND
> slew_rise <> -999999) OR (slew_fall < 10 AND slew_fall <> -999999))
> AND blockID IN (SELECT blockid FROM block WHERE parent_component_classid
> IN(8,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,39,40,41,42,43,44,45,
> 46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,
> 71,72,73,75,76,77,78,79,80,81,82,83,84,85,86,88,117,133,143,145,146,178,188,
> 193,197))   ORDER BY slew_rise  LIMIT 500 OFFSET 0
> LOG:  duration: 3439.646507 sec
>
> ---GOOD QUERY FOR COMPARISON---
> LOG:  query: SELECT * FROM row WHERE meta<>1 AND ((slew_rise < 10 AND
> slew_rise <> -999999) OR (slew_fall < 10 AND slew_fall <> -999999))
> ORDER BY slew_rise  LIMIT 500 OFFSET 0
> LOG:  duration: 2.269856 sec
>
> The "IN(id_list)" portion is derived from a PHP-based recursive query
> and is passed into postgres based on user input--always a different list
> of ID's.  This seems to rule out a "JOIN" as an alternative query type
> which has been the general suggestion for avoiding the infamous "IN"

Actually, it doesn't entirely afaics (you could get rid of the blockid in
(subselect) by joining the subselect as a subselect in from to the main
query.  But without explain analyze we can't tell if that's really the
bottleneck.

> Question 1) Will this type of query be handled well in 7.4?

Possibly, see above.

> Question 2) If so,is there a dev version of 7.4 I can work with?

Yes, I think anoncvs is back, so you can get the CVS copy.

> Question 3) Can you suggest an alternative strategy for the query? (I
> have not included my table structures as I do not want to bother you too
> much.)

The first question would be whether
select * from row, (select distinct blockid from block ... ) as a wheremeta<>1 ... slew_fall<>-999999)) and
row.blockid=a.blockidorder byslew_rise limit 500 offset 0
 

gives the same results and is faster.



pgsql-sql by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: AS?
Next
From: Stephan Szabo
Date:
Subject: Re: Referencing a column from another table in a different