[moving to pgsql-sql]
On Thu, 11 Jul 2002 17:22:14 +0200, "Luis Alberto Amigo Navarro"
<lamigo@atc.unican.es> wrote:
>I can't improve performance on this query:
>
>SELECT
> supplier.name,
> supplier.address
>FROM
> supplier,
> nation
>WHERE
> supplier.suppkey IN(
> SELECT
> partsupp.suppkey
> FROM
> partsupp
> WHERE
> partsupp.partkey IN(
> SELECT
> part.partkey
> FROM
> part
> WHERE
> part.name like 'forest%'
> )
> AND partsupp.availqty>(
> SELECT
> 0.5*(sum(lineitem.quantity)::FLOAT)
> FROM
> lineitem
> WHERE
> lineitem.partkey=partsupp.partkey
> AND lineitem.suppkey=partsupp.partkey ^^^^^^^
suppkey???
> AND lineitem.shipdate>=('1994-01-01')::DATE
> AND lineitem.shipdate<(('1994-01-01')::DATE+('1 year')::INTERVAL)::DATE
> )
> )
> AND supplier.nationkey=nation.nationkey
> AND nation.name='CANADA'
>ORDER BY
> supplier.name;
Luis,
rules of thumb: "Avoid subselects; use joins!" and "If you have to use
subselects, avoid IN, use EXISTS!"
Let's try. If partkey is unique in part, then
| FROM partsupp
| WHERE partsupp.partkey IN (SELECT part.partkey
can be replaced by FROM partsupp ps, part p WHERE ps.partkey = p.partkey
or partsupp ps INNER JOIN part p ON (ps.partkey = p.partkey AND p.name LIKE '...')
When we ignore "part" for now, your subselect boils down to
| SELECT partsupp.suppkey
| FROM partsupp
| WHERE partsupp.availqty > (
| SELECT 0.5*(sum(lineitem.quantity)::FLOAT)
| FROM lineitem
| WHERE lineitem.partkey=partsupp.partkey
| AND lineitem.suppkey=partsupp.suppkey
| AND lineitem.shipdate BETWEEN ... AND ...
| )
which can be rewritten to (untested)
SELECT ps.suppkey FROM partsupp ps, lineitem li WHERE li.partkey=ps.partkey AND li.suppkey=ps.suppkey AND
lineitem.shipdateBETWEEN ... AND ... GROUP BY ps.partkey, ps.suppkey HAVING min(ps.availqty) >
0.5*(sum(lineitem.quantity)::FLOAT) ^^^ As all ps.availqty are equal in one group, you can as well
use max() or avg().
Now we have left only one IN:
| WHERE supplier.suppkey IN (
| SELECT partsupp.suppkey FROM partsupp WHERE <condition> )
Being to lazy to find out, if this can be rewritten to a join, let`s
apply rule 2 here:
WHERE EXISTS ( SELECT ... FROM partsupp ps WHERE supplier.suppkey = ps.suppkey AND <condition> )
HTH, but use with a grain of salt ...
>Sort (cost=2777810917708.17..2777810917708.17 rows=200 width=81) ^^^^^^^^^^^^^^^^
BTW, how many years are these? :-)
ServusManfred