Re: Highly obscure and erratic - Mailing list pgsql-general
From | Varun Kacholia |
---|---|
Subject | Re: Highly obscure and erratic |
Date | |
Msg-id | 20020621032900.A15070@voxel.cse.iitb.ac.in Whole thread Raw |
In response to | Re: Highly obscure and erratic (Shaun Thomas <sthomas@townnews.com>) |
List | pgsql-general |
hi, > SELECT * > FROM dbmedia d, wdmedia w > WHERE w.word = 'word1' AND d.id=w.id > INTERSECT > SELECT * > FROM dbmedia d, wdmedia w > WHERE w.word = 'word2' AND d.id=w.id > INTERSECT > ... well this does not help... it results in a seq scan only. suryadb=# explain (select * from dbmedia d,wdmedia w where w.word='whatever' and d.id=w.id) INTERSECT (select * from dbmediad,wdmedia w where w.word='whatever' and d.id=w.id) ; NOTICE: QUERY PLAN: SetOp Intersect (cost=25519.49..25674.00 rows=515 width=92) -> Sort (cost=25519.49..25519.49 rows=5151 width=92) -> Append (cost=5033.63..25201.94 rows=5151 width=92) -> Subquery Scan *SELECT* 1 (cost=5033.63..12600.97 rows=2575 width=92) -> Hash Join (cost=5033.63..12600.97 rows=2575 width=92) -> Seq Scan on dbmedia d (cost=0.00..3499.83 rows=100883 width=76) ^^^^^^^^^^ -> Hash (cost=5027.19..5027.19 rows=2575 width=16) -> Index Scan using wdkmedia on wdmedia w (cost=0.00..5027.19 rows=2575 width=16) -> Subquery Scan *SELECT* 2 (cost=5033.63..12600.97 rows=2575 width=92) -> Hash Join (cost=5033.63..12600.97 rows=2575 width=92) -> Seq Scan on dbmedia d (cost=0.00..3499.83 rows=100883 width=76) ^^^^^^^^^^^^^^^^^^^^^^ -> Hash (cost=5027.19..5027.19 rows=2575 width=16) -> Index Scan using wdkmedia on wdmedia w (cost=0.00..5027.19 rows=2575 width=16) instead the self-join is better...atleast in the case of postgresql :p anyways.. thanks for the reply! > I think you get the picture from this. If he didn't need INTERSECT to > emulate AND for his word matches, he could just use OR, and do this: > > SELECT * > FROM dbmedia d, wdmedia w > WHERE w.word IN ('word1', 'word2', 'etc') AND d.id=w.id; > > My guess is the OR version is faster, but he wants to restrict the data, > not add to it. Unfortunate. ^_^ > > > and avoid IN, INTERSECT and UNION at all costs. > > Actually, if INTERSECT and UNION are done properly (which I'm guessing > they are) at most, the query has to be run once for each distinct query > between the INTERSECT/UNION clauses. On a fast/small query, an upper > bound of doubling execution time isn't too bad. Mathematical matrix > intersections and unions are pretty fast/easy, so that doesn't add much > overhead at all. > > It's IN you have to avoid. The way postgres does it is just insane, and > you'll suffer greatly upon using it. > > -- > +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+ > | Shaun M. Thomas INN Database Administrator | > | Phone: (309) 743-0812 Fax : (309) 743-0830 | > | Email: sthomas@townnews.com AIM : trifthen | > | Web : www.townnews.com | > | | > | "Most of our lives are about proving something, either to | > | ourselves or to someone else." | > | -- Anonymous | > +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+ > > -- ------ Varun Why do the people who know the least know it the loudest?
pgsql-general by date: