Re: Highly obscure and erratic - Mailing list pgsql-general

From Varun Kacholia
Subject Re: Highly obscure and erratic
Date
Msg-id 20020621032900.A15070@voxel.cse.iitb.ac.in
Whole thread Raw
In response to Re: Highly obscure and erratic  (Shaun Thomas <sthomas@townnews.com>)
List pgsql-general
hi,

> SELECT *
>   FROM dbmedia d, wdmedia w
>  WHERE w.word = 'word1' AND d.id=w.id
> INTERSECT
> SELECT *
>   FROM dbmedia d, wdmedia w
>  WHERE w.word = 'word2' AND d.id=w.id
> INTERSECT
> ...

 well this does not help...
 it results in a seq scan only.

 suryadb=# explain (select * from dbmedia d,wdmedia w where w.word='whatever' and d.id=w.id) INTERSECT (select * from
dbmediad,wdmedia w where w.word='whatever' and d.id=w.id) ; 
 NOTICE:  QUERY PLAN:

 SetOp Intersect  (cost=25519.49..25674.00 rows=515 width=92)
     ->  Sort  (cost=25519.49..25519.49 rows=5151 width=92)
     ->  Append  (cost=5033.63..25201.94 rows=5151 width=92)
     ->  Subquery Scan *SELECT* 1  (cost=5033.63..12600.97 rows=2575 width=92)
     ->  Hash Join  (cost=5033.63..12600.97 rows=2575 width=92)
     ->  Seq Scan on dbmedia d  (cost=0.00..3499.83 rows=100883 width=76)
         ^^^^^^^^^^
     ->  Hash  (cost=5027.19..5027.19 rows=2575 width=16)
     ->  Index Scan using wdkmedia on wdmedia w  (cost=0.00..5027.19 rows=2575 width=16)
     ->  Subquery Scan *SELECT* 2  (cost=5033.63..12600.97 rows=2575 width=92)
     ->  Hash Join  (cost=5033.63..12600.97 rows=2575 width=92)
     ->  Seq Scan on dbmedia d  (cost=0.00..3499.83 rows=100883 width=76)
         ^^^^^^^^^^^^^^^^^^^^^^
     ->  Hash  (cost=5027.19..5027.19 rows=2575 width=16)
->  Index Scan using wdkmedia on wdmedia w  (cost=0.00..5027.19 rows=2575 width=16)

instead the self-join is better...atleast in the case of postgresql :p
anyways..
thanks for the reply!

> I think you get the picture from this.  If he didn't need INTERSECT to
> emulate AND for his word matches, he could just use OR, and do this:
>
> SELECT *
>   FROM dbmedia d, wdmedia w
>  WHERE w.word IN ('word1', 'word2', 'etc') AND d.id=w.id;
>
> My guess is the OR version is faster, but he wants to restrict the data,
> not add to it.  Unfortunate.  ^_^
>
> > and avoid IN, INTERSECT and UNION at all costs.
>
> Actually, if INTERSECT and UNION are done properly (which I'm guessing
> they are) at most, the query has to be run once for each distinct query
> between the INTERSECT/UNION clauses.  On a fast/small query, an upper
> bound of doubling execution time isn't too bad.  Mathematical matrix
> intersections and unions are pretty fast/easy, so that doesn't add much
> overhead at all.
>
> It's IN you have to avoid.  The way postgres does it is just insane, and
> you'll suffer greatly upon using it.
>
> --
> +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
> | Shaun M. Thomas                INN Database Administrator           |
> | Phone: (309) 743-0812          Fax  : (309) 743-0830                |
> | Email: sthomas@townnews.com    AIM  : trifthen                      |
> | Web  : www.townnews.com                                             |
> |                                                                     |
> |     "Most of our lives are about proving something, either to       |
> |      ourselves or to someone else."                                 |
> |                                           -- Anonymous              |
> +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
>
>

--
------
Varun
Why do the people who know the least know it the loudest?


pgsql-general by date:

Previous
From: David Link
Date:
Subject: help on speeding up a one table query
Next
From: Stephan Szabo
Date:
Subject: Re: foreign Key problem