Thread: Re: Does PostgreSQL support EXISTS?

Re: Does PostgreSQL support EXISTS?

From

Nils Zonneveld

Date:

12 June 2001, 14:37:18


Raymond Chui wrote:
>
> The Subject says its all.
>

Yes 'exists' works (though I never understood the advantage to the 'in' operator).

Nils

Re: Re: Does PostgreSQL support EXISTS?

From

Martijn van Oosterhout

Date:

12 June 2001, 22:23:39

On Wed, Jun 06, 2001 at 10:44:04PM +0200, Nils Zonneveld wrote:
>
>
> Raymond Chui wrote:
> >
> > The Subject says its all.
> >
>
> Yes 'exists' works (though I never understood the advantage to the 'in' operator).

On postgres at least, exists is faster than in.

They are equivalent though.

select x from a where v in (select v from b)
select x from a where exists (select 1 from b where a.v = b.v)
select x from a, b where a.v = b.v

are all the same. Postgres doesn't quite know that yet though.
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

Re: Re: Does PostgreSQL support EXISTS?

From

KuroiNeko

Date:

13 June 2001, 00:33:11

> Yes 'exists'  works (though I never understood the  advantage to the 'in'
operator).

 Each  has its  advantages and  weak points.  Eg, in  some DBMS,  number of
entries of `in ( A1, ... AN )' has a pretty tight limit. Also, in PGSQL 7.0
having even fewer  entries may cause `Tuple too big'  error. Solution is to
replace explicit enumeration with comparisons. And so on.
 `exists'  is more  `relational pure'  on one  hand, but  is probably  more
resource consuming.


--

 ������������������

Re: Re: Does PostgreSQL support EXISTS?

From

Michael Meskes

Date:

13 June 2001, 03:36:29

On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:
> select x from a where v in (select v from b)
> select x from a where exists (select 1 from b where a.v = b.v)

The latter should be faster than the former on every relational database
system.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

Re: Re: Does PostgreSQL support EXISTS?

From

Martijn van Oosterhout

Date:

13 June 2001, 05:50:20

On Wed, Jun 13, 2001 at 09:21:02AM +0200, Michael Meskes wrote:
> On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:
> > select x from a where v in (select v from b)
> > select x from a where exists (select 1 from b where a.v = b.v)
>
> The latter should be faster than the former on every relational database
> system.

Huh? Since they do the same thing should they (in theory) run in the same
time.

Now, I can imagine most DBMSs would optimes the latter better than the
forward, but that doesn't change the theory. In theory someone should be
able to program postgres' rule rewrite to rewrite the former to the latter
and then they would execute identically.
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

RE: Re: Does PostgreSQL support EXISTS?

From

"Erick Papadakis"

Date:

13 June 2001, 08:14:03

    | On postgres at least, exists is faster than in.
    |
    | They are equivalent though.
    |
    | select x from a where v in (select v from b)
    | select x from a where exists (select 1 from b where a.v = b.v)
    | select x from a, b where a.v = b.v
    |
    | are all the same. Postgres doesn't quite know that yet though.


If you do an EXPLAIN on Oracle with the first 2 queries above, you will
notice that the second one is faster in Oracle too (i.e., EXISTS is faster
than IN).

Cheers
Shanx


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.252 / Virus Database: 125 - Release Date: 09-May-01


_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

Re: Re: Does PostgreSQL support EXISTS?

From

Tom Lane

Date:

13 June 2001, 10:20:32

Martijn van Oosterhout <kleptog@svana.org> writes:
> On postgres at least, exists is faster than in.
> They are equivalent though.

Not really.

For one thing, IN can return a NULL (don't know) result; EXISTS cannot.

regression=# select * from foo;
 f1
----
  1

(2 rows)

regression=# select 2 in (select f1 from foo);
 ?column?
----------

(1 row)

regression=# select exists (select 1 from foo where f1 = 2);
 ?column?
----------
 f
(1 row)

Yes, this behavior is spec-compliant.  Think: we don't know what the NULL
represents, therefore we don't know whether 2 is in the column or not,
therefore IN should return NULL.

            regards, tom lane

Re: Re: Does PostgreSQL support EXISTS?

From

Bruce Momjian

Date:

13 June 2001, 10:25:58

> On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:
> > select x from a where v in (select v from b)
> > select x from a where exists (select 1 from b where a.v = b.v)
>
> The latter should be faster than the former on every relational database
> system.

That surprises me because the subquery is a correlated subquery which
are usually slower on other databases that normal subqueries.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Re: Does PostgreSQL support EXISTS?

From

Bruce Momjian

Date:

13 June 2001, 10:48:31

>     | On postgres at least, exists is faster than in.
>     |
>     | They are equivalent though.
>     |
>     | select x from a where v in (select v from b)
>     | select x from a where exists (select 1 from b where a.v = b.v)
>     | select x from a, b where a.v = b.v
>     |
>     | are all the same. Postgres doesn't quite know that yet though.
>
>
> If you do an EXPLAIN on Oracle with the first 2 queries above, you will
> notice that the second one is faster in Oracle too (i.e., EXISTS is faster
> than IN).

So it is confirmed.  Interesting.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Re: Does PostgreSQL support EXISTS?

From

Tom Lane

Date:

13 June 2001, 12:10:28

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:
> select x from a where v in (select v from b)
> select x from a where exists (select 1 from b where a.v = b.v)
>>
>> The latter should be faster than the former on every relational database
>> system.

> That surprises me because the subquery is a correlated subquery which
> are usually slower on other databases that normal subqueries.

However, the second form is easily able to make use of an index on b.v,
whereas the first form is impossible to optimize unless you are able to
rewrite it into some weird form of JOIN.

BTW, I just realized that the "weird form of JOIN" would have to be
much stranger than I previously thought.  The result of IN depends not
only on whether the subselect's output has any matches to the current
test value, but also on whether the subselect's output has any NULLs.
So it's not simply a matter of doing a join with a special rule about
producing no more than one output tuple per outer-query tuple.  How
would you check for the NULLs?

            regards, tom lane

Re: Re: Does PostgreSQL support EXISTS?

From

Bruce Momjian

Date:

13 June 2001, 12:14:16

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> On Wed, Jun 13, 2001 at 12:23:15PM +1000, Martijn van Oosterhout wrote:
> > select x from a where v in (select v from b)
> > select x from a where exists (select 1 from b where a.v = b.v)
> >>
> >> The latter should be faster than the former on every relational database
> >> system.
>
> > That surprises me because the subquery is a correlated subquery which
> > are usually slower on other databases that normal subqueries.
>
> However, the second form is easily able to make use of an index on b.v,
> whereas the first form is impossible to optimize unless you are able to
> rewrite it into some weird form of JOIN.

Assuming the index is there, yes, but again, do we use the index or run
the query.  This has the same dynamics as index/heap scan and is
probably more complicated to figure out accurately.

> BTW, I just realized that the "weird form of JOIN" would have to be
> much stranger than I previously thought.  The result of IN depends not
> only on whether the subselect's output has any matches to the current
> test value, but also on whether the subselect's output has any NULLs.
> So it's not simply a matter of doing a join with a special rule about
> producing no more than one output tuple per outer-query tuple.  How
> would you check for the NULLs?

I thought NOT IN was the only one that was concerned about any NULL?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026