Thread: replacing a subquery with an outer join?

replacing a subquery with an outer join?

From
Chris Curvey
Date:
in an earlier thread, the following query was submitted:

SELECT COUNT(*) FROM q
      WHERE NOT EXISTS (SELECT 1
                          FROM t AS t
                         WHERE t.mid = q.mid);

and as part of his answer, David Johnson responded


3. Also, you are using a correlated sub-query instead of a LEFT OUTER JOIN

This interests me.  How would you rewrite the correlated subquery above using the outer join? 

--
e-Mail is the equivalent of a postcard written in pencil.  This message may not have been sent by me, or intended for you.  It may have been read or even modified while in transit.  e-Mail disclaimers have the same force in law as a note passed in study hall.  If your corporate attorney says that you need an disclaimer in your signature, you need a new corporate attorney.

Re: replacing a subquery with an outer join?

From
Tom Lane
Date:
Chris Curvey <chris@chriscurvey.com> writes:
> in an earlier thread, the following query was submitted:
>> SELECT COUNT(*) FROM q
>> WHERE NOT EXISTS (SELECT 1
>> FROM t AS t
>> WHERE t.mid = q.mid);

> and as part of his answer, David Johnson responded
> 3. Also, you are using a correlated sub-query instead of a LEFT OUTER JOIN

> This interests me.  How would you rewrite the correlated subquery above
> using the outer join?

That NOT EXISTS idiom is a way of writing an "anti join", ie, select all
the q rows that lack joining rows in t.  In SQL you can get the same
effect with

SELECT COUNT(*) FROM q LEFT JOIN t ON (t.mid = q.mid)
WHERE t.mid IS NULL;

as long as you assume the "=" operator is strict (that is, it can't
succeed for any null input).  Then, if t.mid appears as null in some
row produced by the left join, you know that that wasn't a *real*
matching row for which the ON condition succeeded.  Instead, it has
to be a null-extended row added by the LEFT JOIN operator for a q row
that had no matches.  Therefore, this query counts all and only the
unmatched q rows, which is the same result as for the NOT EXISTS
formulation.

Since about 8.4, the Postgres planner recognizes either of these
idioms as anti-joins and will produce the same plan either way.
David's advice is applicable to pre-8.4 versions, in which the
plan produced for the NOT EXISTS formulation was often worse than
the one for the LEFT JOIN way.

            regards, tom lane

Re: replacing a subquery with an outer join?

From
Gavin Flower
Date:
On 22/07/11 13:20, Chris Curvey wrote:
in an earlier thread, the following query was submitted:

SELECT COUNT(*) FROM q
      WHERE NOT EXISTS (SELECT 1
                          FROM t AS t
                         WHERE t.mid = q.mid);

and as part of his answer, David Johnson responded


3. Also, you are using a correlated sub-query instead of a LEFT OUTER JOIN

This interests me.  How would you rewrite the correlated subquery above using the outer join? 

--
e-Mail is the equivalent of a postcard written in pencil.  This message may not have been sent by me, or intended for you.  It may have been read or even modified while in transit.  e-Mail disclaimers have the same force in law as a note passed in study hall.  If your corporate attorney says that you need an disclaimer in your signature, you need a new corporate attorney.


DROP TABLE IF EXISTS q;
DROP TABLE IF EXISTS t;


CREATE TABLE q
(
    mid int PRIMARY KEY,
    qnote text
);

CREATE TABLE t
(
    mid int PRIMARY KEY,
    tnote text
);

INSERT INTO q (mid, qnote) VALUES
    (1, 'first'),
    (2, 'second'),
    (4, 'fourth');

INSERT INTO t (mid, tnote) VALUES
    (1, 'first'),
    (3, 'third');
   
SELECT
    COUNT(*)
FROM
    q LEFT OUTER JOIN t ON (q.mid = t.mid)
WHERE
    t.mid IS NULL;