Equivalent queries and the planner - Mailing list pgsql-general

From John D. Burger
Subject Equivalent queries and the planner
Date
Msg-id d65fded7d3291a9437933573fcc1d347@mitre.org
Whole thread Raw
Responses Re: Equivalent queries and the planner
Re: Equivalent queries and the planner
List pgsql-general
Hi -

I have some questions about query equivalence.  The first query below
(actually a slightly more complicated one) took much longer to run than
I expected (actually, I killed it after it ran all night).  The second
one took a few minutes.

I believe these queries are exactly equivalent, but I presume the
planner doesn't know that.  (I suppose another explanation is that both
plans are considered for both queries, but the cost estimates come out
differently for some reason.)  I've heard some allusions to "the
rewriter", and thought it was responsible for handling (some of) these
kinds of equivalencies.  I just looked at some of the documentation,
though, which makes it seem like the rewriter just handles user-defined
rules and views.

In general, there are lots of ways to express the same abstract
information need in SQL, and I assumed that there were some set of
(probably incomplete) equivalencies encoded somewhere.  Is this so?
Can someone point me to any relevant documentation or discussion?

Thanks.

(By the way, gazPlaces.gazPlaceID below is a primary key, so should be
unique - I think that's required for these to be equivalent.  I tested
the two queries on small data sets, and they do indeed return the same
results.)

- John D. Burger
   MITRE


explain select gazPlaceID from gazPlaces
    where gazPlaceID not in (select gazPlaceID from gazContainers);
        QUERY PLAN
------------------------------------------------------------------------
--------
  Seq Scan on gazplaces  (cost=0.00..528652149268.95 rows=3278748
width=4)
    Filter: (NOT (subplan))
    SubPlan
      ->  Seq Scan on gazcontainers  (cost=0.00..138723.75 rows=9004875
width=4)


explain select gazPlaceID from gazPlaces
    except select gazPlaceID from gazContainers;
        QUERY PLAN
------------------------------------------------------------------------
------------------------
  SetOp Except  (cost=2882572.91..2960384.76 rows=1556237 width=4)
    ->  Sort  (cost=2882572.91..2921478.84 rows=15562371 width=4)
          Sort Key: gazplaceid
          ->  Append  (cost=0.00..419616.42 rows=15562371 width=4)
                ->  Subquery Scan "*SELECT* 1"  (cost=0.00..190843.92
rows=6557496 width=4)
                      ->  Seq Scan on gazplaces  (cost=0.00..125268.96
rows=6557496 width=4)
                ->  Subquery Scan "*SELECT* 2"  (cost=0.00..228772.50
rows=9004875 width=4)
                      ->  Seq Scan on gazcontainers
(cost=0.00..138723.75 rows=9004875 width=4)


pgsql-general by date:

Previous
From: Devrim GUNDUZ
Date:
Subject: New sets of PostgreSQL RPMS are available for download : 7.3.10, 8.0.4, 8.1beta3
Next
From: Alex Turner
Date:
Subject: Re: On "multi-master"