Thread: optimizing away join when querying view

optimizing away join when querying view

From
Jacob Costello
Date:
Postgres doesn't seem to optimize away unnecessary joins in a view
definition when the view is queried in such a way that the join need not
be executed.  In the example below, I define two tables, foo and bar,
with a foreign key on bar referencing foo, and a view on the natural
join of the tables.  The tables are defined so that the relationship
from bar to foo is allowed to be many to one, with the column of bar
referencing foo (column a) set NOT NULL, so that there must be exactly
one foo record for every bar record.  I then EXPLAIN selecting the "b"
column from bar, through the view and from bar directly.  The tables
have been ANALYZEd but have no data.  EXPLAIN shows the join actually
occurring when selecting b from the view quux.  If I understand
correctly (maybe I don't), this is guaranteed to be exactly the same as
the selecting b directly from the bar table.  The practical import of
this comes into play when views are provided to simplify queries for end
users, and those views use joins to include related data.  If the user
enters a query that is equivalent to a query on a base table, why should
the query pay a performance penalty ?

table foo:

Column |  Type   | Modifiers
--------+---------+-----------
a      | integer | not null
Indexes:
    "foo_pkey" PRIMARY KEY, btree (a)


table bar:

Column |  Type   | Modifiers
--------+---------+-----------
a      | integer | not null
b      | integer |
Foreign-key constraints:
    "bar_a_fkey" FOREIGN KEY (a) REFERENCES foo(a)


view quux:

Column |  Type   | Modifiers
--------+---------+-----------
a      | integer |
b      | integer |
View definition:
SELECT bar.a, bar.b
   FROM bar
NATURAL JOIN foo


EXPLAINed Queries:

explain select b from bar;

                    QUERY PLAN
---------------------------------------------------
Seq Scan on bar  (cost=0.00..1.00 rows=1 width=4)
(1 row)

explain select b from quux;

                                QUERY PLAN
--------------------------------------------------------------------------
Nested Loop  (cost=0.00..5.84 rows=1 width=4)
   ->  Seq Scan on bar  (cost=0.00..1.00 rows=1 width=8)
   ->  Index Scan using foo_pkey on foo  (cost=0.00..4.82 rows=1
width=4)
         Index Cond: ("outer".a = foo.a)
(4 rows)

--
Jacob Costello <jake@suntradingllc.com>
Sun Trading, LLC



Re: optimizing away join when querying view

From
Tom Lane
Date:
Jacob Costello <jake@suntradingllc.com> writes:
> Postgres doesn't seem to optimize away unnecessary joins

There is no such thing as an unnecessary join, unless you are willing to
stake the correctness of the query on constraints that could be dropped
after the query is planned.  Until we have some infrastructure to deal
with that situation, nothing like this is going to happen.

            regards, tom lane

Re: optimizing away join when querying view

From
Stephan Szabo
Date:
On Wed, 8 Feb 2006, Jacob Costello wrote:

> Postgres doesn't seem to optimize away unnecessary joins in a view
> definition when the view is queried in such a way that the join need not
> be executed.  In the example below, I define two tables, foo and bar,
> with a foreign key on bar referencing foo, and a view on the natural
> join of the tables.  The tables are defined so that the relationship
> from bar to foo is allowed to be many to one, with the column of bar
> referencing foo (column a) set NOT NULL, so that there must be exactly
> one foo record for every bar record.  I then EXPLAIN selecting the "b"
> column from bar, through the view and from bar directly.  The tables
> have been ANALYZEd but have no data.  EXPLAIN shows the join actually
> occurring when selecting b from the view quux.  If I understand
> correctly (maybe I don't), this is guaranteed to be exactly the same as
> the selecting b directly from the bar table.

AFAIK there are periods in which a foreign key does not guarantee that
there's one foo record for every bar record between an action and the
constraint check for that action at statement end so you'd probably have
to be careful in any case.