Re: optimizing constant quals within outer joins - Mailing list pgsql-hackers
From | Phil Frost |
---|---|
Subject | Re: optimizing constant quals within outer joins |
Date | |
Msg-id | 20060628162404.GB23509@unununium.org Whole thread Raw |
In response to | Re: optimizing constant quals within outer joins (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Wed, Jun 28, 2006 at 11:40:52AM -0400, Tom Lane wrote: > Phil Frost <indigo@bitglue.com> writes: > > The planner in fact can move the function around without changing the > > output. > > Not when it's within the nullable side of an outer join --- moving a > WHERE clause up out of that would make the difference between no row > out, and a null-extended row out, which are certainly not the same. > > I'm not sure why it's not pulling up from the left side of the left join > though. That might be a bug. What PG version is this exactly? > > Of course the real question is why is your app generating such poorly > phrased queries ;-) Sure it can't pull the condition to the root result node, but it can make an intermediate result node that is a child of the join and wraps the sequential scan. "offset 0" makes it do this. I'd like this: create table a(i int); create table b(i int); create function stable_function() returns bool language plpgsql stable as $$ begin return true; end $$; create view c as select * from b where stable_function(); explain select * from a left join c using (i); QUERY PLAN -----------------------------------------------------------------Merge Right Join (cost=220.32..338.32 rows=7629 width=4) Merge Cond: ("outer".i = "inner".i) -> Sort (cost=70.54..72.32 rows=713 width=4) Sort Key: b.i -> Seq Scan on b (cost=0.00..36.75 rows=713 width=4) Filter: stable_function() -> Sort (cost=149.78..155.13rows=2140 width=4) Sort Key: a.i -> Seq Scan on a (cost=0.00..31.40 rows=2140 width=4) to become this: QUERY PLAN -----------------------------------------------------------------Merge Right Join (cost=220.32..338.32 rows=7629 width=4) Merge Cond: ("outer".i = "inner".i) -> Sort (cost=70.54..72.32 rows=713 width=4) Sort Key: b.i -> Result One-Time Filter: stable_function() -> Seq Scan on b (cost=0.00..36.75 rows=713 width=4) Filter: stable_function() -> Sort (cost=149.78..155.13 rows=2140 width=4) Sort Key:a.i -> Seq Scan on a (cost=0.00..31.40 rows=2140 width=4) That will make the same results. Maybe there is something about the implementation that I don't understand that makes it hard, but the concept is simple: before you do a seq scan on b, you call stable_function(), and if it returns true, you just do the sequential scan without calling stable_function() for each row. If it returns false, you can not do the sequental scan at all, and return the empty set immediately. I wasn't aware my queries are "badly phrased". The application generates quite nice queries like "select * from saleorder_summary", which is a view along the lines of 'select * from "order" left join saleorder using (objectid)'. "order" and "saleorder" are views like "select * from private.order where i_have_global_priv(20)". The subqueries are in the examples I gave just to make it simpler to demonstrate. The only other way I can think of phrasing a query like that is perhaps select * from private.order left join purchaseorder on ( order.objectid = purchaseorder.objectid and i_have_global_priv(31) ) This of course would not only be hugely inconvinent, but would require that regular users have unrestricted access to the base tables, which totally defeats the purpose of using veil. Also, that too is not optimized as well as it could be: test=# explain select * from a left join b on (a.i = b.i and stable_function()); QUERY PLAN -----------------------------------------------------------------Merge Left Join (cost=299.56..710.97 rows=7633 width=8) Merge Cond: ("outer".i = "inner".i) Join Filter: stable_function() -> Sort (cost=149.78..155.13 rows=2140 width=4) Sort Key: a.i -> Seq Scan on a (cost=0.00..31.40 rows=2140 width=4) -> Sort (cost=149.78..155.13rows=2140 width=4) Sort Key: b.i -> Seq Scan on b (cost=0.00..31.40 rows=2140 width=4) stable_function() will still be called multiple times needlessly.
pgsql-hackers by date: