Thread: another optimizer nit

another optimizer nit

From
Greg Stark
Date:
[My machine oopsed sending this the first time. I don't think it got out but
 if it did I apologize for the duplicate]

In this plan the optimizer is reluctant to push the filter down into the view.
I think it should be safe to push it inside unique nodes as long as the where
clause uses only expressions in the distinctness clause.

I'm assuming you're interesting in seeing such points since they probably
represent small changes that can be made to the heuristics that would have a
big impact. This one in particular I'm finding frustrating since it would bite
me if I tried to abstract away a lot of complexity by creating a view. If I'm
sending too many of these things that's ok, I can calm down :)

slo=> explain select * from (select distinct on (x) x,y from ttt order by 1,2) as x where x = 1;
                             QUERY PLAN
---------------------------------------------------------------------
 Subquery Scan x  (cost=0.01..0.02 rows=1 width=8)
   Filter: (x = 1::double precision)
   ->  Unique  (cost=0.01..0.02 rows=1 width=8)
         ->  Sort  (cost=0.01..0.02 rows=1 width=8)
               Sort Key: x, y
               ->  Seq Scan on ttt  (cost=0.00..0.00 rows=1 width=8)


This seems to be done already for other aggregates:

slo=> explain select * from (select n,first(m) from (select n,m from m order by n,m) as y group by n) as x where n=1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Subquery Scan x  (cost=1.27..1.28 rows=1 width=8)
   ->  Aggregate  (cost=1.27..1.28 rows=1 width=8)
         ->  Group  (cost=1.27..1.28 rows=1 width=8)
               ->  Sort  (cost=1.27..1.28 rows=1 width=8)
                     Sort Key: n
                     ->  Subquery Scan y  (cost=1.26..1.26 rows=1 width=8)
                           ->  Sort  (cost=1.26..1.26 rows=1 width=8)
                                 Sort Key: n, m
                                 ->  Seq Scan on m  (cost=0.00..1.25 rows=1 width=8)
                                       Filter: (n = 1)



--
greg

Re: another optimizer nit

From
Greg Stark
Date:
Greg Stark <gsstark@MIT.EDU> writes:

> [My machine oopsed sending this the first time. I don't think it got out but
>  if it did I apologize for the duplicate]
>
> In this plan the optimizer is reluctant to push the filter down into the view.
> I think it should be safe to push it inside unique nodes as long as the where
> clause uses only expressions in the distinctness clause.

Ah, in fact the comments say the same thing.

 * 1. If the subquery has a LIMIT clause or a DISTINCT ON clause, we must
 * not push down any quals, since that could change the set of rows
 * returned.  (Actually, we could push down quals into a DISTINCT ON
 * subquery if they refer only to DISTINCT-ed output columns, but
 * checking that seems more work than it's worth.  In any case, a
 * plain DISTINCT is safe to push down past.)

I think it would be worth the work because it makes view much more useful if
you can apply where clauses to them and be sure they'll get pushed all the way
in to use the index on the original table. Queries that look stupid to the
database may mean the programmer gets to use much more expressive abstraction
layers at higher levels.

I'll have to look into just how much work this would be. I'm not sure poking
in the optimizer is a great place for a newbie to start making changes :)
However the same work already has to be going on to get the query below to
work so perhaps it will just be some copy/paste.

> This seems to be done already for other aggregates:
>
> slo=> explain select * from (select n,first(m) from (select n,m from m order by n,m) as y group by n) as x where n=1;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Subquery Scan x  (cost=1.27..1.28 rows=1 width=8)
>    ->  Aggregate  (cost=1.27..1.28 rows=1 width=8)
>          ->  Group  (cost=1.27..1.28 rows=1 width=8)
>                ->  Sort  (cost=1.27..1.28 rows=1 width=8)
>                      Sort Key: n
>                      ->  Subquery Scan y  (cost=1.26..1.26 rows=1 width=8)
>                            ->  Sort  (cost=1.26..1.26 rows=1 width=8)
>                                  Sort Key: n, m
>                                  ->  Seq Scan on m  (cost=0.00..1.25 rows=1 width=8)
>                                        Filter: (n = 1)

--
greg