Thread: View containing a recursive function

View containing a recursive function

From
Mathieu De Zutter
Date:
Hi all,

I have a recursive part in my database logic that I want to isolate and reuse as a view. I had found a blog that explained how move a function parameter into a view. The SQL is in attachment.
When I write a query based on that view with a fixed value (or values) for the (input) parameter, the planner does fine and only evaluates the function once.
However, when the value of the parameter should be deduced from something else, the planner doesn't understand that and will evaluate the function for each possible value.

Any pointers to what I'm doing wrong or on how to optimize it?

Attachment contains the queries and explain plans.

Thanks!

Kind regards,
Mathieu
Attachment

Re: View containing a recursive function

From
Tom Lane
Date:
Mathieu De Zutter <mathieu@dezutter.org> writes:
> I have a recursive part in my database logic that I want to isolate and
> reuse as a view. I had found a blog that explained how move a function
> parameter into a view. The SQL is in attachment.
> When I write a query based on that view with a fixed value (or values) for
> the (input) parameter, the planner does fine and only evaluates the
> function once.
> However, when the value of the parameter should be deduced from something
> else, the planner doesn't understand that and will evaluate the function
> for each possible value.

I do not think this has anything to do with whether the query inside the
function is recursive.  Rather, the problem is that the view has a
set-returning function in its targetlist, which prevents the view from
being flattened into the outer query, per this bit in is_simple_subquery():

    /*
     * Don't pull up a subquery that has any set-returning functions in its
     * targetlist.  Otherwise we might well wind up inserting set-returning
     * functions into places where they mustn't go, such as quals of higher
     * queries.  This also ensures deletion of an empty jointree is valid.
     */
    if (expression_returns_set((Node *) subquery->targetList))
        return false;

Lack of flattening disables a lot of join optimizations, including the
one you want.

Assuming you have a reasonably late-model PG, you could rewrite the
view with a lateral function call:

CREATE OR REPLACE VIEW covering_works_r AS
  SELECT
    w.id                    AS work_id,
    fn.f                    AS covering_work_id
  FROM work w, fn_covering_works(w.id) as fn(f);

which puts the SRF into FROM where the planner can deal with it much
better.

Another problem is that you let the function default to being VOLATILE,
which would have disabled view flattening even if this didn't.  I see
no reason for this function not to be marked STABLE.

Doing both of those things gives me a plan like this:

 Nested Loop  (cost=448.24..509.53 rows=1131 width=4)
   ->  Nested Loop  (cost=0.31..16.36 rows=1 width=8)
         ->  Index Scan using work_first_release_id_idx on work w  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (first_release_id = 4249)
         ->  Index Only Scan using work_pkey on work w_1  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (id = w.id)
   ->  CTE Scan on func  (cost=447.93..470.55 rows=1131 width=0)
         CTE func
           ->  Recursive Union  (cost=0.00..447.93 rows=1131 width=4)
                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
                 ->  Hash Join  (cost=0.33..42.53 rows=113 width=4)
                       Hash Cond: (ad.original_id = f.work_id)
                       ->  Seq Scan on adaptation ad  (cost=0.00..32.60 rows=2260 width=8)
                       ->  Hash  (cost=0.20..0.20 rows=10 width=4)
                             ->  WorkTable Scan on func f  (cost=0.00..0.20 rows=10 width=4)

which looks hairier, but that's because the function has been inlined
which is usually what you want for a SQL-language function.  The join
is happening the way you want.

            regards, tom lane


Re: View containing a recursive function

From
Mathieu De Zutter
Date:
On Mon, 1 Feb 2016 at 10:45 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mathieu De Zutter <mathieu@dezutter.org> writes:
Assuming you have a reasonably late-model PG, you could rewrite the
view with a lateral function call:

CREATE OR REPLACE VIEW covering_works_r AS
  SELECT
    w.id                    AS work_id,
    fn.f                    AS covering_work_id
  FROM work w, fn_covering_works(w.id) as fn(f);

which puts the SRF into FROM where the planner can deal with it much
better.
 
Thanks a lot. That fixes it! 

Another problem is that you let the function default to being VOLATILE,
which would have disabled view flattening even if this didn't.  I see
no reason for this function not to be marked STABLE.

By marking it STABLE, it ignores my row estimate of 1 - I guess because of the inlining. The number of results is usually just 1, though the number can go up to 10 in exceptional cases. That's still a lot better than the inexplicable estimate of the planner (101) when marked STABLE, which often leads to triggering a hash join instead of a nested loop in complex queries:

->  Recursive Union  (cost=0.00..795.53 rows=101 width=4) (actual time=0.001..0.009 rows=1 loops=4)
      ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=4)
      ->  Nested Loop  (cost=0.29..79.35 rows=10 width=4) (actual time=0.005..0.005 rows=0 loops=5)
            ->  WorkTable Scan on func f  (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.000 rows=1 loops=5)
            ->  Index Scan using adaptation_adapted_idx on adaptation ad  (cost=0.29..7.91 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=5)
                  Index Cond: (adapted_id = f.work_id)


Thanks again,

Mathieu