Re: View containing a recursive function

From: Tom Lane
Subject: Re: View containing a recursive function
Date: ,
Msg-id: 68671.1454319876@sss.pgh.pa.us
(view: Whole thread, Raw)
In response to: View containing a recursive function  (Mathieu De Zutter)
Responses: Re: View containing a recursive function  (Mathieu De Zutter)
List: pgsql-performance

Mathieu De Zutter <> writes:
> I have a recursive part in my database logic that I want to isolate and
> reuse as a view. I had found a blog that explained how move a function
> parameter into a view. The SQL is in attachment.
> When I write a query based on that view with a fixed value (or values) for
> the (input) parameter, the planner does fine and only evaluates the
> function once.
> However, when the value of the parameter should be deduced from something
> else, the planner doesn't understand that and will evaluate the function
> for each possible value.

I do not think this has anything to do with whether the query inside the
function is recursive.  Rather, the problem is that the view has a
set-returning function in its targetlist, which prevents the view from
being flattened into the outer query, per this bit in is_simple_subquery():

    /*
     * Don't pull up a subquery that has any set-returning functions in its
     * targetlist.  Otherwise we might well wind up inserting set-returning
     * functions into places where they mustn't go, such as quals of higher
     * queries.  This also ensures deletion of an empty jointree is valid.
     */
    if (expression_returns_set((Node *) subquery->targetList))
        return false;

Lack of flattening disables a lot of join optimizations, including the
one you want.

Assuming you have a reasonably late-model PG, you could rewrite the
view with a lateral function call:

CREATE OR REPLACE VIEW covering_works_r AS
  SELECT
    w.id                    AS work_id,
    fn.f                    AS covering_work_id
  FROM work w, fn_covering_works(w.id) as fn(f);

which puts the SRF into FROM where the planner can deal with it much
better.

Another problem is that you let the function default to being VOLATILE,
which would have disabled view flattening even if this didn't.  I see
no reason for this function not to be marked STABLE.

Doing both of those things gives me a plan like this:

 Nested Loop  (cost=448.24..509.53 rows=1131 width=4)
   ->  Nested Loop  (cost=0.31..16.36 rows=1 width=8)
         ->  Index Scan using work_first_release_id_idx on work w  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (first_release_id = 4249)
         ->  Index Only Scan using work_pkey on work w_1  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (id = w.id)
   ->  CTE Scan on func  (cost=447.93..470.55 rows=1131 width=0)
         CTE func
           ->  Recursive Union  (cost=0.00..447.93 rows=1131 width=4)
                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
                 ->  Hash Join  (cost=0.33..42.53 rows=113 width=4)
                       Hash Cond: (ad.original_id = f.work_id)
                       ->  Seq Scan on adaptation ad  (cost=0.00..32.60 rows=2260 width=8)
                       ->  Hash  (cost=0.20..0.20 rows=10 width=4)
                             ->  WorkTable Scan on func f  (cost=0.00..0.20 rows=10 width=4)

which looks hairier, but that's because the function has been inlined
which is usually what you want for a SQL-language function.  The join
is happening the way you want.

            regards, tom lane



pgsql-performance by date:

From: Albe Laurenz
Date:
Subject: Re: Hash join gets slower as work_mem increases?
From: Jérôme Augé
Date:
Subject: Understanding ANALYZE memory usage with "big" tsvector columns