Re: function calls optimization - Mailing list pgsql-hackers

From Andres Freund
Subject Re: function calls optimization
Date
Msg-id 20191106232608.hjc557rjje2plo4i@alap3.anarazel.de
Whole thread Raw
In response to function calls optimization  (Andrzej Barszcz <abusinf@gmail.com>)
List pgsql-hackers
Hi,

On 2019-11-03 21:56:31 +0100, Andrzej Barszcz wrote:
> Main goal of this patch is to avoid repeated calls of immutable/stable
> functions.
> This patch is against version 10.10.
> I guess same logic could be implemented up till version 12.

If you want actual development feedback, you're more likely to get that
when posting patches against the master branch.


> --- src/include/nodes/execnodes.h    2019-08-05 23:16:54.000000000 +0200
> +++ src/include/nodes/execnodes.h    2019-11-03 20:05:34.338305825 +0100
> @@ -882,6 +883,39 @@ typedef struct PlanState
>      TupleTableSlot *ps_ResultTupleSlot; /* slot for my result tuples */
>      ExprContext *ps_ExprContext;    /* node's expression-evaluation context */
>      ProjectionInfo *ps_ProjInfo;    /* info for doing tuple projection */
> +#ifdef OPTFUNCALLS
> +    /* was_called - list of  ExprEvalStep* or FuncExpr* depending on execution stage
> +     * 
> +     * Stage I. ExecInitExprRec()
> +     *    List gathers all not volatile, not set returning, not window FuncExpr*,
> +     *    equal nodes occupy one position in the list. Position in this list ( counting from 1 )
> +     *    and planstate are remembered in actual ExprEvalStep*
> +     *
> +     *     For query: select f(n),f(n) from t  - was_called->length will be 1 and ptr_value 
> +     *         will be FuncExpr* node of f(n)
> +     *
> +     *     For query: select f(n),g(n),f(n) from t - list->length == 2
> +     *
> +     * Stage II. ExecProcnode()
> +     *    For every planstate->was_called list changes its interpretation - from now on
> +     *    it is a list of ExprEvalStep* . Before executing real execProcnode
> +     *    every element of this list ( ptr_value ) is set to NULL. We don't know which
> +     *    function will be called first
> +     *
> +     * Stage III. ExecInterpExpr() case EEOP_FUNCEXPR
> +     *    ExprEvalStep.position > 0  means that in planstate->was_called could be ExprEvalStep*
> +     *    which was done yet or NULL.
> +     *
> +     *    NULL means that eval step is entered first time and:
> +     *        1. real function must be called
> +     *        2. ExprEvalStep has to be remembered in planstate->was_called at position
> +     *        step->position - 1
> +     *
> +     *    NOT NULL means that in planstate->was_called is ExprEvalStep* with ready result, so
> +     *    there is no need to call function
> +     */
> +    List *was_called;
> +#endif
>  } PlanState;

I don't think the above describes a way to do this that is
acceptable. For one, I think this needs to happen at plan time, not for
every single execution of the statement. Nor do I think is it ok to make
ExecProcNode() etc slower for this feature - that's a very crucial
routine (similar with EEOP_FUNCEXPR checking the cache, but there we
could just have a different step type doing so).

Have you looked any of the previous work on such caching? I strongly
suggest doing so if you're interested in getting such a feature into
postgres. E.g. there's a fair bit of relevant discussion in
https://www.postgresql.org/message-id/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru
e.g. between Tom and me further down.


>  /* ----------------
> --- src/include/executor/execExpr.h    2019-08-05 23:16:54.000000000 +0200
> +++ src/include/executor/execExpr.h    2019-11-03 20:04:03.739025142 +0100
> @@ -561,6 +561,10 @@ typedef struct ExprEvalStep
>              AlternativeSubPlanState *asstate;
>          }            alternative_subplan;
>      }            d;
> +#ifdef OPTFUNCALLS
> +    PlanState *planstate;    /* parent PlanState for this expression */
> +    int position;        /* position in planstate->was_called counted from 1 */
> +#endif
>  } ExprEvalStep;

This is not ok. We cannot just make every single ExprEvalStep larger for
this feature. Nor is clear why this is even needed - every ExprEvalStep
is associated with an ExprState, and ExprState already has a reference
to the parent planstate.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Paul A Jungwirth
Date:
Subject: Re: range_agg
Next
From: Grigory Smolkin
Date:
Subject: Re: [proposal] recovery_target "latest"