Thread: PL/pgSQL and SPI

PL/pgSQL and SPI

From
jwieck@debis.com (Jan Wieck)
Date:
Hi,

    1.  I've  just committed some changes to PL/pgSQL and the SPI
        manager.

        It's  a  speedup  of  PL/pgSQL   execution   by   calling
        ExecEvalExpr()   in  the  executor  directly  for  simple
        expressions that return one single Datum.

        For the speed test I've removed all the setup stuff  from
        the  plpgsql regression and ran the normal queries all in
        one transaction.  There are  196  query  plans  generated
        during  the regression and only 37 are left now for which
        PL/pgSQL really calls SPI_execp().

        This saves 30% of total execution time! I don't know  how
        much of the whole execution time is spent in PL/pgSQL and
        how much is consumed by the normal query processing.

        In another test I used a silly add function  that  simply
        does  a  "return  $1 + $2" and built a sum() aggregate on
        top of  it.  In  that  case  65%  of  execution  time  to
        summarize  20000  int4  values  where  saved.  This  is a
        speedup by factor 3.

        To be able to do so I've moved some of  the  declarations
        from  spi.c  into  a new header spi_priv.h so someone has
        access to the  _SPI_plan  structure  for  past  preparing
        plan-/querytree   analysis.  And  I've  added  two  silly
        functions   SPI_push()   and   SPI_pop()   that    simply
        increment/decrement   the   _SPI_curid   value.  This  is
        required for calling ExecEvalExpr(), because there  could
        be   functions   evaluated  that  use  SPI  themself  and
        otherwise they could not connect to the SPI manager. They
        are  dangerous  and  I'm  in  doubt if we should document
        them.

    2.  While doing the above I've encountered some  bad  details
        of  the  SPI  manager and the executor. The Func and Oper
        nodes point to a function cache, which is initially  NULL
        and is not copied by copyNode().

        For every call of SPI_execp() to execute a prepared plan,
        the whole plan is copied into the current memory context.
        Since  this clears out the fcache, the executor has to do
        several syscache lookups for every function  or  operator
        hit during execution of the plan.

        Unfortunately  I  haven't  found  a  way yet to avoid it.
        Anything I tried so  far  ended  in  coredumps  or  other
        misbehaviour. Maybe someone else has an idea.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] PL/pgSQL and SPI

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
>         ...  And  I've  added  two  silly
>         functions   SPI_push()   and   SPI_pop()   that    simply
>         increment/decrement   the   _SPI_curid   value.  This  is
>         required for calling ExecEvalExpr(), because there  could
>         be   functions   evaluated  that  use  SPI  themself  and
>         otherwise they could not connect to the SPI manager. They
>         are  dangerous  and  I'm  in  doubt if we should document              ^^^^^^^^^
No more than improper call of SPI_finish()...

>         them.
> 
>     2.  While doing the above I've encountered some  bad  details
>         of  the  SPI  manager and the executor. The Func and Oper
>         nodes point to a function cache, which is initially  NULL
>         and is not copied by copyNode().
> 
>         For every call of SPI_execp() to execute a prepared plan,
>         the whole plan is copied into the current memory context.
>         Since  this clears out the fcache, the executor has to do
>         several syscache lookups for every function  or  operator
>         hit during execution of the plan.
> 
>         Unfortunately  I  haven't  found  a  way yet to avoid it.
>         Anything I tried so  far  ended  in  coredumps  or  other
>         misbehaviour. Maybe someone else has an idea.

Could we fill most of FunctionCache while parsing query ?!
We can do this for 
   int         typlen;         /* length of the return type */   int         typbyval;       /* true if return type is
passby value */
 
...   Oid         foid;           /* oid of the function in pg_proc */   Oid         language;       /* oid of the
languagein pg_language */   int         nargs;          /* number of arguments */
 
   Oid        *argOidVect;     /* oids of all the arguments */
...   bool        istrusted;      /* trusted fn? */

and may be others too.

Vadim


Re: [HACKERS] PL/pgSQL and SPI

From
jwieck@debis.com (Jan Wieck)
Date:
Vadim wrote:

>
> Jan Wieck wrote:
> >     2.  While doing the above I've encountered some  bad  details
> >         of  the  SPI  manager and the executor. The Func and Oper
> >         nodes point to a function cache, which is initially  NULL
> >         and is not copied by copyNode().
> >
> >         For every call of SPI_execp() to execute a prepared plan,
> >         the whole plan is copied into the current memory context.
> >         Since  this clears out the fcache, the executor has to do
> >         several syscache lookups for every function  or  operator
> >         hit during execution of the plan.
> >
> >         Unfortunately  I  haven't  found  a  way yet to avoid it.
> >         Anything I tried so  far  ended  in  coredumps  or  other
> >         misbehaviour. Maybe someone else has an idea.
>
> Could we fill most of FunctionCache while parsing query ?!
> We can do this for
>
>     int         typlen;         /* length of the return type */
>     int         typbyval;       /* true if return type is pass by value */
> ...
>     Oid         foid;           /* oid of the function in pg_proc */
>     Oid         language;       /* oid of the language in pg_language */
>     int         nargs;          /* number of arguments */
>
>     Oid        *argOidVect;     /* oids of all the arguments */
> ...
>     bool        istrusted;      /* trusted fn? */
>
> and may be others too.

    And  then  letting  copyNode()  copy  the  fcache too so it's
    allocated in the same memory context.

    Will require a flag in the fcache that is used to  tell  that
    setFcache()  must  be  called to fill in the remaining fields
    (there are some things taken from the actual executor state).
    This  flag  is  then  cleared by copyNode() and the fields in
    question left uncopied.

    This  might  also  let  us  get  rid  of  the  tree  copy  in
    SPI_execp(),  if we form another tree-traversal function that
    resets the flag in all Func and Oper nodes of the whole tree,
    so the prepared/saved plan can be used directly.

    I'll give it a try some time.

    Thanks for the kick, Vadim.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] PL/pgSQL and SPI

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> >
> > Could we fill most of FunctionCache while parsing query ?!
> 
>     And  then  letting  copyNode()  copy  the  fcache too so it's
>     allocated in the same memory context.

Or we could move these items from fcache struct to
Func/Oper node...

>     Will require a flag in the fcache that is used to  tell  that
>     setFcache()  must  be  called to fill in the remaining fields
>     (there are some things taken from the actual executor state).
>     This  flag  is  then  cleared by copyNode() and the fields in
>     question left uncopied.

I missed here, please explain. What fields are you talking about?
Note that to support READ COMMITTED level I copy execution plan
_after_ execution started and so nothing used to keep execution
states, but not handled (re-initialized) by ExecInitNode, 
must be copied.
Also, see below.

>     This  might  also  let  us  get  rid  of  the  tree  copy  in
>     SPI_execp(),  if we form another tree-traversal function that
>     resets the flag in all Func and Oper nodes of the whole tree,
>     so the prepared/saved plan can be used directly.
> 
>     I'll give it a try some time.

Maybe. But note that if executor will try to use/pfree something
allocated in previous execution (in another memory context)
then we'll get trouble.

Vadim


Re: [HACKERS] PL/pgSQL and SPI

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
>     1.  I've  just committed some changes to PL/pgSQL and the SPI
>         manager.
> 
>         It's  a  speedup  of  PL/pgSQL   execution   by   calling
>         ExecEvalExpr()   in  the  executor  directly  for  simple
>         expressions that return one single Datum.
> 
...
> 
>         To be able to do so I've moved some of  the  declarations
>         from  spi.c  into  a new header spi_priv.h so someone has
>         access to the  _SPI_plan  structure  for  past  preparing
>         plan-/querytree   analysis.  And  I've  added  two  silly
>         functions   SPI_push()   and   SPI_pop()   that    simply
>         increment/decrement   the   _SPI_curid   value.  This  is
>         required for calling ExecEvalExpr(), because there  could
>         be   functions   evaluated  that  use  SPI  themself  and
>         otherwise they could not connect to the SPI manager. They
>         are  dangerous  and  I'm  in  doubt if we should document
>         them.

BTW, Jan, did you consider ability to add new function
for fast expression evaluation to SPI itself and than just
use this func in PL/pgSQL?
This function seems to be generally usefull.
And we could avoid SPI_push/SPI_pop...

Vadim


Re: [HACKERS] PL/pgSQL and SPI

From
jwieck@debis.com (Jan Wieck)
Date:
Vadim wrote:
>
> Jan Wieck wrote:
> >
> >     1.  I've  just committed some changes to PL/pgSQL and the SPI
> >         manager.
> >
> >         It's  a  speedup  of  PL/pgSQL   execution   by   calling
> >         ExecEvalExpr()   in  the  executor  directly  for  simple
> >         expressions that return one single Datum.
>
> BTW, Jan, did you consider ability to add new function
> for fast expression evaluation to SPI itself and than just
> use this func in PL/pgSQL?
> This function seems to be generally usefull.
> And we could avoid SPI_push/SPI_pop...

    Clarification:

        I'm  doing many tests on the SPI generated plan to ensure
        that it is so simple that ExecEvalExpr()  cannot  stumble
        over it. In detail it must be something that has only one
        targetentry, absolutely no qual, lefttree,  righttree  or
        something  else.  And all the nodes in the TLE expression
        must only be Expr (OP, FUNC, OR, AND, NOT only), Const or
        Param ones.

        This  is  required, because I have to fake an ExprContext
        that contains the values for  the  parameters  only.  The
        above  ensures,  that  ExecEvalExpr()  will  never  touch
        anything else than the ecxt_param_list_info and thus will
        not notice that it is a faked one.

    Well,  but  you're  right, I could add some smartness to SPI.
    First, it could do the same checks on the generated plan that
    ensure it really returns 1 (and only ever 1) Datum based only
    on function calls, constants or parameters. If  this  is  the
    case,  it  could  internally  call ExecEvalExpr() and build a
    faked heap tuple on  SPI_execp().  Someone  using  SPI_exec()
    isn't interested in speed, so I would leave it out there.

    And two new functions

        bool SPI_is_simple_expr(void *plan);
        Datum SPI_eval_simple_expr(void *plan,
                                   Datum *values,
                                   char *Nulls,
                                   bool *isNull,
                                   Oid *rettype);

    could gain more direct access to such expressions suppressing
    the need to diddle with the SPI tuple table for getting  just
    one Datum.

    Yes,  I think it would be a good enhancement. I'll go for it.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #