Re: PL/PgSQL discussion - Mailing list pgsql-hackers

From jwieck@debis.com (Jan Wieck)
Subject Re: PL/PgSQL discussion
Date
Msg-id m0yDX8G-000BFRC@orion.SAPserv.Hamburg.dsh.de
Whole thread Raw
In response to Re: PL/PgSQL discussion  (Hannu Krosing <hannu@trust.ee>)
Responses Re: [HACKERS] Re: PL/PgSQL discussion
List pgsql-hackers
Hannu Krosing wrote:
>
> jwieck@debis.com (Jan Wieck) wrote:
>
> >     Someone gave a hint about global variables existing during  a
> >     session.   What  is  a  session  than?  One  transaction? The
> >     backends lifetime?
>
> I think it is a time from connect to disconnect, which currently equals to backends
> lifetime, but may in future be shorter, if we will implement a backend pool for
> quick-starting servers.

    Hmmm  -  how  does  a language handler then notice that a new
    session began?

> >
> >         PL/pgSQL is a block oriented language. A block is defined as
> >
> >             [<<label>>]
> >             [DECLARE
> >                 -- declarations]
> >             BEGIN
> >                 -- statements
> >             END;
>
>
>
> >
> >
> >         There  can  be  any  number  of  subblocks  in the statements
> >         section of a block. Subblocks can be used to  hide  variables
> >         from  outside a block of statements (see Scope and visability
> >         below).
>
> I think that subblock should also be used as the extent of IF, FOR ... LOOP, WHILE and
> other such statements.
>
> Then we would not need the END IF, END LOOP etc.

    The LOOP ... END LOOP etc. syntax  is  just  what  I  saw  in
    Oracles   PL/SQL   documentation.  I  could  also  live  with
    BEGIN...END, but what is it good for to be different?

>
> >
> >
> >     Comments
> >
> >         There are two types of comments in PL/pgSQL.  A  double  dash
> >         '--'  starts a comment that extends to the end of the line. A
> >         '/*' starts a block comment that extends to  the  next  '*/'.
> >         Block comments cannot be nested, but double dash comments can
> >         be enclosed into a block comment.
>
> And vice versa : block comment delimiters can be commented out by --

    Right - works already :-)

>
> >
> >
> >     Declarations
> >
> >         All variables, rows and records  used  in  a  block  or  it's
> >         subblocks must be declared in the declarations section of the
> >         block.   The   parameters   given   to   the   function   are
> >         automatically  declared  with  the usual identifiers $n.
>
> How hard would it bet to have named parameters, or why must we use alias?

    That isn't subject to the PL  handler.  All  the  PL  handler
    knows about the function is in pg_proc and pg_type. There are
    no parameter names, and that's what the ALIAS idea came from.

    If we sometimes implement a new function call interface, this
    might be possible. Some details about what I have in mind:

        Add a field to pg_proc that tells the  backend  the  call
        interface the function uses.

        Create   a  new  catalog  pg_parameter,  that  holds  the
        parameter names  and  other  information  (like  notnull,
        atttypmod etc.). So a function can be defined to expect a
        VARCHAR(20) NOT NULL.

        The new call interface then hands out more information to
        the  function  than  now.  It's  the  functions  Oid, the
        parameter Datums, a character array telling which of  the
        Datums  are  NULL  and  the  usual bool pointer where the
        function can tell that it's return value is NULL.

> >         Returning from the function
> >
> >                 RETURN <expr>;
> >
> >             The function terminates and the value of <expr>  will  be
> >             returned  to  the  upper executor.  The return value of a
> >             function cannot be undefined.  If control reaches the end
> >             of  the  toplevel block of the function without hitting a
> >             RETURN statement, a runtime error will occur.
>
> What can <expr> be?
>
> Possibilities: null, single value, record, recordset
>
> AFAIK, recordsets are usually returned by more strange constructs, like haveing some
> kinds of breakpoints inside the loop that either returns a record or some EOF token.
>

    Currently  only  'null'  and  'single  value'.  The  executor
    doesn't  accept anything else for non-sql language functions.
    PL functions are treated by the executor like 'C'  functions.

> >                     A  conditional  loop  that  is  executed  as  long as the
> >             evaluation of <expr> returns true.
> >
> >                 [<<label>>]
> >                 FOR <name> IN [REVERSE] <expr>..<expr> LOOP
> >                     -- statements
>
> >                 END LOOP.
>
> Perhaps PL/PgSQL should have some notation for specifying immediate lists of other kinds
> of values as well,perhaps like ['aa','bb','cc'], so that one cold loop over not only
> integers. Perhaps even with type specifiers, like [::time '22.01', '13.47.05', '15.20']

    Good issue. Could be done.

>
> Keep up the nice work!
>
> Hannu
>
>

    Already  reached  the  point  of  no  return.  The first tiny
    function ran without problems:

        CREATE FUNCTION f1(int4, int4) RETURNS int4 AS '
            BEGIN
                RETURN $1 + $2;
            END;
        ' LANGUAGE 'plpgsql';

    I set up a little test table with 2  int4  fields  containing
    some  thousand  records. Then I wrote equivalent functions in
    'sql', 'plpgsql' and 'pltcl'.   The  execution  times  for  a
    query

        SELECT sum(func(a, b)) FROM t1;

    are:

        Builtin SQL language    100%
        PL/Tcl                  180%
        PL/pgSQL                230%

    PL/Tcl is slower than builtin SQL because the internals of it
    require the two parameters to be converted to their  external
    representation,  than  calling the Tcl interpreter who parses
    them back to numbers, calculates the result,  returns  it  as
    string and then it's parsed back to int4 internal value.

    In  the PL/pgSQL case I haven't expected that big performance
    loss.  The calculation is internally done with a saved  query
    plan (made on the first call) that does a

        SELECT $1 + $2

    with  two  int4  parameters.  This  is  exactly  what the SQL
    version of the above does!  And >95% of  the  execution  time
    for  the  function  call  are  spent  in  SPI_execp().  Since
    SPI_execp()  calls  ExecutorRun()  directly,  I   think   the
    querydesc  creation and/or plan copying on each invocation is
    the time  consuming  part.  I  assume  that  there  are  some
    optimizable corners in SPI where we can gain more speed. So I
    continue with PL/pgSQL as it is now and speed it up later  by
    tuning SPI.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

pgsql-hackers by date:

Previous
From: "Thomas G. Lockhart"
Date:
Subject: Re: [QUESTIONS] timespan
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Re: [QUESTIONS] using composite types