Thread: patch: plpgsql - access records with rec.(expr)

patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

18 November 2004, 13:18:20

Hi,

I got extremely frustrated with having to create a temp table every time
I wanted to access an arbitrary column from a record plpgsql. After
seeing a note on the developers TODO about accessing plpgsql records
with a 'dot bracket' notation I started digging into the plpgsql source.

My diff (against 8beta4) is attached.

Warning: I Am Not a C Programmer! I haven't even written a hello world
in C before, and I knew nothing about Flex before yesterday. It was fun
figuring stuff out, I'm amazed it mostly works, but I'm really hoping
someone can point out my mistakes.

Goal:

Enable users to access fields in record variables using the following
syntax like the following:
  rec.(1)
  rec.('foo')
  rec.(myvar::int)
  rec.(myvar || '_id')

Files changed:

plpgsql.h
- added 'expr' member to PLpgSQL_recfield type for holding the
PLpgSQL_expr structure.

scan.l
- added match for {identifier}{space}*\.  AFAIK this should only match
if a longer expression doesn't?

pl_comp.c
- added plpgsql_parse_wordexpr() function called by above match. Ripped
off code from plpgsql_parse_word that deals with arg_v[expr] to find our
expression. Prob a dumb name for the function!

pl_exec.c
- hacked exec_assign_value() and exec_eval_datum() to use the expression
to get the field name/number.

Stuff I need help with:

1. It should recognise OLD.(1) as a field number, not a column name. I
think I've got to check the returned type from exec_eval_expr() then
exec_simple_cast_value() on it, but that seems beyond me.

2. Freeing stuff. As I explained, this is all pretty new to me, and the
comments about it around exec_eval_expr() et al just confused me :(
Please give some hints about what needs freeing!

3. Would probably be good to add check in pl_comp.c to see if the
expression actually needs to be evaluated at runtime (ie isn't just a
field name/number). How?

4. Make this also work for row.(expr), label.record.(expr) and
label.row.(expr) - but want to get the basics working first!

5. Because of the way the expression is parsed (looking for closing
parenth), this will choke if you try and put a function in there. Would
it be better to use curly braces '{expr}' or another character to mark
the expression?

I hope at this eventually leads to some really useful extra
functionality, particularly for writing generic triggers. And it's a
tribute to the existing code that a complete newbie can cut-and-paste
their way to a halfarsed solution in a (rather long) night!

Regards,

Matt

Attachment

record_dot_parenth.diff

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

18 November 2004, 15:36:48

> 5. Because of the way the expression is parsed (looking for closing
> parenth), this will choke if you try and put a function in there. Would
> it be better to use curly braces '{expr}' or another character to mark
> the expression?

I lie! pgpgsql_read_expression() is smarter than that!

However, I do have another problem. If the value of the expr changes
inside a loop to a fieldname of a different type, it dies with the "type
of \"%s\" does not match that when preparing the plan" message, which is
quite true: it obviously doesn't.

Just setting expectedtypeoid to InvalidOid bombs the whole thing :(
Hrm.... the "best made plans" and all that...

Matt

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

19 November 2004, 16:45:40

Hi all,

I've cleaned this up a bit so that assigning to a dynamic record field
now works - ie NEW.(myvar) := 'someval' - and accessing a field by
number works - ie myvar := OLD.(1)

It still doesn't work in a loop if the type of field referenced by the
expression changes - looking at it more I'm really not sure this is
possible, but that's a limitation I could live with. I'll also try
figuring out freeing stuff after casting values over the weekend.

But is this a worthwhile approach? If there are objections, please let
me know!

For myself, being able to pass a column name as an argument to a trigger
would make writing generic triggers a whole lot easier. And going back
through the archives on this list I see I'm not alone.

Regards,

Matt

On Thu, 2004-11-18 at 13:18, Matt wrote:
> Hi,
>
> I got extremely frustrated with having to create a temp table every time
> I wanted to access an arbitrary column from a record plpgsql. After
> seeing a note on the developers TODO about accessing plpgsql records
> with a 'dot bracket' notation I started digging into the plpgsql source.
>
> My diff (against 8beta4) is attached.
>
> Warning: I Am Not a C Programmer! I haven't even written a hello world
> in C before, and I knew nothing about Flex before yesterday. It was fun
> figuring stuff out, I'm amazed it mostly works, but I'm really hoping
> someone can point out my mistakes.
>
> Goal:
>
> Enable users to access fields in record variables using the following
> syntax like the following:
>   rec.(1)
>   rec.('foo')
>   rec.(myvar::int)
>   rec.(myvar || '_id')
>
> Files changed:
>
> plpgsql.h
> - added 'expr' member to PLpgSQL_recfield type for holding the
> PLpgSQL_expr structure.
>
> scan.l
> - added match for {identifier}{space}*\.  AFAIK this should only match
> if a longer expression doesn't?
>
> pl_comp.c
> - added plpgsql_parse_wordexpr() function called by above match. Ripped
> off code from plpgsql_parse_word that deals with arg_v[expr] to find our
> expression. Prob a dumb name for the function!
>
> pl_exec.c
> - hacked exec_assign_value() and exec_eval_datum() to use the expression
> to get the field name/number.
>
> Stuff I need help with:
>
> 1. It should recognise OLD.(1) as a field number, not a column name. I
> think I've got to check the returned type from exec_eval_expr() then
> exec_simple_cast_value() on it, but that seems beyond me.
>
> 2. Freeing stuff. As I explained, this is all pretty new to me, and the
> comments about it around exec_eval_expr() et al just confused me :(
> Please give some hints about what needs freeing!
>
> 3. Would probably be good to add check in pl_comp.c to see if the
> expression actually needs to be evaluated at runtime (ie isn't just a
> field name/number). How?
>
> 4. Make this also work for row.(expr), label.record.(expr) and
> label.row.(expr) - but want to get the basics working first!
>
> 5. Because of the way the expression is parsed (looking for closing
> parenth), this will choke if you try and put a function in there. Would
> it be better to use curly braces '{expr}' or another character to mark
> the expression?
>
> I hope at this eventually leads to some really useful extra
> functionality, particularly for writing generic triggers. And it's a
> tribute to the existing code that a complete newbie can cut-and-paste
> their way to a halfarsed solution in a (rather long) night!
>
> Regards,
>
> Matt
>
> ______________________________________________________________________
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

Attachment

record_dot_parenth.diff

Re: patch: plpgsql - access records with rec.(expr)

From

Neil Conway

Date:

22 November 2004, 08:29:51

On Thu, 2004-11-18 at 13:18 +0000, Matt wrote:
> I got extremely frustrated with having to create a temp table every time
> I wanted to access an arbitrary column from a record plpgsql.

FYI, one thing I want to implement is an EVALUATE statement in plpgsql
(analogous to eval() in Perl, for example). If I understand your use
case, I think this will help somewhat, although of course it is still
clumsier than direct syntactic support.

> Warning: I Am Not a C Programmer! I haven't even written a hello world
> in C before, and I knew nothing about Flex before yesterday. It was fun
> figuring stuff out, I'm amazed it mostly works, but I'm really hoping
> someone can point out my mistakes.

An impressive first hack! :)

> Enable users to access fields in record variables using the following
> syntax like the following [...]

>   rec.(1)

Looks good.

>   rec.('foo')

I don't like this: it implicitly coerces a string literal into an
identifier (i.e. a column name). Treating data as code can be useful,
but I think we need to make it more obvious to the user. I think a
proper EVALUATE statement might be a better solution.

> 5. Because of the way the expression is parsed (looking for closing
> parenth), this will choke if you try and put a function in there. Would
> it be better to use curly braces '{expr}' or another character to mark
> the expression?

How much thought went into choosing parentheses? (i.e. is a similar
syntax used in the procedural languages in other DBs?)

In any case, I don't think switching switching to braces is the right
solution to the lexing problem. I _believe_ the patch is OK as is --
plpgsql_read_expression() keeps track of parenthesis nesting so it
should be able to figure out when it sees an "unmatched" parenthesis
(indicating the end of the expression).

BTW, attached is a trivial incremental patch that fixes a few compile
warnings (which are indicative of actual problems) in your patch.

-Neil

Attachment

record_dot_improv.patch

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

22 November 2004, 10:07:16

Hi Neil,

Thanks for the comments. I've actually got (yet) another version ready
to go, which fixes the compile warnings and adds some sanity checks.
I'll post it as soon as I've got beta5 downloaded and tried out :)

> FYI, one thing I want to implement is an EVALUATE statement in plpgsql
> (analogous to eval() in Perl, for example). If I understand your use
> case, I think this will help somewhat, although of course it is still
> clumsier than direct syntactic support.

This would execute a string and pass back the result? I'm sure I'll find
a use for it at some point :)

> >   rec.('foo')
> 
> I don't like this: it implicitly coerces a string literal into an
> identifier (i.e. a column name). Treating data as code can be useful,
> but I think we need to make it more obvious to the user. I think a
> proper EVALUATE statement might be a better solution.

See your point. But what about NEW.($1)?

> > 5. Because of the way the expression is parsed (looking for closing
> > parenth), this will choke if you try and put a function in there. Would
> > it be better to use curly braces '{expr}' or another character to mark
> > the expression?
> 
> How much thought went into choosing parentheses? (i.e. is a similar
> syntax used in the procedural languages in other DBs?)

Only used them because of the small note I saw on the developer's TODO
about accessing cols by ordinal - the example there was rec.(1). But I
was wrong about functions not working there - plpgsql_read_expression()
is smarter than that, as you say.

OK, download is done. I've got some more general ideas which relate to
this. I'll post along with updated version.

Matt

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

22 November 2004, 13:42:59

Hi,

Updated patch (now against beta5) attached. It now pfree's any converted
strings, avoids pointlessly casting an int4oid to int4oid, complies to
CS (uses tabs, not spaces) and works with label.record.(expression) too.
I'm still testing, it now does what I set out to achieve.

I haven't done anything to implement the same thing for row types. I'm
wondering if there's any point. The person writing the function is
implicitly going to know the structure of a row; the only use I can
foresee is iterating over the columns by ordinal, but that isn't
possible (at present, see below).

Neil's raised a coupe of good points:
  - is the parenthesis format the right way to go? How do other sp
languages do it?
  - if we have a more genaralised EVALUATE statement, do we still need
this?

I'm not at all fixated on the parenthesis idea. If there's a better way,
great. But if this stands a chance of making it in to the language one
day, it'd be good to know so I can start building stuff that uses the
format.

One thought I had about the format would be to use a kind of
javascript-ish syntax:
    rec.item(expr) - as above
This might later lend itself to other record-specific functions:
    rec.length() - returns number of columns
    rec.getColType(expr) - returns oid of given column
    rec.getColNumber(expr) - returns ordinal of given column
I can't see a crying need for any of these, mind... but length() would
be useful _if_ you could iterate over columns, so:

Loops:

The issue with loops is a nuisance - if the underlying column type
changes, my stuff throws a data type mismatch exception. I imagine this
could affect an EVALUATE expression inside a loop, too (or would their
plans never be saved, like EXECUTE?).

I might be totally wrong, but here's what I see:

To stop the mismatch you'd have to stop exec_eval_expr() saving the plan
for that expression. A flag in the expression struct could do that,
though you'd want to add a check that the function was defined as
volatile before allowing it.

The flag could be set during the function's compilation stage, but then
all functions using the rec.(expr) format anywhere would have to be
volatile.

A better approach would be to walk through the function's statement tree
just before execution, checking for troublesome expressions appearing
inside loops (and whose arguments are reassigned inside the loop?) and
marking them as dynamic.

Does that make any sense? Is it worth the work? Or should we just tell
anyone who actually needs it (I don't, at present) 'use another PL'?

Regards,

Matt
diff -u src.orig/pl_comp.c src/pl_comp.c
--- src.orig/pl_comp.c    2004-11-22 10:20:24.000000000 +0000
+++ src/pl_comp.c    2004-11-22 10:58:42.465929360 +0000
@@ -783,6 +783,72 @@
     return T_WORD;
 }

+/* ----------
+ * plpgsql_parse_wordexpr        Same lookup for word followed by dot.
+ *                Should only get here if it wasn't followed by
+ *                an identifier.
+ * ----------
+ */
+int
+plpgsql_parse_wordexpr(char *word)
+{
+    PLpgSQL_nsitem *ns;
+    char       *cp[1];
+    int            save_spacescanned = plpgsql_SpaceScanned;
+
+    /* Do case conversion and word separation */
+    /* drop dot to keep converter happy */
+    word[strlen(word) - 1] = '\0';
+    plpgsql_convert_ident(word, cp, 1);
+
+    /*
+     * Do a lookup on the compilers namestack
+     */
+    ns = plpgsql_ns_lookup(cp[0], NULL);
+    if (ns == NULL)
+    {
+        pfree(cp[0]);
+        return T_ERROR;
+    }
+    switch (ns->itemtype)
+    {
+        case PLPGSQL_NSTYPE_REC:
+            {
+                /*
+                 * First word is a record name, so the parenthesised expr that
+                 * follows it defines the field.
+                 */
+                PLpgSQL_recfield *new;
+
+                new = malloc(sizeof(PLpgSQL_recfield));
+                memset(new, 0, sizeof(PLpgSQL_recfield));
+
+                if (plpgsql_yylex() != '(')
+                    plpgsql_yyerror("expected identifier or \"(\"");
+                new->expr = plpgsql_read_expression(')', ")");
+                new->recparentno = ns->itemno;
+                new->isExpr = true;
+                new->dtype = PLPGSQL_DTYPE_RECFIELD;
+
+                plpgsql_adddatum((PLpgSQL_datum *) new);
+
+                plpgsql_yylval.scalar = (PLpgSQL_datum *) new;
+
+                plpgsql_SpaceScanned = save_spacescanned;
+
+                pfree(cp[0]);
+                return T_SCALAR;
+            }
+
+        /* TODO: deal with rows, too? */
+
+        default:
+            break;
+
+    }
+    pfree(cp[0]);
+    return T_ERROR;
+}

 /* ----------
  * plpgsql_parse_dblword        Same lookup for two words
@@ -855,6 +921,7 @@
                 new->dtype = PLPGSQL_DTYPE_RECFIELD;
                 new->fieldname = strdup(cp[1]);
                 new->recparentno = ns->itemno;
+                new->isExpr = false;

                 plpgsql_adddatum((PLpgSQL_datum *) new);

@@ -901,6 +968,89 @@
     return T_ERROR;
 }

+/* ----------
+ * plpgsql_parse_dblwordexpr        Same lookup for two words
+ *                    separated by a dot followed by a lone dot
+ * ----------
+ */
+int
+plpgsql_parse_dblwordexpr(char *word)
+{
+    PLpgSQL_nsitem *ns;
+    char       *cp[2];
+    int            save_spacescanned = plpgsql_SpaceScanned;
+
+    /* Do case conversion and word separation */
+    /* drop dot to keep converter happy */
+    word[strlen(word) - 1] = '\0';
+    plpgsql_convert_ident(word, cp, 2);
+
+    /*
+     * Lookup the first word
+     */
+    ns = plpgsql_ns_lookup(cp[0], NULL);
+    if (ns == NULL)
+    {
+        pfree(cp[0]);
+        pfree(cp[1]);
+        return T_ERROR;
+    }
+
+    switch (ns->itemtype)
+    {
+        case PLPGSQL_NSTYPE_LABEL:
+
+            /*
+             * First word is a label, second should be a record
+             */
+            ns = plpgsql_ns_lookup(cp[1], cp[0]);
+            pfree(cp[0]);
+            pfree(cp[1]);
+            if (ns == NULL)
+                return T_ERROR;
+            switch (ns->itemtype)
+            {
+                case PLPGSQL_NSTYPE_REC:
+                    {
+                    /*
+                     * First word is a record name, so parenthesised expr
+                     * following refers to field.
+                     */
+                    PLpgSQL_recfield *new;
+
+                    new = malloc(sizeof(PLpgSQL_recfield));
+                    memset(new, 0, sizeof(PLpgSQL_recfield));
+
+                    if (plpgsql_yylex() != '(')
+                        plpgsql_yyerror("expected identifier or \"(\"");
+                    new->expr = plpgsql_read_expression(')', ")");
+                    new->recparentno = ns->itemno;
+                    new->isExpr = true;
+                    new->dtype = PLPGSQL_DTYPE_RECFIELD;
+
+                    plpgsql_adddatum((PLpgSQL_datum *) new);
+
+                    plpgsql_yylval.scalar = (PLpgSQL_datum *) new;
+
+                    plpgsql_SpaceScanned = save_spacescanned;
+
+                    return T_SCALAR;
+                    }
+                default:
+                    return T_ERROR;
+            }
+            break;
+
+
+        default:
+            break;
+    }
+
+    pfree(cp[0]);
+    pfree(cp[1]);
+    return T_ERROR;
+}
+

 /* ----------
  * plpgsql_parse_tripword        Same lookup for three words
@@ -961,6 +1111,7 @@
                 new->dtype = PLPGSQL_DTYPE_RECFIELD;
                 new->fieldname = strdup(cp[2]);
                 new->recparentno = ns->itemno;
+                new->isExpr = false;

                 plpgsql_adddatum((PLpgSQL_datum *) new);

diff -u src.orig/pl_exec.c src/pl_exec.c
--- src.orig/pl_exec.c    2004-11-22 10:20:24.000000000 +0000
+++ src/pl_exec.c    2004-11-22 10:58:42.684896097 +0000
@@ -2956,6 +2956,8 @@
                 PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) target;
                 PLpgSQL_rec *rec;
                 int            fno;
+                char        *fname;
+                bool        mustfreefname = false;
                 HeapTuple    newtup;
                 int            natts;
                 int            i;
@@ -2980,20 +2982,78 @@
                               rec->refname),
                        errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));

+
+                /* Have we got an expr to deal with? */
+                fno = 0;
+                fname = recfield->fieldname;
+                if (recfield->isExpr)
+                {
+                    Datum        exprdatum;
+                    Oid            exprtype;
+                    bool        exprisnull;
+
+                    /* Evaluate expression */
+                    exprdatum = exec_eval_expr(estate, recfield->expr,
+                                    &exprisnull, &exprtype);
+
+                    /* If we've got an integer, it's a field number, otherwise
+                     * it's a fieldname
+                     */
+                    if (exprtype == INT4OID) {
+                        fno = DatumGetInt32(exprdatum);
+                        if (fno < 1) {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to a number less than 1")));
+                        }
+                    }
+                    else {
+                        fname = convert_value_to_string(
+                                                exprdatum, exprtype);
+                        mustfreefname = true;
+                        if (*fname == '\0') {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to an empty string")));
+                        }
+                    }
+                }
+
                 /*
                  * Get the number of the records field to change and the
-                 * number of attributes in the tuple.
+                 * number of attributes in the tuple, if we didn't get it
+                 * above.
                  */
-                fno = SPI_fnumber(rec->tupdesc, recfield->fieldname);
-                if (fno == SPI_ERROR_NOATTRIBUTE)
-                    ereport(ERROR,
-                            (errcode(ERRCODE_UNDEFINED_COLUMN),
-                             errmsg("record \"%s\" has no field \"%s\"",
-                                    rec->refname, recfield->fieldname)));
+                if (fno < 1)
+                {
+                    fno = SPI_fnumber(rec->tupdesc, fname);
+                    if (fno == SPI_ERROR_NOATTRIBUTE)
+                        ereport(ERROR,
+                                (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                 errmsg("record \"%s\" has no field \"%s\"",
+                                        rec->refname, fname)));
+                }
+
+                /* Free fname, if required */
+                if (mustfreefname) {
+                    pfree(fname);
+                }
+
                 fno--;
                 natts = rec->tupdesc->natts;

                 /*
+                 * Check that fno is within range
+                 */
+                if (fno >= natts) {
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_COLUMN),
+                             errmsg("record \"%s\" has no field number \"%d\"",
+                                    rec->refname, fno)));
+
+                }
+
+                /*
                  * Set up values/datums arrays for heap_formtuple.    For
                  * all the attributes except the one we want to replace,
                  * use the value that's in the old tuple.
@@ -3288,6 +3348,8 @@
                 PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
                 PLpgSQL_rec *rec;
                 int            fno;
+                char        *fname;
+                bool        mustfreefname = false;

                 rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
                 if (!HeapTupleIsValid(rec->tup))
@@ -3296,19 +3358,76 @@
                        errmsg("record \"%s\" is not assigned yet",
                               rec->refname),
                        errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));
-                fno = SPI_fnumber(rec->tupdesc, recfield->fieldname);
-                if (fno == SPI_ERROR_NOATTRIBUTE)
-                    ereport(ERROR,
-                            (errcode(ERRCODE_UNDEFINED_COLUMN),
-                             errmsg("record \"%s\" has no field \"%s\"",
-                                    rec->refname, recfield->fieldname)));
+
+                /* Have we got an expr to deal with? */
+                fno = 0;
+                fname = recfield->fieldname;
+                if (recfield->isExpr)
+                {
+                    Datum        exprdatum;
+                    Oid            exprtype;
+                    bool        exprisnull;
+
+                    /* Evaluate expression */
+                    exprdatum = exec_eval_expr(estate, recfield->expr,
+                                    &exprisnull, &exprtype);
+
+
+                    /* If we've got an integer, it's a field number, otherwise
+                     * it's a fieldname
+                     */
+                    if (exprtype == INT4OID) {
+                        fno = DatumGetInt32(exprdatum);
+                        if (fno < 1) {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to a number less than 1")));
+                        }
+                    }
+                    else {
+                        fname = convert_value_to_string(
+                                                exprdatum, exprtype);
+                        mustfreefname = true;
+                        if (*fname == '\0') {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to an empty string")));
+                        }
+                    }
+                }
+
+                /*
+                 * Get the number of the records field to change and the
+                 * number of attributes in the tuple.
+                 */
+                if (fno < 1)
+                {
+                    fno = SPI_fnumber(rec->tupdesc, fname);
+                    if (fno == SPI_ERROR_NOATTRIBUTE)
+                        ereport(ERROR,
+                                (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                 errmsg("record \"%s\" has no field \"%s\"",
+                                        rec->refname, fname)));
+                }
+
+                /* Free up fname if required */
+                if (mustfreefname) {
+                    pfree(fname);
+                }
+
                 *typeid = SPI_gettypeid(rec->tupdesc, fno);
                 *value = SPI_getbinval(rec->tup, rec->tupdesc, fno, isnull);
+                if (*value == SPI_ERROR_NOATTRIBUTE)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_COLUMN),
+                             errmsg("record \"%s\" has no field number \"%d\"",
+                                    rec->refname, fno)));
+
                 if (expectedtypeid != InvalidOid && expectedtypeid != *typeid)
                     ereport(ERROR,
                             (errcode(ERRCODE_DATATYPE_MISMATCH),
                              errmsg("type of \"%s.%s\" does not match that when preparing the plan",
-                                    rec->refname, recfield->fieldname)));
+                                    rec->refname, fname)));
                 break;
             }

Only in src: pl_exec.c.orig
diff -u src.orig/plpgsql.h src/plpgsql.h
--- src.orig/plpgsql.h    2004-11-22 10:20:24.000000000 +0000
+++ src/plpgsql.h    2004-11-22 10:58:42.737888047 +0000
@@ -269,6 +269,8 @@
     int            dtype;
     int            rfno;
     char       *fieldname;
+    bool        isExpr;            /* Use expression for field name/number */
+    PLpgSQL_expr *expr;
     int            recparentno;    /* dno of parent record */
 } PLpgSQL_recfield;

@@ -675,7 +677,9 @@
 extern PLpgSQL_function *plpgsql_compile(FunctionCallInfo fcinfo,
                 bool forValidator);
 extern int    plpgsql_parse_word(char *word);
+extern int    plpgsql_parse_wordexpr(char *word);
 extern int    plpgsql_parse_dblword(char *word);
+extern int    plpgsql_parse_dblwordexpr(char *word);
 extern int    plpgsql_parse_tripword(char *word);
 extern int    plpgsql_parse_wordtype(char *word);
 extern int    plpgsql_parse_dblwordtype(char *word);
diff -u src.orig/scan.l src/scan.l
--- src.orig/scan.l    2004-11-22 10:20:24.000000000 +0000
+++ src/scan.l    2004-11-22 10:58:42.781881363 +0000
@@ -195,9 +195,15 @@
 {identifier}                    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_word(yytext); }
+{identifier}{space}*\.        {
+    plpgsql_error_lineno = plpgsql_scanner_lineno();
+    return plpgsql_parse_wordexpr(yytext); }
 {identifier}{space}*\.{space}*{identifier}    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_dblword(yytext); }
+{identifier}{space}*\.{space}*{identifier}*\.        {
+    plpgsql_error_lineno = plpgsql_scanner_lineno();
+    return plpgsql_parse_dblwordexpr(yytext); }
 {identifier}{space}*\.{space}*{identifier}{space}*\.{space}*{identifier}    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_tripword(yytext); }

Re: patch: plpgsql - access records with rec.(expr)

From

Tom Lane

Date:

22 November 2004, 15:58:29

Neil Conway <neilc@samurai.com> writes:
> FYI, one thing I want to implement is an EVALUATE statement in plpgsql
> (analogous to eval() in Perl, for example).

I'm confused.  How/why is this different from EXECUTE?
        regards, tom lane

Re: patch: plpgsql - access records with rec.(expr)

From

Tom Lane

Date:

22 November 2004, 16:20:38

Matt <matt@kynx.org> writes:
> Does that make any sense? Is it worth the work? Or should we just tell
> anyone who actually needs it (I don't, at present) 'use another PL'?

I don't really see this going anywhere --- it's contorting the semantics
of plpgsql too much for too little gain.  The typical use-case I've
heard of for this sort of thing is "I want to write a generic trigger,
so I need to iterate over the columns of NEW.* without knowing their
names or data types in advance".  Your proposal doesn't address the
issue of how the function would find out the column names in order to
make use of the proposed notation; and as you noted there's still a
serious problem with varying datatypes.

Either plperl or pltcl (probably also plpython but I'm not familiar
with that language) is a better choice for writing such triggers,
because those languages already have answers for all these issues.
        regards, tom lane

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

22 November 2004, 17:31:48

Hi Tom,

> > Does that make any sense? Is it worth the work? Or should we just tell
> > anyone who actually needs it (I don't, at present) 'use another PL'?
> 
> I don't really see this going anywhere --- it's contorting the semantics
> of plpgsql too much for too little gain.  

Yup, the last bit was definitely contortionist! Prob should have split
it into two separate posts.

What I'm most looking for comments on what's actually in the patch: with
it I can use NEW.($1). _That_ is a big gain.

On the rest...

> Your proposal doesn't address the
> issue of how the function would find out the column names in order to
> make use of the proposed notation;

It doesn't need them. The posted patch already allows:

for i in 1..3 loop myvar := rec.(i);
end loop;

...so long as all cols have the same datatype (note that col numbers are
1-based, like in the spi funcs). 

>  and as you noted there's still a
> serious problem with varying datatypes.

My contortions were an attempt to think my way around the varying
datatypes problem. Obviously my thinking got tied in knots ;) I agree,
that part is too much work for too little gain.

> Either plperl or pltcl (probably also plpython but I'm not familiar
> with that language) is a better choice for writing such triggers,
> because those languages already have answers for all these issues.

I didn't think plperl could be used to write triggers. Is this new in 8?

As an application developer, though, it'd be nice to keep everything in
one pl language, and ideally one guaranteed available (or easily
installable without other dependencies) on every postgres db out there. 

That's pretty much why I wrote the patch: to add the one missing feature
I need to write my generic triggers in plpgsql.

Regards,

Matt

Re: patch: plpgsql - access records with rec.(expr)

From

Neil Conway

Date:

22 November 2004, 23:23:04

On Mon, 2004-11-22 at 10:57 -0500, Tom Lane wrote:
> I'm confused.  How/why is this different from EXECUTE?

EVALUATE would take a string and evaluate it as a PL/PgSQL statement;
EXECUTE takes a string and executes it as a SQL statement. We've
discussed this before (although I may not have called it "EVALUATE" at
the time):

http://archives.postgresql.org/pgsql-bugs/2004-10/msg00005.php

-Neil

Re: patch: plpgsql - access records with rec.(expr)

From

Neil Conway

Date:

23 November 2004, 00:08:34

On Mon, 2004-11-22 at 10:06 +0000, Matt wrote:
> This would execute a string and pass back the result?

It would evaluate a string as a PL/PgSQL statement (which means you
could construct any PL/PgSQL statement dynamically, including access to
fields of a RECORD determined at runtime).

> > I don't like this: it implicitly coerces a string literal into an
> > identifier (i.e. a column name). Treating data as code can be useful,
> > but I think we need to make it more obvious to the user. I think a
> > proper EVALUATE statement might be a better solution.
> 
> See your point. But what about NEW.($1)?

I don't follow -- what do you mean?

(BTW, I think my comment also applies to variables of type "text" and
similar -- I think the patch would be a lot simpler if you just
implement access to record fields by ordinal position, and don't
implement access by field name.)

-Neil

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

23 November 2004, 09:03:36

> > See your point. But what about NEW.($1)?
> 
> I don't follow -- what do you mean?

I want to be able to be able to write a trigger function that accesses a
column passed as an argument to the function in the row that caused the
trigger. This is my use case.

I guess that would actually written NEW.(TG_ARGV[1]).

> (BTW, I think my comment also applies to variables of type "text" and
> similar -- I think the patch would be a lot simpler if you just
> implement access to record fields by ordinal position, and don't
> implement access by field name.)

Yes, it would be marginally simpler: I'd still have to call
exec_eval_datum() on the variable and check whether it could be
evaluated to an integer (trigger args are all text AFAIK). The only
difference would be throwing an error if it wasn't, instead of making
use of the value... and a slightly less readable 'create trigger'
statement.

It would be a good idea to check that the variable was either a constant
or a trigger arg. This would stop the looping problem, since the type of
the underlying field couldn't change.

But I've somehow got the feeling that this sort of thing isn't the
issue. The issue is whether we want to allow dynamic access to columns
in any syntax at all. A simple yes or no would do :)

Matt

BTW: here's the original post adding the rec.(3) syntax to the TODO:
http://archives.postgresql.org/pgsql-hackers/2003-07/msg00425.php
here's someone else who tried something very similar:
http://archives.postgresql.org/pgsql-hackers/2003-09/msg00533.php

Re: patch: plpgsql - access records with rec.(expr)

From

Mike Rylander

Date:

23 November 2004, 12:59:53

Just to put in my .02$, I would absolutely love to see this
functionality included in plpgsql.  With some extra error checking for
the know changing datatype failure, and docs that mention that
limitation, I'd say this is a great extension to the language. 
plpgsql feels quicker than the interpreted PLs and it's far easier
than C to work with for writing triggers, so this patch makes plpgsql
a much more attractive target for general purpose stored procs.  And
my gut feeling is that an EVALUATE statement would be significantly
slower than this.

In any case, thanks for the great work, Matt.  Please, CORE, include this one!

As an alternative, what would be the possibility of creating a new PL
as a contrib module, say PLPGSQL_NG, to move forward with extensions
like this and perhaps EVALUATE?

-- 
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer

On 23 Nov 2004 09:03:10 +0000, Matt <matt@kynx.org> wrote:
> > > See your point. But what about NEW.($1)?
> >
> > I don't follow -- what do you mean?
> 
> I want to be able to be able to write a trigger function that accesses a
> column passed as an argument to the function in the row that caused the
> trigger. This is my use case.
> 
> I guess that would actually written NEW.(TG_ARGV[1]).
> 
> > (BTW, I think my comment also applies to variables of type "text" and
> > similar -- I think the patch would be a lot simpler if you just
> > implement access to record fields by ordinal position, and don't
> > implement access by field name.)
> 
> Yes, it would be marginally simpler: I'd still have to call
> exec_eval_datum() on the variable and check whether it could be
> evaluated to an integer (trigger args are all text AFAIK). The only
> difference would be throwing an error if it wasn't, instead of making
> use of the value... and a slightly less readable 'create trigger'
> statement.
> 
> It would be a good idea to check that the variable was either a constant
> or a trigger arg. This would stop the looping problem, since the type of
> the underlying field couldn't change.
> 
> But I've somehow got the feeling that this sort of thing isn't the
> issue. The issue is whether we want to allow dynamic access to columns
> in any syntax at all. A simple yes or no would do :)
> 
> Matt
> 
> BTW: here's the original post adding the rec.(3) syntax to the TODO:
> http://archives.postgresql.org/pgsql-hackers/2003-07/msg00425.php
> here's someone else who tried something very similar:
> http://archives.postgresql.org/pgsql-hackers/2003-09/msg00533.php
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

Re: patch: plpgsql - access records with rec.(expr)

From

Tom Lane

Date:

23 November 2004, 15:34:12

Matt <matt@kynx.org> writes:
> It would be a good idea to check that the variable was either a constant
> or a trigger arg. This would stop the looping problem, since the type of
> the underlying field couldn't change.

What aboutfor i in ...    ... new.(tg_argv[i]) ...

> But I've somehow got the feeling that this sort of thing isn't the
> issue. The issue is whether we want to allow dynamic access to columns
> in any syntax at all. A simple yes or no would do :)

MHO: this is a really ugly wart on the language, and it does not solve
the problems people would want to solve.  It might solve *your* problem
but that's not enough to justify a wart of this size.

We do need to do something about the fact that EXECUTE can't access
plpgsql variables, though I'm afraid that fixing that is going to
require a rather complete overhaul of plpgsql :-(.  But it needs one
anyway.
        regards, tom lane

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

23 November 2004, 16:58:06

> What about
>     for i in ...
>         ... new.(tg_argv[i]) ...

Ooof! <clutches chest, falls to ground groaning> Constants or digits or
nothing, then <gets up, dusts himself off>

> MHO: this is a really ugly wart on the language, and it does not solve
> the problems people would want to solve.  It might solve *your* problem
> but that's not enough to justify a wart of this size.

<falls down again> But my warts are beautiful!

OK, fair enough. I had to try.

> We do need to do something about the fact that EXECUTE can't access
> plpgsql variables, though I'm afraid that fixing that is going to
> require a rather complete overhaul of plpgsql :-(.  But it needs one
> anyway.

Look forward to seeing it.

Matt

Re: patch: plpgsql - access records with rec.(expr)

From

Neil Conway

Date:

24 November 2004, 10:06:47

Mike Rylander wrote:
> As an alternative, what would be the possibility of creating a new PL
> as a contrib module, say PLPGSQL_NG, to move forward with extensions
> like this and perhaps EVALUATE?

I think the idea of rewriting PL/PgSQL from scratch has merit (and it's 
something that I think would be well worth doing). IMHO it's not really 
worth the trouble to fork the existing code base and add new features to 
something that, hopefully, has a limited life span.

-Neil

Re: patch: plpgsql - access records with rec.(expr)

From

Joachim Wieland

Date:

24 November 2004, 11:15:37

On Tue, Nov 23, 2004 at 10:33:50AM -0500, Tom Lane wrote:
> We do need to do something about the fact that EXECUTE can't access
> plpgsql variables, though I'm afraid that fixing that is going to
> require a rather complete overhaul of plpgsql :-(.  But it needs one
> anyway.

Why do you think that it would be difficult to do it with the existing code?
Actually I wanted to implement this for 8.1. I've already had a quick look
at it. My idea was to allow something like "EXECUTE SELECT INTO var1, var2
col1 col2 FROM ...". The code would have to parse the list of variables and
check if they match pl/pgsql variables, execute the query (without the INTO
stuff of course) via SPI and copy back the results, checking type mismatches
or a mismatch in the number of columns. I haven't thought about more complex
types however. Did I miss something?

Which other limitations are there in the current implementation of pl/pgsql?

Joachim

Re: patch: plpgsql - access records with rec.(expr)

From

Matt

Date:

24 November 2004, 13:12:18

> I think the idea of rewriting PL/PgSQL from scratch has merit (and it's 
> something that I think would be well worth doing). IMHO it's not really 
> worth the trouble to fork the existing code base and add new features to 
> something that, hopefully, has a limited life span.

I dunno, I kind of like the idea.

There's always going to be the age old conflict between people who are
basically users like me, and want to see a needed feature asap, and
developers who want to see it done right. And of course doing it right
is only way to go long term.

But so long as I can be fairly sure the syntax is agreed (EXECUTE SELECT
INTO ... in this case) and eventually will make it into the main code
base, I'd be willing to live on the edge for a while.

There'd have to be big 'experimental, everything may change' warnings
all over the contrib version. My only concern is if it would actually
delay the discussed rewrite of plpgsql by splitting the effort.

That's my two one hundreths of a euro, anyway.

Matt

Re: patch: plpgsql - access records with rec.(expr)

From

Tom Lane

Date:

24 November 2004, 15:08:28

Joachim Wieland <joe@mcknight.de> writes:
> On Tue, Nov 23, 2004 at 10:33:50AM -0500, Tom Lane wrote:
>> We do need to do something about the fact that EXECUTE can't access
>> plpgsql variables, though I'm afraid that fixing that is going to
>> require a rather complete overhaul of plpgsql :-(.  But it needs one
>> anyway.

> Why do you think that it would be difficult to do it with the existing code?

Because the parsing context all gets thrown away, in particular the
namespace stack; so you can't tell what set of variables ought to be
visible to the EXECUTE.

> Which other limitations are there in the current implementation of pl/pgsql?

Its memory management really sucks.  Function parsetrees ought to be put
into private contexts, not malloc'd, so that they can reasonably be
freed when the function is deleted or redefined.  Likewise for the
cached query plans.  Also we need a way to discard cached query plans
when relevant object definitions change (though this is not plpgsql's
fault in particular, it's a generic problem).

Another issue is that we need a better solution for recursive plpgsql
functions.  Function parsetrees have to be read-only (with the
exception of adding cached plans that weren't there before) during
execution, else recursive functions don't work right.  We have a really
ugly and inefficient answer to this, and I'm not even sure that it
covers 100% of the cases (at one time it definitely didn't, but I can't
recall right now if that got fixed completely).  I'm not entirely sure
what a cleaner answer would look like, but I know I don't like it now.
        regards, tom lane