Re: patch: plpgsql - access records with rec.(expr) - Mailing list pgsql-hackers

From Matt
Subject Re: patch: plpgsql - access records with rec.(expr)
Date
Msg-id 1101130540.4229.243.camel@matt.kynx.org
Whole thread Raw
In response to Re: patch: plpgsql - access records with rec.(expr)  (Neil Conway <neilc@samurai.com>)
Responses Re: patch: plpgsql - access records with rec.(expr)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

Updated patch (now against beta5) attached. It now pfree's any converted
strings, avoids pointlessly casting an int4oid to int4oid, complies to
CS (uses tabs, not spaces) and works with label.record.(expression) too.
I'm still testing, it now does what I set out to achieve.

I haven't done anything to implement the same thing for row types. I'm
wondering if there's any point. The person writing the function is
implicitly going to know the structure of a row; the only use I can
foresee is iterating over the columns by ordinal, but that isn't
possible (at present, see below).

Neil's raised a coupe of good points:
  - is the parenthesis format the right way to go? How do other sp
languages do it?
  - if we have a more genaralised EVALUATE statement, do we still need
this?

I'm not at all fixated on the parenthesis idea. If there's a better way,
great. But if this stands a chance of making it in to the language one
day, it'd be good to know so I can start building stuff that uses the
format.

One thought I had about the format would be to use a kind of
javascript-ish syntax:
    rec.item(expr) - as above
This might later lend itself to other record-specific functions:
    rec.length() - returns number of columns
    rec.getColType(expr) - returns oid of given column
    rec.getColNumber(expr) - returns ordinal of given column
I can't see a crying need for any of these, mind... but length() would
be useful _if_ you could iterate over columns, so:

Loops:

The issue with loops is a nuisance - if the underlying column type
changes, my stuff throws a data type mismatch exception. I imagine this
could affect an EVALUATE expression inside a loop, too (or would their
plans never be saved, like EXECUTE?).

I might be totally wrong, but here's what I see:

To stop the mismatch you'd have to stop exec_eval_expr() saving the plan
for that expression. A flag in the expression struct could do that,
though you'd want to add a check that the function was defined as
volatile before allowing it.

The flag could be set during the function's compilation stage, but then
all functions using the rec.(expr) format anywhere would have to be
volatile.

A better approach would be to walk through the function's statement tree
just before execution, checking for troublesome expressions appearing
inside loops (and whose arguments are reassigned inside the loop?) and
marking them as dynamic.

Does that make any sense? Is it worth the work? Or should we just tell
anyone who actually needs it (I don't, at present) 'use another PL'?

Regards,

Matt
diff -u src.orig/pl_comp.c src/pl_comp.c
--- src.orig/pl_comp.c    2004-11-22 10:20:24.000000000 +0000
+++ src/pl_comp.c    2004-11-22 10:58:42.465929360 +0000
@@ -783,6 +783,72 @@
     return T_WORD;
 }

+/* ----------
+ * plpgsql_parse_wordexpr        Same lookup for word followed by dot.
+ *                Should only get here if it wasn't followed by
+ *                an identifier.
+ * ----------
+ */
+int
+plpgsql_parse_wordexpr(char *word)
+{
+    PLpgSQL_nsitem *ns;
+    char       *cp[1];
+    int            save_spacescanned = plpgsql_SpaceScanned;
+
+    /* Do case conversion and word separation */
+    /* drop dot to keep converter happy */
+    word[strlen(word) - 1] = '\0';
+    plpgsql_convert_ident(word, cp, 1);
+
+    /*
+     * Do a lookup on the compilers namestack
+     */
+    ns = plpgsql_ns_lookup(cp[0], NULL);
+    if (ns == NULL)
+    {
+        pfree(cp[0]);
+        return T_ERROR;
+    }
+    switch (ns->itemtype)
+    {
+        case PLPGSQL_NSTYPE_REC:
+            {
+                /*
+                 * First word is a record name, so the parenthesised expr that
+                 * follows it defines the field.
+                 */
+                PLpgSQL_recfield *new;
+
+                new = malloc(sizeof(PLpgSQL_recfield));
+                memset(new, 0, sizeof(PLpgSQL_recfield));
+
+                if (plpgsql_yylex() != '(')
+                    plpgsql_yyerror("expected identifier or \"(\"");
+                new->expr = plpgsql_read_expression(')', ")");
+                new->recparentno = ns->itemno;
+                new->isExpr = true;
+                new->dtype = PLPGSQL_DTYPE_RECFIELD;
+
+                plpgsql_adddatum((PLpgSQL_datum *) new);
+
+                plpgsql_yylval.scalar = (PLpgSQL_datum *) new;
+
+                plpgsql_SpaceScanned = save_spacescanned;
+
+                pfree(cp[0]);
+                return T_SCALAR;
+            }
+
+        /* TODO: deal with rows, too? */
+
+        default:
+            break;
+
+    }
+    pfree(cp[0]);
+    return T_ERROR;
+}

 /* ----------
  * plpgsql_parse_dblword        Same lookup for two words
@@ -855,6 +921,7 @@
                 new->dtype = PLPGSQL_DTYPE_RECFIELD;
                 new->fieldname = strdup(cp[1]);
                 new->recparentno = ns->itemno;
+                new->isExpr = false;

                 plpgsql_adddatum((PLpgSQL_datum *) new);

@@ -901,6 +968,89 @@
     return T_ERROR;
 }

+/* ----------
+ * plpgsql_parse_dblwordexpr        Same lookup for two words
+ *                    separated by a dot followed by a lone dot
+ * ----------
+ */
+int
+plpgsql_parse_dblwordexpr(char *word)
+{
+    PLpgSQL_nsitem *ns;
+    char       *cp[2];
+    int            save_spacescanned = plpgsql_SpaceScanned;
+
+    /* Do case conversion and word separation */
+    /* drop dot to keep converter happy */
+    word[strlen(word) - 1] = '\0';
+    plpgsql_convert_ident(word, cp, 2);
+
+    /*
+     * Lookup the first word
+     */
+    ns = plpgsql_ns_lookup(cp[0], NULL);
+    if (ns == NULL)
+    {
+        pfree(cp[0]);
+        pfree(cp[1]);
+        return T_ERROR;
+    }
+
+    switch (ns->itemtype)
+    {
+        case PLPGSQL_NSTYPE_LABEL:
+
+            /*
+             * First word is a label, second should be a record
+             */
+            ns = plpgsql_ns_lookup(cp[1], cp[0]);
+            pfree(cp[0]);
+            pfree(cp[1]);
+            if (ns == NULL)
+                return T_ERROR;
+            switch (ns->itemtype)
+            {
+                case PLPGSQL_NSTYPE_REC:
+                    {
+                    /*
+                     * First word is a record name, so parenthesised expr
+                     * following refers to field.
+                     */
+                    PLpgSQL_recfield *new;
+
+                    new = malloc(sizeof(PLpgSQL_recfield));
+                    memset(new, 0, sizeof(PLpgSQL_recfield));
+
+                    if (plpgsql_yylex() != '(')
+                        plpgsql_yyerror("expected identifier or \"(\"");
+                    new->expr = plpgsql_read_expression(')', ")");
+                    new->recparentno = ns->itemno;
+                    new->isExpr = true;
+                    new->dtype = PLPGSQL_DTYPE_RECFIELD;
+
+                    plpgsql_adddatum((PLpgSQL_datum *) new);
+
+                    plpgsql_yylval.scalar = (PLpgSQL_datum *) new;
+
+                    plpgsql_SpaceScanned = save_spacescanned;
+
+                    return T_SCALAR;
+                    }
+                default:
+                    return T_ERROR;
+            }
+            break;
+
+
+        default:
+            break;
+    }
+
+    pfree(cp[0]);
+    pfree(cp[1]);
+    return T_ERROR;
+}
+

 /* ----------
  * plpgsql_parse_tripword        Same lookup for three words
@@ -961,6 +1111,7 @@
                 new->dtype = PLPGSQL_DTYPE_RECFIELD;
                 new->fieldname = strdup(cp[2]);
                 new->recparentno = ns->itemno;
+                new->isExpr = false;

                 plpgsql_adddatum((PLpgSQL_datum *) new);

diff -u src.orig/pl_exec.c src/pl_exec.c
--- src.orig/pl_exec.c    2004-11-22 10:20:24.000000000 +0000
+++ src/pl_exec.c    2004-11-22 10:58:42.684896097 +0000
@@ -2956,6 +2956,8 @@
                 PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) target;
                 PLpgSQL_rec *rec;
                 int            fno;
+                char        *fname;
+                bool        mustfreefname = false;
                 HeapTuple    newtup;
                 int            natts;
                 int            i;
@@ -2980,20 +2982,78 @@
                               rec->refname),
                        errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));

+
+                /* Have we got an expr to deal with? */
+                fno = 0;
+                fname = recfield->fieldname;
+                if (recfield->isExpr)
+                {
+                    Datum        exprdatum;
+                    Oid            exprtype;
+                    bool        exprisnull;
+
+                    /* Evaluate expression */
+                    exprdatum = exec_eval_expr(estate, recfield->expr,
+                                    &exprisnull, &exprtype);
+
+                    /* If we've got an integer, it's a field number, otherwise
+                     * it's a fieldname
+                     */
+                    if (exprtype == INT4OID) {
+                        fno = DatumGetInt32(exprdatum);
+                        if (fno < 1) {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to a number less than 1")));
+                        }
+                    }
+                    else {
+                        fname = convert_value_to_string(
+                                                exprdatum, exprtype);
+                        mustfreefname = true;
+                        if (*fname == '\0') {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to an empty string")));
+                        }
+                    }
+                }
+
                 /*
                  * Get the number of the records field to change and the
-                 * number of attributes in the tuple.
+                 * number of attributes in the tuple, if we didn't get it
+                 * above.
                  */
-                fno = SPI_fnumber(rec->tupdesc, recfield->fieldname);
-                if (fno == SPI_ERROR_NOATTRIBUTE)
-                    ereport(ERROR,
-                            (errcode(ERRCODE_UNDEFINED_COLUMN),
-                             errmsg("record \"%s\" has no field \"%s\"",
-                                    rec->refname, recfield->fieldname)));
+                if (fno < 1)
+                {
+                    fno = SPI_fnumber(rec->tupdesc, fname);
+                    if (fno == SPI_ERROR_NOATTRIBUTE)
+                        ereport(ERROR,
+                                (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                 errmsg("record \"%s\" has no field \"%s\"",
+                                        rec->refname, fname)));
+                }
+
+                /* Free fname, if required */
+                if (mustfreefname) {
+                    pfree(fname);
+                }
+
                 fno--;
                 natts = rec->tupdesc->natts;

                 /*
+                 * Check that fno is within range
+                 */
+                if (fno >= natts) {
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_COLUMN),
+                             errmsg("record \"%s\" has no field number \"%d\"",
+                                    rec->refname, fno)));
+
+                }
+
+                /*
                  * Set up values/datums arrays for heap_formtuple.    For
                  * all the attributes except the one we want to replace,
                  * use the value that's in the old tuple.
@@ -3288,6 +3348,8 @@
                 PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum;
                 PLpgSQL_rec *rec;
                 int            fno;
+                char        *fname;
+                bool        mustfreefname = false;

                 rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);
                 if (!HeapTupleIsValid(rec->tup))
@@ -3296,19 +3358,76 @@
                        errmsg("record \"%s\" is not assigned yet",
                               rec->refname),
                        errdetail("The tuple structure of a not-yet-assigned record is indeterminate.")));
-                fno = SPI_fnumber(rec->tupdesc, recfield->fieldname);
-                if (fno == SPI_ERROR_NOATTRIBUTE)
-                    ereport(ERROR,
-                            (errcode(ERRCODE_UNDEFINED_COLUMN),
-                             errmsg("record \"%s\" has no field \"%s\"",
-                                    rec->refname, recfield->fieldname)));
+
+                /* Have we got an expr to deal with? */
+                fno = 0;
+                fname = recfield->fieldname;
+                if (recfield->isExpr)
+                {
+                    Datum        exprdatum;
+                    Oid            exprtype;
+                    bool        exprisnull;
+
+                    /* Evaluate expression */
+                    exprdatum = exec_eval_expr(estate, recfield->expr,
+                                    &exprisnull, &exprtype);
+
+
+                    /* If we've got an integer, it's a field number, otherwise
+                     * it's a fieldname
+                     */
+                    if (exprtype == INT4OID) {
+                        fno = DatumGetInt32(exprdatum);
+                        if (fno < 1) {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to a number less than 1")));
+                        }
+                    }
+                    else {
+                        fname = convert_value_to_string(
+                                                exprdatum, exprtype);
+                        mustfreefname = true;
+                        if (*fname == '\0') {
+                            ereport(ERROR,
+                                    (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                     errmsg("field expression evaluates to an empty string")));
+                        }
+                    }
+                }
+
+                /*
+                 * Get the number of the records field to change and the
+                 * number of attributes in the tuple.
+                 */
+                if (fno < 1)
+                {
+                    fno = SPI_fnumber(rec->tupdesc, fname);
+                    if (fno == SPI_ERROR_NOATTRIBUTE)
+                        ereport(ERROR,
+                                (errcode(ERRCODE_UNDEFINED_COLUMN),
+                                 errmsg("record \"%s\" has no field \"%s\"",
+                                        rec->refname, fname)));
+                }
+
+                /* Free up fname if required */
+                if (mustfreefname) {
+                    pfree(fname);
+                }
+
                 *typeid = SPI_gettypeid(rec->tupdesc, fno);
                 *value = SPI_getbinval(rec->tup, rec->tupdesc, fno, isnull);
+                if (*value == SPI_ERROR_NOATTRIBUTE)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_COLUMN),
+                             errmsg("record \"%s\" has no field number \"%d\"",
+                                    rec->refname, fno)));
+
                 if (expectedtypeid != InvalidOid && expectedtypeid != *typeid)
                     ereport(ERROR,
                             (errcode(ERRCODE_DATATYPE_MISMATCH),
                              errmsg("type of \"%s.%s\" does not match that when preparing the plan",
-                                    rec->refname, recfield->fieldname)));
+                                    rec->refname, fname)));
                 break;
             }

Only in src: pl_exec.c.orig
diff -u src.orig/plpgsql.h src/plpgsql.h
--- src.orig/plpgsql.h    2004-11-22 10:20:24.000000000 +0000
+++ src/plpgsql.h    2004-11-22 10:58:42.737888047 +0000
@@ -269,6 +269,8 @@
     int            dtype;
     int            rfno;
     char       *fieldname;
+    bool        isExpr;            /* Use expression for field name/number */
+    PLpgSQL_expr *expr;
     int            recparentno;    /* dno of parent record */
 } PLpgSQL_recfield;

@@ -675,7 +677,9 @@
 extern PLpgSQL_function *plpgsql_compile(FunctionCallInfo fcinfo,
                 bool forValidator);
 extern int    plpgsql_parse_word(char *word);
+extern int    plpgsql_parse_wordexpr(char *word);
 extern int    plpgsql_parse_dblword(char *word);
+extern int    plpgsql_parse_dblwordexpr(char *word);
 extern int    plpgsql_parse_tripword(char *word);
 extern int    plpgsql_parse_wordtype(char *word);
 extern int    plpgsql_parse_dblwordtype(char *word);
diff -u src.orig/scan.l src/scan.l
--- src.orig/scan.l    2004-11-22 10:20:24.000000000 +0000
+++ src/scan.l    2004-11-22 10:58:42.781881363 +0000
@@ -195,9 +195,15 @@
 {identifier}                    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_word(yytext); }
+{identifier}{space}*\.        {
+    plpgsql_error_lineno = plpgsql_scanner_lineno();
+    return plpgsql_parse_wordexpr(yytext); }
 {identifier}{space}*\.{space}*{identifier}    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_dblword(yytext); }
+{identifier}{space}*\.{space}*{identifier}*\.        {
+    plpgsql_error_lineno = plpgsql_scanner_lineno();
+    return plpgsql_parse_dblwordexpr(yytext); }
 {identifier}{space}*\.{space}*{identifier}{space}*\.{space}*{identifier}    {
     plpgsql_error_lineno = plpgsql_scanner_lineno();
     return plpgsql_parse_tripword(yytext); }

pgsql-hackers by date:

Previous
From: Thomas Hallgren
Date:
Subject: Re: Preventing some SQL commands
Next
From: Robert Treat
Date:
Subject: Re: Test database for new installs?