Thread: patch: plpgsql - access records with rec.(expr)
Hi, I got extremely frustrated with having to create a temp table every time I wanted to access an arbitrary column from a record plpgsql. After seeing a note on the developers TODO about accessing plpgsql records with a 'dot bracket' notation I started digging into the plpgsql source. My diff (against 8beta4) is attached. Warning: I Am Not a C Programmer! I haven't even written a hello world in C before, and I knew nothing about Flex before yesterday. It was fun figuring stuff out, I'm amazed it mostly works, but I'm really hoping someone can point out my mistakes. Goal: Enable users to access fields in record variables using the following syntax like the following: rec.(1) rec.('foo') rec.(myvar::int) rec.(myvar || '_id') Files changed: plpgsql.h - added 'expr' member to PLpgSQL_recfield type for holding the PLpgSQL_expr structure. scan.l - added match for {identifier}{space}*\. AFAIK this should only match if a longer expression doesn't? pl_comp.c - added plpgsql_parse_wordexpr() function called by above match. Ripped off code from plpgsql_parse_word that deals with arg_v[expr] to find our expression. Prob a dumb name for the function! pl_exec.c - hacked exec_assign_value() and exec_eval_datum() to use the expression to get the field name/number. Stuff I need help with: 1. It should recognise OLD.(1) as a field number, not a column name. I think I've got to check the returned type from exec_eval_expr() then exec_simple_cast_value() on it, but that seems beyond me. 2. Freeing stuff. As I explained, this is all pretty new to me, and the comments about it around exec_eval_expr() et al just confused me :( Please give some hints about what needs freeing! 3. Would probably be good to add check in pl_comp.c to see if the expression actually needs to be evaluated at runtime (ie isn't just a field name/number). How? 4. Make this also work for row.(expr), label.record.(expr) and label.row.(expr) - but want to get the basics working first! 5. Because of the way the expression is parsed (looking for closing parenth), this will choke if you try and put a function in there. Would it be better to use curly braces '{expr}' or another character to mark the expression? I hope at this eventually leads to some really useful extra functionality, particularly for writing generic triggers. And it's a tribute to the existing code that a complete newbie can cut-and-paste their way to a halfarsed solution in a (rather long) night! Regards, Matt
Attachment
> 5. Because of the way the expression is parsed (looking for closing > parenth), this will choke if you try and put a function in there. Would > it be better to use curly braces '{expr}' or another character to mark > the expression? I lie! pgpgsql_read_expression() is smarter than that! However, I do have another problem. If the value of the expr changes inside a loop to a fieldname of a different type, it dies with the "type of \"%s\" does not match that when preparing the plan" message, which is quite true: it obviously doesn't. Just setting expectedtypeoid to InvalidOid bombs the whole thing :( Hrm.... the "best made plans" and all that... Matt
Hi all, I've cleaned this up a bit so that assigning to a dynamic record field now works - ie NEW.(myvar) := 'someval' - and accessing a field by number works - ie myvar := OLD.(1) It still doesn't work in a loop if the type of field referenced by the expression changes - looking at it more I'm really not sure this is possible, but that's a limitation I could live with. I'll also try figuring out freeing stuff after casting values over the weekend. But is this a worthwhile approach? If there are objections, please let me know! For myself, being able to pass a column name as an argument to a trigger would make writing generic triggers a whole lot easier. And going back through the archives on this list I see I'm not alone. Regards, Matt On Thu, 2004-11-18 at 13:18, Matt wrote: > Hi, > > I got extremely frustrated with having to create a temp table every time > I wanted to access an arbitrary column from a record plpgsql. After > seeing a note on the developers TODO about accessing plpgsql records > with a 'dot bracket' notation I started digging into the plpgsql source. > > My diff (against 8beta4) is attached. > > Warning: I Am Not a C Programmer! I haven't even written a hello world > in C before, and I knew nothing about Flex before yesterday. It was fun > figuring stuff out, I'm amazed it mostly works, but I'm really hoping > someone can point out my mistakes. > > Goal: > > Enable users to access fields in record variables using the following > syntax like the following: > rec.(1) > rec.('foo') > rec.(myvar::int) > rec.(myvar || '_id') > > Files changed: > > plpgsql.h > - added 'expr' member to PLpgSQL_recfield type for holding the > PLpgSQL_expr structure. > > scan.l > - added match for {identifier}{space}*\. AFAIK this should only match > if a longer expression doesn't? > > pl_comp.c > - added plpgsql_parse_wordexpr() function called by above match. Ripped > off code from plpgsql_parse_word that deals with arg_v[expr] to find our > expression. Prob a dumb name for the function! > > pl_exec.c > - hacked exec_assign_value() and exec_eval_datum() to use the expression > to get the field name/number. > > Stuff I need help with: > > 1. It should recognise OLD.(1) as a field number, not a column name. I > think I've got to check the returned type from exec_eval_expr() then > exec_simple_cast_value() on it, but that seems beyond me. > > 2. Freeing stuff. As I explained, this is all pretty new to me, and the > comments about it around exec_eval_expr() et al just confused me :( > Please give some hints about what needs freeing! > > 3. Would probably be good to add check in pl_comp.c to see if the > expression actually needs to be evaluated at runtime (ie isn't just a > field name/number). How? > > 4. Make this also work for row.(expr), label.record.(expr) and > label.row.(expr) - but want to get the basics working first! > > 5. Because of the way the expression is parsed (looking for closing > parenth), this will choke if you try and put a function in there. Would > it be better to use curly braces '{expr}' or another character to mark > the expression? > > I hope at this eventually leads to some really useful extra > functionality, particularly for writing generic triggers. And it's a > tribute to the existing code that a complete newbie can cut-and-paste > their way to a halfarsed solution in a (rather long) night! > > Regards, > > Matt > > ______________________________________________________________________ > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match
Attachment
On Thu, 2004-11-18 at 13:18 +0000, Matt wrote: > I got extremely frustrated with having to create a temp table every time > I wanted to access an arbitrary column from a record plpgsql. FYI, one thing I want to implement is an EVALUATE statement in plpgsql (analogous to eval() in Perl, for example). If I understand your use case, I think this will help somewhat, although of course it is still clumsier than direct syntactic support. > Warning: I Am Not a C Programmer! I haven't even written a hello world > in C before, and I knew nothing about Flex before yesterday. It was fun > figuring stuff out, I'm amazed it mostly works, but I'm really hoping > someone can point out my mistakes. An impressive first hack! :) > Enable users to access fields in record variables using the following > syntax like the following [...] > rec.(1) Looks good. > rec.('foo') I don't like this: it implicitly coerces a string literal into an identifier (i.e. a column name). Treating data as code can be useful, but I think we need to make it more obvious to the user. I think a proper EVALUATE statement might be a better solution. > 5. Because of the way the expression is parsed (looking for closing > parenth), this will choke if you try and put a function in there. Would > it be better to use curly braces '{expr}' or another character to mark > the expression? How much thought went into choosing parentheses? (i.e. is a similar syntax used in the procedural languages in other DBs?) In any case, I don't think switching switching to braces is the right solution to the lexing problem. I _believe_ the patch is OK as is -- plpgsql_read_expression() keeps track of parenthesis nesting so it should be able to figure out when it sees an "unmatched" parenthesis (indicating the end of the expression). BTW, attached is a trivial incremental patch that fixes a few compile warnings (which are indicative of actual problems) in your patch. -Neil
Attachment
Hi Neil, Thanks for the comments. I've actually got (yet) another version ready to go, which fixes the compile warnings and adds some sanity checks. I'll post it as soon as I've got beta5 downloaded and tried out :) > FYI, one thing I want to implement is an EVALUATE statement in plpgsql > (analogous to eval() in Perl, for example). If I understand your use > case, I think this will help somewhat, although of course it is still > clumsier than direct syntactic support. This would execute a string and pass back the result? I'm sure I'll find a use for it at some point :) > > rec.('foo') > > I don't like this: it implicitly coerces a string literal into an > identifier (i.e. a column name). Treating data as code can be useful, > but I think we need to make it more obvious to the user. I think a > proper EVALUATE statement might be a better solution. See your point. But what about NEW.($1)? > > 5. Because of the way the expression is parsed (looking for closing > > parenth), this will choke if you try and put a function in there. Would > > it be better to use curly braces '{expr}' or another character to mark > > the expression? > > How much thought went into choosing parentheses? (i.e. is a similar > syntax used in the procedural languages in other DBs?) Only used them because of the small note I saw on the developer's TODO about accessing cols by ordinal - the example there was rec.(1). But I was wrong about functions not working there - plpgsql_read_expression() is smarter than that, as you say. OK, download is done. I've got some more general ideas which relate to this. I'll post along with updated version. Matt
Hi, Updated patch (now against beta5) attached. It now pfree's any converted strings, avoids pointlessly casting an int4oid to int4oid, complies to CS (uses tabs, not spaces) and works with label.record.(expression) too. I'm still testing, it now does what I set out to achieve. I haven't done anything to implement the same thing for row types. I'm wondering if there's any point. The person writing the function is implicitly going to know the structure of a row; the only use I can foresee is iterating over the columns by ordinal, but that isn't possible (at present, see below). Neil's raised a coupe of good points: - is the parenthesis format the right way to go? How do other sp languages do it? - if we have a more genaralised EVALUATE statement, do we still need this? I'm not at all fixated on the parenthesis idea. If there's a better way, great. But if this stands a chance of making it in to the language one day, it'd be good to know so I can start building stuff that uses the format. One thought I had about the format would be to use a kind of javascript-ish syntax: rec.item(expr) - as above This might later lend itself to other record-specific functions: rec.length() - returns number of columns rec.getColType(expr) - returns oid of given column rec.getColNumber(expr) - returns ordinal of given column I can't see a crying need for any of these, mind... but length() would be useful _if_ you could iterate over columns, so: Loops: The issue with loops is a nuisance - if the underlying column type changes, my stuff throws a data type mismatch exception. I imagine this could affect an EVALUATE expression inside a loop, too (or would their plans never be saved, like EXECUTE?). I might be totally wrong, but here's what I see: To stop the mismatch you'd have to stop exec_eval_expr() saving the plan for that expression. A flag in the expression struct could do that, though you'd want to add a check that the function was defined as volatile before allowing it. The flag could be set during the function's compilation stage, but then all functions using the rec.(expr) format anywhere would have to be volatile. A better approach would be to walk through the function's statement tree just before execution, checking for troublesome expressions appearing inside loops (and whose arguments are reassigned inside the loop?) and marking them as dynamic. Does that make any sense? Is it worth the work? Or should we just tell anyone who actually needs it (I don't, at present) 'use another PL'? Regards, Matt diff -u src.orig/pl_comp.c src/pl_comp.c --- src.orig/pl_comp.c 2004-11-22 10:20:24.000000000 +0000 +++ src/pl_comp.c 2004-11-22 10:58:42.465929360 +0000 @@ -783,6 +783,72 @@ return T_WORD; } +/* ---------- + * plpgsql_parse_wordexpr Same lookup for word followed by dot. + * Should only get here if it wasn't followed by + * an identifier. + * ---------- + */ +int +plpgsql_parse_wordexpr(char *word) +{ + PLpgSQL_nsitem *ns; + char *cp[1]; + int save_spacescanned = plpgsql_SpaceScanned; + + /* Do case conversion and word separation */ + /* drop dot to keep converter happy */ + word[strlen(word) - 1] = '\0'; + plpgsql_convert_ident(word, cp, 1); + + /* + * Do a lookup on the compilers namestack + */ + ns = plpgsql_ns_lookup(cp[0], NULL); + if (ns == NULL) + { + pfree(cp[0]); + return T_ERROR; + } + switch (ns->itemtype) + { + case PLPGSQL_NSTYPE_REC: + { + /* + * First word is a record name, so the parenthesised expr that + * follows it defines the field. + */ + PLpgSQL_recfield *new; + + new = malloc(sizeof(PLpgSQL_recfield)); + memset(new, 0, sizeof(PLpgSQL_recfield)); + + if (plpgsql_yylex() != '(') + plpgsql_yyerror("expected identifier or \"(\""); + new->expr = plpgsql_read_expression(')', ")"); + new->recparentno = ns->itemno; + new->isExpr = true; + new->dtype = PLPGSQL_DTYPE_RECFIELD; + + plpgsql_adddatum((PLpgSQL_datum *) new); + + plpgsql_yylval.scalar = (PLpgSQL_datum *) new; + + plpgsql_SpaceScanned = save_spacescanned; + + pfree(cp[0]); + return T_SCALAR; + } + + /* TODO: deal with rows, too? */ + + default: + break; + + } + pfree(cp[0]); + return T_ERROR; +} /* ---------- * plpgsql_parse_dblword Same lookup for two words @@ -855,6 +921,7 @@ new->dtype = PLPGSQL_DTYPE_RECFIELD; new->fieldname = strdup(cp[1]); new->recparentno = ns->itemno; + new->isExpr = false; plpgsql_adddatum((PLpgSQL_datum *) new); @@ -901,6 +968,89 @@ return T_ERROR; } +/* ---------- + * plpgsql_parse_dblwordexpr Same lookup for two words + * separated by a dot followed by a lone dot + * ---------- + */ +int +plpgsql_parse_dblwordexpr(char *word) +{ + PLpgSQL_nsitem *ns; + char *cp[2]; + int save_spacescanned = plpgsql_SpaceScanned; + + /* Do case conversion and word separation */ + /* drop dot to keep converter happy */ + word[strlen(word) - 1] = '\0'; + plpgsql_convert_ident(word, cp, 2); + + /* + * Lookup the first word + */ + ns = plpgsql_ns_lookup(cp[0], NULL); + if (ns == NULL) + { + pfree(cp[0]); + pfree(cp[1]); + return T_ERROR; + } + + switch (ns->itemtype) + { + case PLPGSQL_NSTYPE_LABEL: + + /* + * First word is a label, second should be a record + */ + ns = plpgsql_ns_lookup(cp[1], cp[0]); + pfree(cp[0]); + pfree(cp[1]); + if (ns == NULL) + return T_ERROR; + switch (ns->itemtype) + { + case PLPGSQL_NSTYPE_REC: + { + /* + * First word is a record name, so parenthesised expr + * following refers to field. + */ + PLpgSQL_recfield *new; + + new = malloc(sizeof(PLpgSQL_recfield)); + memset(new, 0, sizeof(PLpgSQL_recfield)); + + if (plpgsql_yylex() != '(') + plpgsql_yyerror("expected identifier or \"(\""); + new->expr = plpgsql_read_expression(')', ")"); + new->recparentno = ns->itemno; + new->isExpr = true; + new->dtype = PLPGSQL_DTYPE_RECFIELD; + + plpgsql_adddatum((PLpgSQL_datum *) new); + + plpgsql_yylval.scalar = (PLpgSQL_datum *) new; + + plpgsql_SpaceScanned = save_spacescanned; + + return T_SCALAR; + } + default: + return T_ERROR; + } + break; + + + default: + break; + } + + pfree(cp[0]); + pfree(cp[1]); + return T_ERROR; +} + /* ---------- * plpgsql_parse_tripword Same lookup for three words @@ -961,6 +1111,7 @@ new->dtype = PLPGSQL_DTYPE_RECFIELD; new->fieldname = strdup(cp[2]); new->recparentno = ns->itemno; + new->isExpr = false; plpgsql_adddatum((PLpgSQL_datum *) new); diff -u src.orig/pl_exec.c src/pl_exec.c --- src.orig/pl_exec.c 2004-11-22 10:20:24.000000000 +0000 +++ src/pl_exec.c 2004-11-22 10:58:42.684896097 +0000 @@ -2956,6 +2956,8 @@ PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) target; PLpgSQL_rec *rec; int fno; + char *fname; + bool mustfreefname = false; HeapTuple newtup; int natts; int i; @@ -2980,20 +2982,78 @@ rec->refname), errdetail("The tuple structure of a not-yet-assigned record is indeterminate."))); + + /* Have we got an expr to deal with? */ + fno = 0; + fname = recfield->fieldname; + if (recfield->isExpr) + { + Datum exprdatum; + Oid exprtype; + bool exprisnull; + + /* Evaluate expression */ + exprdatum = exec_eval_expr(estate, recfield->expr, + &exprisnull, &exprtype); + + /* If we've got an integer, it's a field number, otherwise + * it's a fieldname + */ + if (exprtype == INT4OID) { + fno = DatumGetInt32(exprdatum); + if (fno < 1) { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("field expression evaluates to a number less than 1"))); + } + } + else { + fname = convert_value_to_string( + exprdatum, exprtype); + mustfreefname = true; + if (*fname == '\0') { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("field expression evaluates to an empty string"))); + } + } + } + /* * Get the number of the records field to change and the - * number of attributes in the tuple. + * number of attributes in the tuple, if we didn't get it + * above. */ - fno = SPI_fnumber(rec->tupdesc, recfield->fieldname); - if (fno == SPI_ERROR_NOATTRIBUTE) - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_COLUMN), - errmsg("record \"%s\" has no field \"%s\"", - rec->refname, recfield->fieldname))); + if (fno < 1) + { + fno = SPI_fnumber(rec->tupdesc, fname); + if (fno == SPI_ERROR_NOATTRIBUTE) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("record \"%s\" has no field \"%s\"", + rec->refname, fname))); + } + + /* Free fname, if required */ + if (mustfreefname) { + pfree(fname); + } + fno--; natts = rec->tupdesc->natts; /* + * Check that fno is within range + */ + if (fno >= natts) { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("record \"%s\" has no field number \"%d\"", + rec->refname, fno))); + + } + + /* * Set up values/datums arrays for heap_formtuple. For * all the attributes except the one we want to replace, * use the value that's in the old tuple. @@ -3288,6 +3348,8 @@ PLpgSQL_recfield *recfield = (PLpgSQL_recfield *) datum; PLpgSQL_rec *rec; int fno; + char *fname; + bool mustfreefname = false; rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]); if (!HeapTupleIsValid(rec->tup)) @@ -3296,19 +3358,76 @@ errmsg("record \"%s\" is not assigned yet", rec->refname), errdetail("The tuple structure of a not-yet-assigned record is indeterminate."))); - fno = SPI_fnumber(rec->tupdesc, recfield->fieldname); - if (fno == SPI_ERROR_NOATTRIBUTE) - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_COLUMN), - errmsg("record \"%s\" has no field \"%s\"", - rec->refname, recfield->fieldname))); + + /* Have we got an expr to deal with? */ + fno = 0; + fname = recfield->fieldname; + if (recfield->isExpr) + { + Datum exprdatum; + Oid exprtype; + bool exprisnull; + + /* Evaluate expression */ + exprdatum = exec_eval_expr(estate, recfield->expr, + &exprisnull, &exprtype); + + + /* If we've got an integer, it's a field number, otherwise + * it's a fieldname + */ + if (exprtype == INT4OID) { + fno = DatumGetInt32(exprdatum); + if (fno < 1) { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("field expression evaluates to a number less than 1"))); + } + } + else { + fname = convert_value_to_string( + exprdatum, exprtype); + mustfreefname = true; + if (*fname == '\0') { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("field expression evaluates to an empty string"))); + } + } + } + + /* + * Get the number of the records field to change and the + * number of attributes in the tuple. + */ + if (fno < 1) + { + fno = SPI_fnumber(rec->tupdesc, fname); + if (fno == SPI_ERROR_NOATTRIBUTE) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("record \"%s\" has no field \"%s\"", + rec->refname, fname))); + } + + /* Free up fname if required */ + if (mustfreefname) { + pfree(fname); + } + *typeid = SPI_gettypeid(rec->tupdesc, fno); *value = SPI_getbinval(rec->tup, rec->tupdesc, fno, isnull); + if (*value == SPI_ERROR_NOATTRIBUTE) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_COLUMN), + errmsg("record \"%s\" has no field number \"%d\"", + rec->refname, fno))); + if (expectedtypeid != InvalidOid && expectedtypeid != *typeid) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("type of \"%s.%s\" does not match that when preparing the plan", - rec->refname, recfield->fieldname))); + rec->refname, fname))); break; } Only in src: pl_exec.c.orig diff -u src.orig/plpgsql.h src/plpgsql.h --- src.orig/plpgsql.h 2004-11-22 10:20:24.000000000 +0000 +++ src/plpgsql.h 2004-11-22 10:58:42.737888047 +0000 @@ -269,6 +269,8 @@ int dtype; int rfno; char *fieldname; + bool isExpr; /* Use expression for field name/number */ + PLpgSQL_expr *expr; int recparentno; /* dno of parent record */ } PLpgSQL_recfield; @@ -675,7 +677,9 @@ extern PLpgSQL_function *plpgsql_compile(FunctionCallInfo fcinfo, bool forValidator); extern int plpgsql_parse_word(char *word); +extern int plpgsql_parse_wordexpr(char *word); extern int plpgsql_parse_dblword(char *word); +extern int plpgsql_parse_dblwordexpr(char *word); extern int plpgsql_parse_tripword(char *word); extern int plpgsql_parse_wordtype(char *word); extern int plpgsql_parse_dblwordtype(char *word); diff -u src.orig/scan.l src/scan.l --- src.orig/scan.l 2004-11-22 10:20:24.000000000 +0000 +++ src/scan.l 2004-11-22 10:58:42.781881363 +0000 @@ -195,9 +195,15 @@ {identifier} { plpgsql_error_lineno = plpgsql_scanner_lineno(); return plpgsql_parse_word(yytext); } +{identifier}{space}*\. { + plpgsql_error_lineno = plpgsql_scanner_lineno(); + return plpgsql_parse_wordexpr(yytext); } {identifier}{space}*\.{space}*{identifier} { plpgsql_error_lineno = plpgsql_scanner_lineno(); return plpgsql_parse_dblword(yytext); } +{identifier}{space}*\.{space}*{identifier}*\. { + plpgsql_error_lineno = plpgsql_scanner_lineno(); + return plpgsql_parse_dblwordexpr(yytext); } {identifier}{space}*\.{space}*{identifier}{space}*\.{space}*{identifier} { plpgsql_error_lineno = plpgsql_scanner_lineno(); return plpgsql_parse_tripword(yytext); }
Neil Conway <neilc@samurai.com> writes: > FYI, one thing I want to implement is an EVALUATE statement in plpgsql > (analogous to eval() in Perl, for example). I'm confused. How/why is this different from EXECUTE? regards, tom lane
Matt <matt@kynx.org> writes: > Does that make any sense? Is it worth the work? Or should we just tell > anyone who actually needs it (I don't, at present) 'use another PL'? I don't really see this going anywhere --- it's contorting the semantics of plpgsql too much for too little gain. The typical use-case I've heard of for this sort of thing is "I want to write a generic trigger, so I need to iterate over the columns of NEW.* without knowing their names or data types in advance". Your proposal doesn't address the issue of how the function would find out the column names in order to make use of the proposed notation; and as you noted there's still a serious problem with varying datatypes. Either plperl or pltcl (probably also plpython but I'm not familiar with that language) is a better choice for writing such triggers, because those languages already have answers for all these issues. regards, tom lane
Hi Tom, > > Does that make any sense? Is it worth the work? Or should we just tell > > anyone who actually needs it (I don't, at present) 'use another PL'? > > I don't really see this going anywhere --- it's contorting the semantics > of plpgsql too much for too little gain. Yup, the last bit was definitely contortionist! Prob should have split it into two separate posts. What I'm most looking for comments on what's actually in the patch: with it I can use NEW.($1). _That_ is a big gain. On the rest... > Your proposal doesn't address the > issue of how the function would find out the column names in order to > make use of the proposed notation; It doesn't need them. The posted patch already allows: for i in 1..3 loop myvar := rec.(i); end loop; ...so long as all cols have the same datatype (note that col numbers are 1-based, like in the spi funcs). > and as you noted there's still a > serious problem with varying datatypes. My contortions were an attempt to think my way around the varying datatypes problem. Obviously my thinking got tied in knots ;) I agree, that part is too much work for too little gain. > Either plperl or pltcl (probably also plpython but I'm not familiar > with that language) is a better choice for writing such triggers, > because those languages already have answers for all these issues. I didn't think plperl could be used to write triggers. Is this new in 8? As an application developer, though, it'd be nice to keep everything in one pl language, and ideally one guaranteed available (or easily installable without other dependencies) on every postgres db out there. That's pretty much why I wrote the patch: to add the one missing feature I need to write my generic triggers in plpgsql. Regards, Matt
On Mon, 2004-11-22 at 10:57 -0500, Tom Lane wrote: > I'm confused. How/why is this different from EXECUTE? EVALUATE would take a string and evaluate it as a PL/PgSQL statement; EXECUTE takes a string and executes it as a SQL statement. We've discussed this before (although I may not have called it "EVALUATE" at the time): http://archives.postgresql.org/pgsql-bugs/2004-10/msg00005.php -Neil
On Mon, 2004-11-22 at 10:06 +0000, Matt wrote: > This would execute a string and pass back the result? It would evaluate a string as a PL/PgSQL statement (which means you could construct any PL/PgSQL statement dynamically, including access to fields of a RECORD determined at runtime). > > I don't like this: it implicitly coerces a string literal into an > > identifier (i.e. a column name). Treating data as code can be useful, > > but I think we need to make it more obvious to the user. I think a > > proper EVALUATE statement might be a better solution. > > See your point. But what about NEW.($1)? I don't follow -- what do you mean? (BTW, I think my comment also applies to variables of type "text" and similar -- I think the patch would be a lot simpler if you just implement access to record fields by ordinal position, and don't implement access by field name.) -Neil
> > See your point. But what about NEW.($1)? > > I don't follow -- what do you mean? I want to be able to be able to write a trigger function that accesses a column passed as an argument to the function in the row that caused the trigger. This is my use case. I guess that would actually written NEW.(TG_ARGV[1]). > (BTW, I think my comment also applies to variables of type "text" and > similar -- I think the patch would be a lot simpler if you just > implement access to record fields by ordinal position, and don't > implement access by field name.) Yes, it would be marginally simpler: I'd still have to call exec_eval_datum() on the variable and check whether it could be evaluated to an integer (trigger args are all text AFAIK). The only difference would be throwing an error if it wasn't, instead of making use of the value... and a slightly less readable 'create trigger' statement. It would be a good idea to check that the variable was either a constant or a trigger arg. This would stop the looping problem, since the type of the underlying field couldn't change. But I've somehow got the feeling that this sort of thing isn't the issue. The issue is whether we want to allow dynamic access to columns in any syntax at all. A simple yes or no would do :) Matt BTW: here's the original post adding the rec.(3) syntax to the TODO: http://archives.postgresql.org/pgsql-hackers/2003-07/msg00425.php here's someone else who tried something very similar: http://archives.postgresql.org/pgsql-hackers/2003-09/msg00533.php
Just to put in my .02$, I would absolutely love to see this functionality included in plpgsql. With some extra error checking for the know changing datatype failure, and docs that mention that limitation, I'd say this is a great extension to the language. plpgsql feels quicker than the interpreted PLs and it's far easier than C to work with for writing triggers, so this patch makes plpgsql a much more attractive target for general purpose stored procs. And my gut feeling is that an EVALUATE statement would be significantly slower than this. In any case, thanks for the great work, Matt. Please, CORE, include this one! As an alternative, what would be the possibility of creating a new PL as a contrib module, say PLPGSQL_NG, to move forward with extensions like this and perhaps EVALUATE? -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer On 23 Nov 2004 09:03:10 +0000, Matt <matt@kynx.org> wrote: > > > See your point. But what about NEW.($1)? > > > > I don't follow -- what do you mean? > > I want to be able to be able to write a trigger function that accesses a > column passed as an argument to the function in the row that caused the > trigger. This is my use case. > > I guess that would actually written NEW.(TG_ARGV[1]). > > > (BTW, I think my comment also applies to variables of type "text" and > > similar -- I think the patch would be a lot simpler if you just > > implement access to record fields by ordinal position, and don't > > implement access by field name.) > > Yes, it would be marginally simpler: I'd still have to call > exec_eval_datum() on the variable and check whether it could be > evaluated to an integer (trigger args are all text AFAIK). The only > difference would be throwing an error if it wasn't, instead of making > use of the value... and a slightly less readable 'create trigger' > statement. > > It would be a good idea to check that the variable was either a constant > or a trigger arg. This would stop the looping problem, since the type of > the underlying field couldn't change. > > But I've somehow got the feeling that this sort of thing isn't the > issue. The issue is whether we want to allow dynamic access to columns > in any syntax at all. A simple yes or no would do :) > > Matt > > BTW: here's the original post adding the rec.(3) syntax to the TODO: > http://archives.postgresql.org/pgsql-hackers/2003-07/msg00425.php > here's someone else who tried something very similar: > http://archives.postgresql.org/pgsql-hackers/2003-09/msg00533.php > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
Matt <matt@kynx.org> writes: > It would be a good idea to check that the variable was either a constant > or a trigger arg. This would stop the looping problem, since the type of > the underlying field couldn't change. What aboutfor i in ... ... new.(tg_argv[i]) ... > But I've somehow got the feeling that this sort of thing isn't the > issue. The issue is whether we want to allow dynamic access to columns > in any syntax at all. A simple yes or no would do :) MHO: this is a really ugly wart on the language, and it does not solve the problems people would want to solve. It might solve *your* problem but that's not enough to justify a wart of this size. We do need to do something about the fact that EXECUTE can't access plpgsql variables, though I'm afraid that fixing that is going to require a rather complete overhaul of plpgsql :-(. But it needs one anyway. regards, tom lane
> What about > for i in ... > ... new.(tg_argv[i]) ... Ooof! <clutches chest, falls to ground groaning> Constants or digits or nothing, then <gets up, dusts himself off> > MHO: this is a really ugly wart on the language, and it does not solve > the problems people would want to solve. It might solve *your* problem > but that's not enough to justify a wart of this size. <falls down again> But my warts are beautiful! OK, fair enough. I had to try. > We do need to do something about the fact that EXECUTE can't access > plpgsql variables, though I'm afraid that fixing that is going to > require a rather complete overhaul of plpgsql :-(. But it needs one > anyway. Look forward to seeing it. Matt
Mike Rylander wrote: > As an alternative, what would be the possibility of creating a new PL > as a contrib module, say PLPGSQL_NG, to move forward with extensions > like this and perhaps EVALUATE? I think the idea of rewriting PL/PgSQL from scratch has merit (and it's something that I think would be well worth doing). IMHO it's not really worth the trouble to fork the existing code base and add new features to something that, hopefully, has a limited life span. -Neil
On Tue, Nov 23, 2004 at 10:33:50AM -0500, Tom Lane wrote: > We do need to do something about the fact that EXECUTE can't access > plpgsql variables, though I'm afraid that fixing that is going to > require a rather complete overhaul of plpgsql :-(. But it needs one > anyway. Why do you think that it would be difficult to do it with the existing code? Actually I wanted to implement this for 8.1. I've already had a quick look at it. My idea was to allow something like "EXECUTE SELECT INTO var1, var2 col1 col2 FROM ...". The code would have to parse the list of variables and check if they match pl/pgsql variables, execute the query (without the INTO stuff of course) via SPI and copy back the results, checking type mismatches or a mismatch in the number of columns. I haven't thought about more complex types however. Did I miss something? Which other limitations are there in the current implementation of pl/pgsql? Joachim
> I think the idea of rewriting PL/PgSQL from scratch has merit (and it's > something that I think would be well worth doing). IMHO it's not really > worth the trouble to fork the existing code base and add new features to > something that, hopefully, has a limited life span. I dunno, I kind of like the idea. There's always going to be the age old conflict between people who are basically users like me, and want to see a needed feature asap, and developers who want to see it done right. And of course doing it right is only way to go long term. But so long as I can be fairly sure the syntax is agreed (EXECUTE SELECT INTO ... in this case) and eventually will make it into the main code base, I'd be willing to live on the edge for a while. There'd have to be big 'experimental, everything may change' warnings all over the contrib version. My only concern is if it would actually delay the discussed rewrite of plpgsql by splitting the effort. That's my two one hundreths of a euro, anyway. Matt
Joachim Wieland <joe@mcknight.de> writes: > On Tue, Nov 23, 2004 at 10:33:50AM -0500, Tom Lane wrote: >> We do need to do something about the fact that EXECUTE can't access >> plpgsql variables, though I'm afraid that fixing that is going to >> require a rather complete overhaul of plpgsql :-(. But it needs one >> anyway. > Why do you think that it would be difficult to do it with the existing code? Because the parsing context all gets thrown away, in particular the namespace stack; so you can't tell what set of variables ought to be visible to the EXECUTE. > Which other limitations are there in the current implementation of pl/pgsql? Its memory management really sucks. Function parsetrees ought to be put into private contexts, not malloc'd, so that they can reasonably be freed when the function is deleted or redefined. Likewise for the cached query plans. Also we need a way to discard cached query plans when relevant object definitions change (though this is not plpgsql's fault in particular, it's a generic problem). Another issue is that we need a better solution for recursive plpgsql functions. Function parsetrees have to be read-only (with the exception of adding cached plans that weren't there before) during execution, else recursive functions don't work right. We have a really ugly and inefficient answer to this, and I'm not even sure that it covers 100% of the cases (at one time it definitely didn't, but I can't recall right now if that got fixed completely). I'm not entirely sure what a cleaner answer would look like, but I know I don't like it now. regards, tom lane