Thread: [Bug Fix] ECPG: could not use set xxx to default statement
Hi, I found ECPG's bug which SET xxx = DEFAULT statement could not be used. [PROBLEM] When the source code (*.pgc) has "EXEC SQL set xxx = default;", created c program by ECPG has no " = default". For example, "EXEC SQL set search_path = default;" in *.pgc will be converted to "{ ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal,"set search_path", ECPGt_EOIT, ECPGt_EORT);}" in c program. [Investigation] gram.y lacks ";" in the end of section "generic_set", so preproc.y's syntax is broken. src/backend/parser/gram.y ------------------------------------------- generic_set: | var_name '=' DEFAULT { VariableSetStmt *n = makeNode(VariableSetStmt); n->kind = VAR_SET_DEFAULT; n->name = $1; $$ = n; } set_rest_more: /* Generic SET syntaxes: */ ------------------------------------------- src/interfaces/ecpg/preproc/preproc.y ------------------------------------------- generic_set: | var_name TO DEFAULT { $$ = cat_str(2,$1,mm_strdup("to default")); } | var_name '=' DEFAULT set_rest_more: generic_set { $$ = $1; } ------------------------------------------- I attached the patch which has ";" in the end of section "generic_set" and regression. Regards, Daisuke, Higuchi
Attachment
"Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes: > [ missing semicolon in gram.y breaks ecpg parsing of same construct ] That's pretty nasty. The fix in gram.y is certainly needed, but I'm unexcited by the regression test additions you propose. What I really want to know is why a syntax error in gram.y wasn't detected by any of the tools we use, and whether we can do something about that. Otherwise the next bug of the same kind may go just as undetected; in fact, I've got little confidence there aren't other such omissions already :-( regards, tom lane
I wrote: > "Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes: >> [ missing semicolon in gram.y breaks ecpg parsing of same construct ] > That's pretty nasty. The fix in gram.y is certainly needed, but I'm > unexcited by the regression test additions you propose. What I really > want to know is why a syntax error in gram.y wasn't detected by any > of the tools we use, Ugh ... the Bison NEWS file has this: * Changes in version 1.875, 2003-01-01: ... - Semicolons are once again optional at the end of grammar rules. This reverts to the behavior of Bison 1.33 and earlier, and improves compatibility with Yacc. I'd remembered how we had to run around and insert semicolons to satisfy Bison 1.3-something, and supposed that that restriction still held. But it doesn't. It seems though that our conversion script for creating preproc.y depends on there being semicolons. I think we need to fix that script to either cope with missing semicolons, or at least complain about them. Too tired to look into how, right now. regards, tom lane
On Tue, 2019-02-19 at 00:05 -0500, Tom Lane wrote: > I wrote: > > "Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes: > > > [ missing semicolon in gram.y breaks ecpg parsing of same > > > construct ] > > That's pretty nasty. The fix in gram.y is certainly needed, but > > I'm > > unexcited by the regression test additions you propose. What I > > really > > want to know is why a syntax error in gram.y wasn't detected by any > > of the tools we use, I'm actually surprised it only shows by one incorrectly working rule and did not mangle the whole file by combining to rules. I guess that's because bison now finds the end of the rule somehow even without the semicolon. > But it doesn't. It seems though that our conversion script for > creating > preproc.y depends on there being semicolons. Yes, it does. There has to be a way for the script to find the end of a rule and I wonder if bison's way can be used in such a simple perl script too. > I think we need to fix that script to either cope with missing > semicolons, > or at least complain about them. Too tired to look into how, right > now. If we can identify a missing semicolon we probably can also figure out where it had to be. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Hi, > I think we need to fix that script to either cope with missing semicolons, > or at least complain about them. Too tired to look into how, right now. I attached the patch which cope with missing semicolons. Previous parse.pl find semicolon and dump data to buffer. When attached patch's parse.pl find new tokens before finding asemicolon, it also dumps data to buffer. Regards, Daisuke, Higuchi
Attachment
Higuchi-san, > I attached the patch which cope with missing semicolons. > Previous parse.pl find semicolon and dump data to buffer. When > attached patch's parse.pl find new tokens before finding a semicolon, > it also dumps data to buffer. Now this seems to be much easier than I expected. Thanks. My first test show two "missing" semicolons in gram.y. :) Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Higuchi-san, > I attached the patch which cope with missing semicolons. > Previous parse.pl find semicolon and dump data to buffer. When > attached patch's parse.pl find new tokens before finding a semicolon, > it also dumps data to buffer. It just occurred to me that check_rules.pl probably uses the same logic to identify each rule and thus needs to be changed, too. Also, IIRC bison allows blanks between the symbol name and the colon, or in other words "generic_set:" is equal to "generic_set :". If this happens after a "missing" semicolon I think your patch does not notice the end of the rule. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
On 2/19/19 6:21 AM, Michael Meskes wrote: > Higuchi-san, > >> I attached the patch which cope with missing semicolons. >> Previous parse.pl find semicolon and dump data to buffer. When >> attached patch's parse.pl find new tokens before finding a semicolon, >> it also dumps data to buffer. > It just occurred to me that check_rules.pl probably uses the same logic > to identify each rule and thus needs to be changed, too. > > Also, IIRC bison allows blanks between the symbol name and the colon, > or in other words "generic_set:" is equal to "generic_set :". If this > happens after a "missing" semicolon I think your patch does not notice > the end of the rule. > Yeah, it also seems too possibly liberal about where it matches a rule name. AUIU this should be the first token on a line; is that right? OTOH, it won't handle any case where an action block is not the last thing in the rule, since it only sets $fill_semicolon on seeing a closing brace (on its own). I just looked at the bison manual at gnu.org and also at `info bison` on my local machine, and couldn't see any reference to semicolons being optional at the end of a rule. Under the heading "Syntax of Grammar Rules" it says this: A Bison grammar rule has the following general form: RESULT: COMPONENTS...; Making it optional without putting that in the manual is just awful. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > I just looked at the bison manual at gnu.org and also at `info bison` on > my local machine, and couldn't see any reference to semicolons being > optional at the end of a rule. Under the heading "Syntax of Grammar > Rules" it says this: > A Bison grammar rule has the following general form: > RESULT: COMPONENTS...; > Making it optional without putting that in the manual is just awful. Yeah. I wonder if they removed that info in 1.34 and failed to put it back in 1.875? Anyway, I'm of the opinion that omitting the semi is poor style and our tools should insist on it even if Bison does not. Thus, I think the correct fix is for the scripts to complain about a missing semi, not cope. My initial look at parse.pl last night left me feeling pretty disheartened about its robustness in general --- for example, it looks like { } /* or */ inside a string literal or Bison character token would break it completely, because it wouldn't distinguish those cases from the same things outside a string. It's just luck we haven't broken it yet (or, perhaps, we have and nobody exercised the relevant productions yet?). Probably, somebody who's a better Perl programmer than me ought to take point on improving that. regards, tom lane
On 2/19/19 9:29 AM, Tom Lane wrote: > Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: >> I just looked at the bison manual at gnu.org and also at `info bison` on >> my local machine, and couldn't see any reference to semicolons being >> optional at the end of a rule. Under the heading "Syntax of Grammar >> Rules" it says this: >> A Bison grammar rule has the following general form: >> RESULT: COMPONENTS...; >> Making it optional without putting that in the manual is just awful. > Yeah. I wonder if they removed that info in 1.34 and failed to > put it back in 1.875? > > Anyway, I'm of the opinion that omitting the semi is poor style > and our tools should insist on it even if Bison does not. Thus, > I think the correct fix is for the scripts to complain about a > missing semi, not cope. Yeah, agreed. > > My initial look at parse.pl last night left me feeling pretty > disheartened about its robustness in general --- for example, > it looks like { } /* or */ inside a string literal or Bison > character token would break it completely, because it wouldn't > distinguish those cases from the same things outside a string. > It's just luck we haven't broken it yet (or, perhaps, we have > and nobody exercised the relevant productions yet?). > > Probably, somebody who's a better Perl programmer than me > ought to take point on improving that. > > Agreed. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > On 2/19/19 9:29 AM, Tom Lane wrote: >> Probably, somebody who's a better Perl programmer than me >> ought to take point on improving that. > Agreed. Not seeing any motion on this, here's a draft patch to make these scripts complain about missing semicolons. Against the current gram.y (which contains 2 such errors, as Michael noted) you get output like '/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y > preproc.y unterminated rule at ./parse.pl line 370, <> line 1469. make: *** [preproc.y] Error 255 make: *** Deleting file `preproc.y' That's not *super* friendly, but it does give you the right line number to look at in gram.y. We could adjust the script (and the Makefile) further so that the message would cite the gram.y filename, but I'm not sure if it's worth the trouble. Thoughts? regards, tom lane diff --git a/src/interfaces/ecpg/preproc/check_rules.pl b/src/interfaces/ecpg/preproc/check_rules.pl index 1388d05..3daff88 100644 --- a/src/interfaces/ecpg/preproc/check_rules.pl +++ b/src/interfaces/ecpg/preproc/check_rules.pl @@ -1,7 +1,7 @@ #!/usr/bin/perl # src/interfaces/ecpg/preproc/check_rules.pl # test parser generator for ecpg -# call with backend parser as stdin +# call with backend grammar as stdin # # Copyright (c) 2009-2019, PostgreSQL Global Development Group # @@ -47,6 +47,7 @@ my %replace_line = ( my $block = ''; my $yaccmode = 0; +my $in_rule = 0; my $brace_indent = 0; my (@arr, %found); my $comment = 0; @@ -131,10 +132,13 @@ while (<$parser_fh>) $found{$block} = 1; $cc++; $block = ''; + $in_rule = 0 if $arr[$fieldIndexer] eq ';'; } elsif (($arr[$fieldIndexer] =~ '[A-Za-z0-9]+:') || $arr[ $fieldIndexer + 1 ] eq ':') { + die "unterminated rule" if $in_rule; + $in_rule = 1; $non_term_id = $arr[$fieldIndexer]; $non_term_id =~ tr/://d; } @@ -145,6 +149,8 @@ while (<$parser_fh>) } } +die "unterminated rule" if $in_rule; + close $parser_fh; if ($verbose) { diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl index 6dee500..e219398 100644 --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -22,6 +22,7 @@ $path = "." unless $path; my $copymode = 0; my $brace_indent = 0; my $yaccmode = 0; +my $in_rule = 0; my $header_included = 0; my $feature_not_supported = 0; my $tokenmode = 0; @@ -288,6 +289,7 @@ sub main @fields = (); $infield = 0; $line = ''; + $in_rule = 0; next; } @@ -365,6 +367,8 @@ sub main $line = ''; @fields = (); $infield = 1; + die "unterminated rule" if $in_rule; + $in_rule = 1; next; } elsif ($copymode) @@ -415,6 +419,7 @@ sub main } } } + die "unterminated rule" if $in_rule; return; }
> Not seeing any motion on this, here's a draft patch to make these > scripts complain about missing semicolons. Against the current > gram.y (which contains 2 such errors, as Michael noted) you > get output like Thanks Tom for looking into this. Are we agreed then that we want gram.y to have semicolons? > '/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y > > preproc.y > unterminated rule at ./parse.pl line 370, <> line 1469. > make: *** [preproc.y] Error 255 > make: *** Deleting file `preproc.y' > > That's not *super* friendly, but it does give you the right line > number > to look at in gram.y. We could adjust the script (and the Makefile) > further so that the message would cite the gram.y filename, but I'm > not > sure if it's worth the trouble. Thoughts? IMO it's not worth it. We all know where the grammar is and that the ecpg tools only parse that one file. Why putting effort into writing it down too? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Michael Meskes <meskes@postgresql.org> writes: >> Not seeing any motion on this, here's a draft patch to make these >> scripts complain about missing semicolons. Against the current >> gram.y (which contains 2 such errors, as Michael noted) you >> get output like > Thanks Tom for looking into this. Are we agreed then that we want > gram.y to have semicolons? Hearing no objections, I pushed it that way. >> That's not *super* friendly, but it does give you the right line >> number to look at in gram.y. We could adjust the script (and the Makefile) >> further so that the message would cite the gram.y filename, but I'm >> not sure if it's worth the trouble. Thoughts? > IMO it's not worth it. We all know where the grammar is and that the > ecpg tools only parse that one file. Why putting effort into writing it > down too? I did manage to fix the "die" commands so that you get something like unterminated rule at grammar line 1461 without the extraneous detail about the script's internals. That seems clear enough from here. I'm still disturbed by the scripts' ability to get fooled by braces or comment markers within quoted strings. However, that's not something I have good ideas about how to fix, and there's not any evidence that it's a live problem at the moment. regards, tom lane