Thread: [Bug Fix] ECPG: could not use set xxx to default statement

[Bug Fix] ECPG: could not use set xxx to default statement

From
"Higuchi, Daisuke"
Date:
Hi, 

I found ECPG's bug which SET xxx = DEFAULT statement could not be used. 

[PROBLEM]
When the source code (*.pgc) has "EXEC SQL set xxx = default;", created c program by ECPG has no " = default". 
For example, "EXEC SQL set search_path = default;" in *.pgc will be converted to "{ ECPGdo(__LINE__, 0, 1, NULL, 0,
ECPGst_normal,"set search_path", ECPGt_EOIT, ECPGt_EORT);}" in c program.
 

[Investigation]
gram.y lacks ";" in the end of section "generic_set", so preproc.y's syntax is broken. 

src/backend/parser/gram.y
-------------------------------------------
generic_set:

                        | var_name '=' DEFAULT
                                {
                                        VariableSetStmt *n = makeNode(VariableSetStmt);
                                        n->kind = VAR_SET_DEFAULT;
                                        n->name = $1;
                                        $$ = n;
                                }

set_rest_more:  /* Generic SET syntaxes: */
-------------------------------------------

src/interfaces/ecpg/preproc/preproc.y
-------------------------------------------
 generic_set:

|  var_name TO DEFAULT
 {
 $$ = cat_str(2,$1,mm_strdup("to default"));
}
|  var_name '=' DEFAULT set_rest_more:
 generic_set
 {
 $$ = $1;
}
-------------------------------------------

I attached the patch which has ";" in the end of section "generic_set" and regression. 

Regards, 
Daisuke, Higuchi


Attachment

Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Tom Lane
Date:
"Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:
> [ missing semicolon in gram.y breaks ecpg parsing of same construct ]

That's pretty nasty.  The fix in gram.y is certainly needed, but I'm
unexcited by the regression test additions you propose.  What I really
want to know is why a syntax error in gram.y wasn't detected by any
of the tools we use, and whether we can do something about that.
Otherwise the next bug of the same kind may go just as undetected;
in fact, I've got little confidence there aren't other such omissions
already :-(

            regards, tom lane


Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Tom Lane
Date:
I wrote:
> "Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:
>> [ missing semicolon in gram.y breaks ecpg parsing of same construct ]

> That's pretty nasty.  The fix in gram.y is certainly needed, but I'm
> unexcited by the regression test additions you propose.  What I really
> want to know is why a syntax error in gram.y wasn't detected by any
> of the tools we use,

Ugh ... the Bison NEWS file has this:

* Changes in version 1.875, 2003-01-01:
  ...
  - Semicolons are once again optional at the end of grammar rules.
    This reverts to the behavior of Bison 1.33 and earlier, and improves
    compatibility with Yacc.

I'd remembered how we had to run around and insert semicolons to satisfy
Bison 1.3-something, and supposed that that restriction still held.
But it doesn't.  It seems though that our conversion script for creating
preproc.y depends on there being semicolons.

I think we need to fix that script to either cope with missing semicolons,
or at least complain about them.  Too tired to look into how, right now.

            regards, tom lane


Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Michael Meskes
Date:
On Tue, 2019-02-19 at 00:05 -0500, Tom Lane wrote:
> I wrote:
> > "Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:
> > > [ missing semicolon in gram.y breaks ecpg parsing of same
> > > construct ]
> > That's pretty nasty.  The fix in gram.y is certainly needed, but
> > I'm
> > unexcited by the regression test additions you propose.  What I
> > really
> > want to know is why a syntax error in gram.y wasn't detected by any
> > of the tools we use,

I'm actually surprised it only shows by one incorrectly working rule
and did not mangle the whole file by combining to rules. I guess that's
because bison now finds the end of the rule somehow even without the
semicolon.

> But it doesn't.  It seems though that our conversion script for
> creating
> preproc.y depends on there being semicolons.

Yes, it does. There has to be a way for the script to find the end of a
rule and I wonder if bison's way can be used in such a simple perl
script too.

> I think we need to fix that script to either cope with missing
> semicolons,
> or at least complain about them.  Too tired to look into how, right
> now.

If we can identify a missing semicolon we probably can also figure out
where it had to be.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



RE: [Bug Fix] ECPG: could not use set xxx to default statement

From
"Higuchi, Daisuke"
Date:
Hi, 

> I think we need to fix that script to either cope with missing semicolons,
> or at least complain about them.  Too tired to look into how, right now.

I attached the patch which cope with missing semicolons. 
Previous parse.pl find semicolon and dump data to buffer. When attached patch's parse.pl find new tokens before finding
asemicolon, it also dumps data to buffer.
 

Regards, 
Daisuke, Higuchi


Attachment

Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Michael Meskes
Date:
Higuchi-san,

> I attached the patch which cope with missing semicolons. 
> Previous parse.pl find semicolon and dump data to buffer. When
> attached patch's parse.pl find new tokens before finding a semicolon,
> it also dumps data to buffer.

Now this seems to be much easier than I expected. Thanks. My first test
show two "missing" semicolons in gram.y. :)

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Michael Meskes
Date:
Higuchi-san,

> I attached the patch which cope with missing semicolons. 
> Previous parse.pl find semicolon and dump data to buffer. When
> attached patch's parse.pl find new tokens before finding a semicolon,
> it also dumps data to buffer.

It just occurred to me that check_rules.pl probably uses the same logic
to identify each rule and thus needs to be changed, too. 

Also, IIRC bison allows blanks between the symbol name and the colon,
or in other words "generic_set:" is equal to "generic_set :". If this
happens after a "missing" semicolon I think your patch does not notice
the end of the rule.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Andrew Dunstan
Date:
On 2/19/19 6:21 AM, Michael Meskes wrote:
> Higuchi-san,
>
>> I attached the patch which cope with missing semicolons. 
>> Previous parse.pl find semicolon and dump data to buffer. When
>> attached patch's parse.pl find new tokens before finding a semicolon,
>> it also dumps data to buffer.
> It just occurred to me that check_rules.pl probably uses the same logic
> to identify each rule and thus needs to be changed, too. 
>
> Also, IIRC bison allows blanks between the symbol name and the colon,
> or in other words "generic_set:" is equal to "generic_set :". If this
> happens after a "missing" semicolon I think your patch does not notice
> the end of the rule.
>



Yeah, it also seems too possibly liberal about where it matches a rule
name. AUIU this should be the first token on a line; is that right?
OTOH, it won't handle any case where an action block is not the last
thing in the rule, since it only sets $fill_semicolon on seeing a
closing brace (on its own).

I just looked at the bison manual at gnu.org and also at `info bison` on
my local machine, and couldn't see any reference to semicolons being
optional at the end of a rule. Under the heading "Syntax of Grammar
Rules" it says this:

    A Bison grammar rule has the following general form:

         RESULT: COMPONENTS...;

Making it optional without putting that in the manual is just awful.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I just looked at the bison manual at gnu.org and also at `info bison` on
> my local machine, and couldn't see any reference to semicolons being
> optional at the end of a rule. Under the heading "Syntax of Grammar
> Rules" it says this:
>     A Bison grammar rule has the following general form:
>          RESULT: COMPONENTS...;
> Making it optional without putting that in the manual is just awful.

Yeah.  I wonder if they removed that info in 1.34 and failed to
put it back in 1.875?

Anyway, I'm of the opinion that omitting the semi is poor style
and our tools should insist on it even if Bison does not.  Thus,
I think the correct fix is for the scripts to complain about a
missing semi, not cope.

My initial look at parse.pl last night left me feeling pretty
disheartened about its robustness in general --- for example,
it looks like { } /* or */ inside a string literal or Bison
character token would break it completely, because it wouldn't
distinguish those cases from the same things outside a string.
It's just luck we haven't broken it yet (or, perhaps, we have
and nobody exercised the relevant productions yet?).

Probably, somebody who's a better Perl programmer than me
ought to take point on improving that.

            regards, tom lane


Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Andrew Dunstan
Date:
On 2/19/19 9:29 AM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I just looked at the bison manual at gnu.org and also at `info bison` on
>> my local machine, and couldn't see any reference to semicolons being
>> optional at the end of a rule. Under the heading "Syntax of Grammar
>> Rules" it says this:
>>     A Bison grammar rule has the following general form:
>>          RESULT: COMPONENTS...;
>> Making it optional without putting that in the manual is just awful.
> Yeah.  I wonder if they removed that info in 1.34 and failed to
> put it back in 1.875?
>
> Anyway, I'm of the opinion that omitting the semi is poor style
> and our tools should insist on it even if Bison does not.  Thus,
> I think the correct fix is for the scripts to complain about a
> missing semi, not cope.


Yeah, agreed.



>
> My initial look at parse.pl last night left me feeling pretty
> disheartened about its robustness in general --- for example,
> it looks like { } /* or */ inside a string literal or Bison
> character token would break it completely, because it wouldn't
> distinguish those cases from the same things outside a string.
> It's just luck we haven't broken it yet (or, perhaps, we have
> and nobody exercised the relevant productions yet?).
>
> Probably, somebody who's a better Perl programmer than me
> ought to take point on improving that.
>
>             


Agreed.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> On 2/19/19 9:29 AM, Tom Lane wrote:
>> Probably, somebody who's a better Perl programmer than me
>> ought to take point on improving that.

> Agreed.

Not seeing any motion on this, here's a draft patch to make these
scripts complain about missing semicolons.  Against the current
gram.y (which contains 2 such errors, as Michael noted) you
get output like

'/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y > preproc.y
unterminated rule at ./parse.pl line 370, <> line 1469.
make: *** [preproc.y] Error 255
make: *** Deleting file `preproc.y'

That's not *super* friendly, but it does give you the right line number
to look at in gram.y.  We could adjust the script (and the Makefile)
further so that the message would cite the gram.y filename, but I'm not
sure if it's worth the trouble.  Thoughts?

            regards, tom lane

diff --git a/src/interfaces/ecpg/preproc/check_rules.pl b/src/interfaces/ecpg/preproc/check_rules.pl
index 1388d05..3daff88 100644
--- a/src/interfaces/ecpg/preproc/check_rules.pl
+++ b/src/interfaces/ecpg/preproc/check_rules.pl
@@ -1,7 +1,7 @@
 #!/usr/bin/perl
 # src/interfaces/ecpg/preproc/check_rules.pl
 # test parser generator for ecpg
-# call with backend parser as stdin
+# call with backend grammar as stdin
 #
 # Copyright (c) 2009-2019, PostgreSQL Global Development Group
 #
@@ -47,6 +47,7 @@ my %replace_line = (

 my $block        = '';
 my $yaccmode     = 0;
+my $in_rule      = 0;
 my $brace_indent = 0;
 my (@arr, %found);
 my $comment     = 0;
@@ -131,10 +132,13 @@ while (<$parser_fh>)
             $found{$block} = 1;
             $cc++;
             $block = '';
+            $in_rule = 0 if $arr[$fieldIndexer] eq ';';
         }
         elsif (($arr[$fieldIndexer] =~ '[A-Za-z0-9]+:')
             || $arr[ $fieldIndexer + 1 ] eq ':')
         {
+            die "unterminated rule" if $in_rule;
+            $in_rule     = 1;
             $non_term_id = $arr[$fieldIndexer];
             $non_term_id =~ tr/://d;
         }
@@ -145,6 +149,8 @@ while (<$parser_fh>)
     }
 }

+die "unterminated rule" if $in_rule;
+
 close $parser_fh;
 if ($verbose)
 {
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index 6dee500..e219398 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -22,6 +22,7 @@ $path = "." unless $path;
 my $copymode              = 0;
 my $brace_indent          = 0;
 my $yaccmode              = 0;
+my $in_rule               = 0;
 my $header_included       = 0;
 my $feature_not_supported = 0;
 my $tokenmode             = 0;
@@ -288,6 +289,7 @@ sub main
                 @fields  = ();
                 $infield = 0;
                 $line    = '';
+                $in_rule = 0;
                 next;
             }

@@ -365,6 +367,8 @@ sub main
                 $line    = '';
                 @fields  = ();
                 $infield = 1;
+                die "unterminated rule" if $in_rule;
+                $in_rule = 1;
                 next;
             }
             elsif ($copymode)
@@ -415,6 +419,7 @@ sub main
             }
         }
     }
+    die "unterminated rule" if $in_rule;
     return;
 }


Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Michael Meskes
Date:
> Not seeing any motion on this, here's a draft patch to make these
> scripts complain about missing semicolons.  Against the current
> gram.y (which contains 2 such errors, as Michael noted) you
> get output like

Thanks Tom for looking into this. Are we agreed then that we want
gram.y to have semicolons? 

> '/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y >
> preproc.y
> unterminated rule at ./parse.pl line 370, <> line 1469.
> make: *** [preproc.y] Error 255
> make: *** Deleting file `preproc.y'
> 
> That's not *super* friendly, but it does give you the right line
> number
> to look at in gram.y.  We could adjust the script (and the Makefile)
> further so that the message would cite the gram.y filename, but I'm
> not
> sure if it's worth the trouble.  Thoughts?

IMO it's not worth it. We all know where the grammar is and that the
ecpg tools only parse that one file. Why putting effort into writing it
down too?

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [Bug Fix] ECPG: could not use set xxx to default statement

From
Tom Lane
Date:
Michael Meskes <meskes@postgresql.org> writes:
>> Not seeing any motion on this, here's a draft patch to make these
>> scripts complain about missing semicolons.  Against the current
>> gram.y (which contains 2 such errors, as Michael noted) you
>> get output like

> Thanks Tom for looking into this. Are we agreed then that we want
> gram.y to have semicolons? 

Hearing no objections, I pushed it that way.

>> That's not *super* friendly, but it does give you the right line
>> number to look at in gram.y.  We could adjust the script (and the Makefile)
>> further so that the message would cite the gram.y filename, but I'm
>> not sure if it's worth the trouble.  Thoughts?

> IMO it's not worth it. We all know where the grammar is and that the
> ecpg tools only parse that one file. Why putting effort into writing it
> down too?

I did manage to fix the "die" commands so that you get something like

unterminated rule at grammar line 1461

without the extraneous detail about the script's internals.
That seems clear enough from here.

I'm still disturbed by the scripts' ability to get fooled by
braces or comment markers within quoted strings.  However, that's
not something I have good ideas about how to fix, and there's not
any evidence that it's a live problem at the moment.

            regards, tom lane