Thread: [9.3 bug fix] ECPG does not escape backslashes

[9.3 bug fix] ECPG does not escape backslashes

From
"MauMau"
Date:
Hello,

I happened to find a trivial bug in ECPG while experimenting with 9.3 beta
2.  Please find attached the patch to fix this.  The bug is not specific to
9.3.  Could you commit and backport this?


[Bug description]
Running "ecpg c:\command\a.pgc" produces the following line in a.c:

#line 1 "c:\command\a.pgc"

Then, compiling the resulting a.c with Visual Studio (cl.exe) issues the
warning:

a.c(8) : warning C4129: 'c' : unrecognized character escape sequence

This is because ecpg doesn't escape \ in the #line string.


[How to fix]
Escape \ in the input file name like this:

#line 1 "c:\\command\\a.pgc"

This is necessary not only on Windows but also on UNIX/Linux.  For your
information, running "gcc -E di\\r/a.c" escapes \ and outputs the line:

# 1 "di\\r/a.c"
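
A minimal sketch of the escaping described above, for illustration only. The
helper name escape_backslashes and its signature are hypothetical and are not
taken from the attached patch; it simply doubles every backslash so that the
result can be placed between the quotes of a #line directive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the function from the attached patch).
 * Copies src into dst, doubling every backslash.
 * Returns 0 on success, -1 if dst is too small for the result plus NUL.
 */
static int
escape_backslashes(const char *src, char *dst, size_t dstlen)
{
    size_t j = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; src[i] != '\0'; i++)
    {
        size_t need = (src[i] == '\\') ? 2 : 1;

        /* keep one byte in reserve for the terminating NUL */
        if (j + need >= dstlen)
            return -1;
        if (src[i] == '\\')
            dst[j++] = '\\';
        dst[j++] = src[i];
    }
    if (j >= dstlen)
        return -1;
    dst[j] = '\0';
    return 0;
}
```

With the input file name c:\command\a.pgc, the buffer would hold
c:\\command\\a.pgc, which matches the #line string shown above.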


Regards
MauMau

Attachment

Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
> I happened to find a trivial bug of ECPG while experimenting with
> 9.3 beta 2.  Please find attached the patch to fix this.  This is
> not specific to 9.3.  Could you commit and backport this?

This appears to be Windows specific. I don't have a Windows system to test
with. How does Visual Studio handle #line entries with full path names? Are
they all escaped? Or, better, do they have to be?

> This is necessary not only on Windows but also on UNIX/Linux.  For
> your information, running "gcc -E di\\r/a.c" escapes \ and outputs
> the line:
> 
> # 1 "di\\r/a.c"

Now this statement surprises me:

michael@feivel:~$ ecpg test\\\\a/init.pgc 
michael@feivel:~$ grep line test\\\\a/init.c |head -1
#line 1 "test\\a/init.pgc"
michael@feivel:~$ gcc -o i test\\\\a/init.c -I /usr/include/postgresql/ -l ecpg
michael@feivel:~$ 

This seems to suggest that it works nicely on Linux. 

So what do you mean when saying the problem also occurs on Linux?

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andrew Dunstan
Date:
On 07/04/2013 07:04 AM, Michael Meskes wrote:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> I happened to find a trivial bug of ECPG while experimenting with
>> 9.3 beta 2.  Please find attached the patch to fix this.  This is
>> not specific to 9.3.  Could you commit and backport this?
> This appears to be Windows specific. I don't have a Windows system to test
> with. How does Visual Studio handle #line entries with full path names? Are
> they all escaped? Or, better, do they have to be?
>
>> This is necessary not only on Windows but also on UNIX/Linux.  For
>> your information, running "gcc -E di\\r/a.c" escapes \ and outputs
>> the line:
>>
>> # 1 "di\\r/a.c"
> Now this statement surprises me:
>
> michael@feivel:~$ ecpg test\\\\a/init.pgc
> michael@feivel:~$ grep line test\\\\a/init.c |head -1
> #line 1 "test\\a/init.pgc"
> michael@feivel:~$ gcc -o i test\\\\a/init.c -I /usr/include/postgresql/ -l ecpg
> michael@feivel:~$
>
> This seems to suggest that it works nicely on Linux.
>

Really? I'd expect to see 4 backslashes in the #line directive, I think.

cheers

andrew




Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >#line 1 "test\\a/init.pgc"
> ...
> 
> Really? I'd expect to see 4 backslashes in the #line directive, I think.

Eh, why? The four backslashes are really two, each escaped for shell usage.
The directory name in my example was "test\\a". What did I miss?

Michael



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andrew Dunstan
Date:
On 07/04/2013 08:31 AM, Michael Meskes wrote:
> On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
>>> michael@feivel:~$ grep line test\\\\a/init.c |head -1
>>> #line 1 "test\\a/init.pgc"
>> ...
>>
>> Really? I'd expect to see 4 backslashes in the #line directive, I think.
> Eh, why? The four backslashes are really two, each escaped for shell usage.
> The directory name in my example was "test\\a". What did I miss?
>

Isn't the argument to #line a C string literal in which one would expect 
backslashes to be escaped? If not, how would it show a filename 
containing a '"' character?

   [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
   [andrew@emma inst.92.5701]$ grep line  x\\\"a/y.c
   #line 1 "x\"a/y.pgc"

This must surely be wrong.


cheers

andrew




Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andres Freund
Date:
On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
>
> On 07/04/2013 08:31 AM, Michael Meskes wrote:
> >On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >>>michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >>>#line 1 "test\\a/init.pgc"
> >>...
> >>
> >>Really? I'd expect to see 4 backslashes in the #line directive, I think.
> >Eh, why? The four backslashes are really two, each escaped for shell usage.
> >The directory name in my example was "test\\a". What did I miss?
> >
>
> Isn't the argument to #line a C string literal in which one would expect
> backslashes to be escaped? If not, how would it show a filename containing a
> '"' character?
>
>    [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
>    [andrew@emma inst.92.5701]$ grep line  x\\\"a/y.c
>    #line 1 "x\"a/y.pgc"
>
> This must surely be wrong.

I think it's correct. Quoting the gcc manual
(http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax)
"However, if backslashes occur within file, they are considered ordinary
text characters, not escape characters. None of the character escape
sequences appropriate to string constants in C are processed. Thus,
#include "x\n\\y" specifies a filename containing three
backslashes. (Some systems interpret ‘\’ as a pathname separator. All of
these also interpret ‘/’ the same way. It is most portable to use only
‘/’.)"

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andrew Dunstan
Date:
On 07/04/2013 08:58 AM, Andres Freund wrote:
> On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
>> On 07/04/2013 08:31 AM, Michael Meskes wrote:
>>> On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
>>>>> michael@feivel:~$ grep line test\\\\a/init.c |head -1
>>>>> #line 1 "test\\a/init.pgc"
>>>> ...
>>>>
>>>> Really? I'd expect to see 4 backslashes in the #line directive, I think.
>>> Eh, why? The four backslashes are really two, each escaped for shell usage.
>>> The directory name in my example was "test\\a". What did I miss?
>>>
>> Isn't the argument to #line a C string literal in which one would expect
>> backslashes to be escaped? If not, how would it show a filename containing a
>> '"' character?
>>
>>     [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
>>     [andrew@emma inst.92.5701]$ grep line  x\\\"a/y.c
>>     #line 1 "x\"a/y.pgc"
>>
>> This must surely be wrong.
> I think it's correct. Quoting the gcc manual
> (http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax)
> "However, if backslashes occur within file, they are considered ordinary
> text characters, not escape characters. None of the character escape
> sequences appropriate to string constants in C are processed. Thus,
> #include "x\n\\y" specifies a filename containing three
> backslashes. (Some systems interpret ‘\’ as a pathname separator. All of
> these also interpret ‘/’ the same way. It is most portable to use only
> ‘/’.)"

Well, that refers to #include, but for the sake of argument I'll assume
the same rule applies to #line. So this just gets processed by stripping
the surrounding quotes? Well I guess I learn something every day.


cheers

andrew



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andres Freund
Date:
On 2013-07-04 09:12:37 -0400, Andrew Dunstan wrote:
>
> On 07/04/2013 08:58 AM, Andres Freund wrote:
> >On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
> >>On 07/04/2013 08:31 AM, Michael Meskes wrote:
> >>>On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >>>>>michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >>>>>#line 1 "test\\a/init.pgc"
> >>>>...
> >>>>
> >>>>Really? I'd expect to see 4 backslashes in the #line directive, I think.
> >>>Eh, why? The four backslashes are really two, each escaped for shell usage.
> >>>The directory name in my example was "test\\a". What did I miss?
> >>>
> >>Isn't the argument to #line a C string literal in which one would expect
> >>backslashes to be escaped? If not, how would it show a filename containing a
> >>'"' character?
> >>
> >>    [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
> >>    [andrew@emma inst.92.5701]$ grep line  x\\\"a/y.c
> >>    #line 1 "x\"a/y.pgc"
> >>
> >>This must surely be wrong.
> >I think it's correct. Quoting the gcc manual
> >(http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax)
> >"However, if backslashes occur within file, they are considered ordinary
> >text characters, not escape characters. None of the character escape
> >sequences appropriate to string constants in C are processed. Thus,
> >#include "x\n\\y" specifies a filename containing three
> >backslashes. (Some systems interpret ‘\’ as a pathname separator. All of
> >these also interpret ‘/’ the same way. It is most portable to use only
> >‘/’.)"
>
> Well, that refers to #include, but for the sake of argument I'll assume the
> same rule applies to #line. So this just gets processed by stripping the
> surrounding quotes? Well I guess I learn something every day.

Gah. You're right. I only remembered the rules for #include and thought
that would be applicable. But:
http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Line-Control.html#Line-Control :
»filename is interpreted according to the normal rules for a string
constant: backslash escapes are interpreted. This is different from
‘#include’.

Previous versions of CPP did not interpret escapes in ‘#line’; we have
changed it because the standard requires they be interpreted, and most
other compilers do.«
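
To see the effect of that rule, here is a small sketch. It assumes a compiler
that interprets escapes in #line as the quoted manual describes (recent gcc
does); the file name dir\a.pgc and both function names are made up for the
example. Two backslashes written in the directive end up as a single
backslash in the file name the compiler records.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the backslashes in a string. */
static size_t
count_backslashes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (*s == '\\')
            n++;
    return n;
}

/*
 * Returns the file name the compiler recorded after the #line directive.
 * The directive contains two backslashes; because escapes are interpreted,
 * the recorded name is dir\a.pgc with just one.
 */
static const char *
recorded_file_name(void)
{
#line 100 "dir\\a.pgc"
    return __FILE__;
}
```

Calling count_backslashes(recorded_file_name()) should give 1 under the
assumption above. Writing the directive with a single backslash instead is
what triggers the "unknown escape sequence" style of warning.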

Greetings,

Andres Freund




Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Thu, Jul 04, 2013 at 03:18:12PM +0200, Andres Freund wrote:
> Previous versions of CPP did not interpret escapes in ‘#line’; we have
> changed it because the standard requires they be interpreted, and most
> other compilers do.«

So that means MauMau was right and backslashes have to be escaped in filenames
in #line directives, right? Apparently my examples were badly chosen as I
didn't see an error no matter how many backslashes I had.

Michael



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
> not specific to 9.3.  Could you commit and backport this?

Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Michael



Re: [9.3 bug fix] ECPG does not escape backslashes

From
"MauMau"
Date:
From: "Michael Meskes" <meskes@postgresql.org>
> So that means MauMau was right and backslashes have to be escaped in 
> filenames
> in #line directives, right? Apparently my examples were badly chosen as I
> didn't see an error no matter how many backslashes I had.

Yes, the example below shows the case:

[maumau@myhost ~]$ touch ab\\c/a.pgc
[maumau@myhost ~]$ ecpg ab\\c/a.pgc
[maumau@myhost ~]$ gcc -c -I/tuna/pgsql/include ab\\c/a.c
ab\c/a.c:8:9: warning: unknown escape sequence '\c'
[maumau@myhost ~]$


> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Thank you very much for your quick support.

Regards
MauMau




Re: [9.3 bug fix] ECPG does not escape backslashes

From
Andrew Dunstan
Date:
On 07/05/2013 05:16 AM, Michael Meskes wrote:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> not specific to 9.3.  Could you commit and backport this?
> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.
>

This looks incomplete. Surely just escaping backslashes alone is not 
enough. I suspect at least the " char and any chars below 0x20 should be 
quoted also.
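
Roughly what I have in mind, as an untested sketch rather than a patch against
ecpg. The function name escape_for_line_directive is hypothetical, and
emitting control characters as three-digit octal escapes is just one possible
choice, not something the committed fix does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes src into dst in a form suitable for the
 * string in a #line directive. Backslash and double quote get a leading
 * backslash; bytes below 0x20 become three-digit octal escapes.
 * Returns 0 on success, -1 if dst is too small.
 */
static int
escape_for_line_directive(const char *src, char *dst, size_t dstlen)
{
    size_t j = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; src[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) src[i];
        char piece[5];      /* longest piece is "\ooo" plus NUL */
        int n;

        if (c == '\\' || c == '"')
            n = snprintf(piece, sizeof piece, "\\%c", c);
        else if (c < 0x20)
            n = snprintf(piece, sizeof piece, "\\%03o", (unsigned) c);
        else
            n = snprintf(piece, sizeof piece, "%c", c);

        /* keep one byte in reserve for the terminating NUL */
        if (j + (size_t) n >= dstlen)
            return -1;
        memcpy(dst + j, piece, (size_t) n);
        j += (size_t) n;
    }
    if (j >= dstlen)
        return -1;
    dst[j] = '\0';
    return 0;
}
```

Octal is used instead of hex here because a hex escape has no length limit
and would swallow any hex digits that follow it in the file name.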

cheers

andrew




Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Fri, Jul 05, 2013 at 08:08:06AM -0400, Andrew Dunstan wrote:
> This looks incomplete. Surely just escaping backslashes alone is not
> enough. I suspect at least the " char and any chars below 0x20
> should be quoted also.

Right, this didn't even occur to me, but there are surely more characters that
need to be escaped. Gotta dig into this.

Thanks for pointing this out.

Michael



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Tom Lane
Date:
Michael Meskes <meskes@postgresql.org> writes:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> not specific to 9.3.  Could you commit and backport this?

> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Um ... 9.3 is a separate branch now, please fix it there also.
        regards, tom lane



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Fri, Jul 05, 2013 at 09:41:26AM -0400, Tom Lane wrote:
> Um ... 9.3 is a separate branch now, please fix it there also.

Done. Seems I missed a new branch - yet again. Sorry and thanks for pointing it
out to me.

michael



Re: [9.3 bug fix] ECPG does not escape backslashes

From
Michael Meskes
Date:
On Fri, Jul 05, 2013 at 08:08:06AM -0400, Andrew Dunstan wrote:
> This looks incomplete. Surely just escaping backslashes alone is not
> enough. I suspect at least the " char and any chars below 0x20
> should be quoted also.

I just added the " char. However, my tests did not bring up any problem with
chars below 0x20. If anybody sees another character breaking ecpg, please tell
me and I'll fix it.

Michael