Thread: [9.3 bug fix] ECPG does not escape backslashes
Hello,

I happened to find a trivial bug in ECPG while experimenting with 9.3 beta 2. Please find attached the patch to fix this. This is not specific to 9.3. Could you commit and backport this?

[Bug description]
Running "ecpg c:\command\a.pgc" produces the following line in a.c:

#line 1 "c:\command\a.pgc"

Then, compiling the resulting a.c with Visual Studio (cl.exe) issues the warning:

a.c(8) : warning C4129: 'c' : unrecognized character escape sequence

This is because ecpg doesn't escape \ in the #line string.

[How to fix]
Escape \ in the input file name like this:

#line 1 "c:\\command\\a.pgc"

This is necessary not only on Windows but also on UNIX/Linux. For your information, running "gcc -E di\\r/a.c" escapes \ and outputs the line:

# 1 "di\\r/a.c"

Regards
MauMau
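The fix described in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the function name escape_backslashes is hypothetical, and the actual ecpg code may be organised quite differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the ecpg sources): return a newly
 * allocated copy of "src" in which every backslash is doubled, so that the
 * result can be placed between the quotes of a #line directive.
 * The caller must free() the result. Returns NULL if malloc() fails.
 */
char *
escape_backslashes(const char *src)
{
    char *result;
    char *dest;

    assert(src != NULL);

    /* Worst case: every character is a backslash and doubles in size. */
    result = malloc(strlen(src) * 2 + 1);
    if (result == NULL)
        return NULL;

    dest = result;
    while (*src != '\0')
    {
        if (*src == '\\')
            *dest++ = '\\';     /* emit the escaping backslash first */
        *dest++ = *src++;       /* then copy the character itself */
    }
    *dest = '\0';

    return result;
}
```

With this sketch, the input c:\command\a.pgc becomes c:\\command\\a.pgc, which is the form proposed for the #line string.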
On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
> I happened to find a trivial bug in ECPG while experimenting with
> 9.3 beta 2. Please find attached the patch to fix this. This is
> not specific to 9.3. Could you commit and backport this?

This appears to be Windows specific. I don't have a Windows system to test with. How does Visual Studio handle #line entries with full path names? Are they all escaped? Or, better, do they have to be?

> This is necessary not only on Windows but also on UNIX/Linux. For
> your information, running "gcc -E di\\r/a.c" escapes \ and outputs
> the line:
>
> # 1 "di\\r/a.c"

Now this statement surprises me:

michael@feivel:~$ ecpg test\\\\a/init.pgc
michael@feivel:~$ grep line test\\\\a/init.c |head -1
#line 1 "test\\a/init.pgc"
michael@feivel:~$ gcc -o i test\\\\a/init.c -I /usr/include/postgresql/ -l ecpg
michael@feivel:~$

This seems to suggest that it works nicely on Linux. So what do you mean when saying the problem also occurs on Linux?

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On 07/04/2013 07:04 AM, Michael Meskes wrote:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> I happened to find a trivial bug in ECPG while experimenting with
>> 9.3 beta 2. Please find attached the patch to fix this. This is
>> not specific to 9.3. Could you commit and backport this?
> This appears to be Windows specific. I don't have a Windows system to test
> with. How does Visual Studio handle #line entries with full path names? Are
> they all escaped? Or, better, do they have to be?
>
>> This is necessary not only on Windows but also on UNIX/Linux. For
>> your information, running "gcc -E di\\r/a.c" escapes \ and outputs
>> the line:
>>
>> # 1 "di\\r/a.c"
> Now this statement surprises me:
>
> michael@feivel:~$ ecpg test\\\\a/init.pgc
> michael@feivel:~$ grep line test\\\\a/init.c |head -1
> #line 1 "test\\a/init.pgc"
> michael@feivel:~$ gcc -o i test\\\\a/init.c -I /usr/include/postgresql/ -l ecpg
> michael@feivel:~$
>
> This seems to suggest that it works nicely on Linux.
>

Really? I'd expect to see 4 backslashes in the #line directive, I think.

cheers

andrew
On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >#line 1 "test\\a/init.pgc"
> ...
>
> Really? I'd expect to see 4 backslashes in the #line directive, I think.

Eh, why? The four backslashes are two that are escaped for shell usage. The directory name in my example was "test\\a". What did I miss?

Michael
On 07/04/2013 08:31 AM, Michael Meskes wrote:
> On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
>>> michael@feivel:~$ grep line test\\\\a/init.c |head -1
>>> #line 1 "test\\a/init.pgc"
>> ...
>>
>> Really? I'd expect to see 4 backslashes in the #line directive, I think.
> Eh, why? The four backslashes are two that are escaped for shell usage.
> The directory name in my example was "test\\a". What did I miss?
>

Isn't the argument to #line a C string literal in which one would expect backslashes to be escaped? If not, how would it show a filename containing a '"' character?

[andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
[andrew@emma inst.92.5701]$ grep line x\\\"a/y.c
#line 1 "x\"a/y.pgc"

This must surely be wrong.

cheers

andrew
On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
>
> On 07/04/2013 08:31 AM, Michael Meskes wrote:
> >On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >>>michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >>>#line 1 "test\\a/init.pgc"
> >>...
> >>
> >>Really? I'd expect to see 4 backslashes in the #line directive, I think.
> >Eh, why? The four backslashes are two that are escaped for shell usage.
> >The directory name in my example was "test\\a". What did I miss?
> >
>
> Isn't the argument to #line a C string literal in which one would expect
> backslashes to be escaped? If not, how would it show a filename containing a
> '"' character?
>
> [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
> [andrew@emma inst.92.5701]$ grep line x\\\"a/y.c
> #line 1 "x\"a/y.pgc"
>
> This must surely be wrong.

I think it's correct. Quoting the gcc manual (http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax):

"However, if backslashes occur within file, they are considered ordinary text characters, not escape characters. None of the character escape sequences appropriate to string constants in C are processed. Thus, #include "x\n\\y" specifies a filename containing three backslashes. (Some systems interpret ‘\’ as a pathname separator. All of these also interpret ‘/’ the same way. It is most portable to use only ‘/’.)"

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 07/04/2013 08:58 AM, Andres Freund wrote:
> On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
>> On 07/04/2013 08:31 AM, Michael Meskes wrote:
>>> On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
>>>>> michael@feivel:~$ grep line test\\\\a/init.c |head -1
>>>>> #line 1 "test\\a/init.pgc"
>>>> ...
>>>>
>>>> Really? I'd expect to see 4 backslashes in the #line directive, I think.
>>> Eh, why? The four backslashes are two that are escaped for shell usage.
>>> The directory name in my example was "test\\a". What did I miss?
>>>
>> Isn't the argument to #line a C string literal in which one would expect
>> backslashes to be escaped? If not, how would it show a filename containing a
>> '"' character?
>>
>> [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
>> [andrew@emma inst.92.5701]$ grep line x\\\"a/y.c
>> #line 1 "x\"a/y.pgc"
>>
>> This must surely be wrong.
> I think it's correct. Quoting the gcc manual
> (http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax)
> "However, if backslashes occur within file, they are considered ordinary
> text characters, not escape characters. None of the character escape
> sequences appropriate to string constants in C are processed. Thus,
> #include "x\n\\y" specifies a filename containing three
> backslashes. (Some systems interpret ‘\’ as a pathname separator. All of
> these also interpret ‘/’ the same way. It is most portable to use only
> ‘/’.)"

Well, that refers to #include, but for the sake of argument I'll assume the same rule applies to #line. So this just gets processed by stripping the surrounding quotes? Well, I guess I learn something every day.

cheers

andrew
On 2013-07-04 09:12:37 -0400, Andrew Dunstan wrote:
>
> On 07/04/2013 08:58 AM, Andres Freund wrote:
> >On 2013-07-04 08:50:34 -0400, Andrew Dunstan wrote:
> >>On 07/04/2013 08:31 AM, Michael Meskes wrote:
> >>>On Thu, Jul 04, 2013 at 07:58:39AM -0400, Andrew Dunstan wrote:
> >>>>>michael@feivel:~$ grep line test\\\\a/init.c |head -1
> >>>>>#line 1 "test\\a/init.pgc"
> >>>>...
> >>>>
> >>>>Really? I'd expect to see 4 backslashes in the #line directive, I think.
> >>>Eh, why? The four backslashes are two that are escaped for shell usage.
> >>>The directory name in my example was "test\\a". What did I miss?
> >>>
> >>Isn't the argument to #line a C string literal in which one would expect
> >>backslashes to be escaped? If not, how would it show a filename containing a
> >>'"' character?
> >>
> >> [andrew@emma inst.92.5701]$ bin/ecpg x\\\"a/y.pgc
> >> [andrew@emma inst.92.5701]$ grep line x\\\"a/y.c
> >> #line 1 "x\"a/y.pgc"
> >>
> >>This must surely be wrong.
> >I think it's correct. Quoting the gcc manual
> >(http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Include-Syntax.html#Include-Syntax)
> >"However, if backslashes occur within file, they are considered ordinary
> >text characters, not escape characters. None of the character escape
> >sequences appropriate to string constants in C are processed. Thus,
> >#include "x\n\\y" specifies a filename containing three
> >backslashes. (Some systems interpret ‘\’ as a pathname separator. All of
> >these also interpret ‘/’ the same way. It is most portable to use only
> >‘/’.)"
>
> Well, that refers to #include, but for the sake of argument I'll assume the
> same rule applies to #line. So this just gets processed by stripping the
> surrounding quotes? Well I guess I learn something every day.

Gah. You're right. I only remembered the rules for #include and thought that would be applicable.

But: http://gcc.gnu.org/onlinedocs/gcc-4.8.1/cpp/Line-Control.html#Line-Control :

»filename is interpreted according to the normal rules for a string constant: backslash escapes are interpreted. This is different from ‘#include’. Previous versions of CPP did not interpret escapes in ‘#line’; we have changed it because the standard requires they be interpreted, and most other compilers do.«

Greetings,

Andres Freund
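The rule quoted from the Line-Control page can be observed from inside a program, because __FILE__ expands to the presumed file name set by the most recent #line directive. The following is a small sketch, assuming a compiler that interprets escapes in #line as the GCC manual describes; the function names presumed_file_name and check_presumed_file_name are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Set the presumed file name. Because the string is interpreted like a
 * normal string constant, each doubled backslash below stands for a single
 * backslash in the resulting name.
 */
#line 1 "c:\\command\\a.pgc"

/* Hypothetical helper: return the file name the compiler now reports. */
const char *
presumed_file_name(void)
{
    return __FILE__;
}

/* Hypothetical check: the reported name contains single backslashes. */
void
check_presumed_file_name(void)
{
    assert(strcmp(presumed_file_name(), "c:\\command\\a.pgc") == 0);
}
```

Had the directive been written with single backslashes, the compiler would have tried to interpret \c and \a as escape sequences, which is where the "unknown escape sequence" warning comes from.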
On Thu, Jul 04, 2013 at 03:18:12PM +0200, Andres Freund wrote:
> Previous versions of CPP did not interpret escapes in ‘#line’; we have
> changed it because the standard requires they be interpreted, and most
> other compilers do.«

So that means MauMau was right and backslashes have to be escaped in filenames in #line directives, right? Apparently my examples were badly chosen as I didn't see an error no matter how many backslashes I had.

Michael
On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
> not specific to 9.3. Could you commit and backport this?

Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Michael
From: "Michael Meskes" <meskes@postgresql.org>
> So that means MauMau was right and backslashes have to be escaped in
> filenames in #line directives, right? Apparently my examples were badly
> chosen as I didn't see an error no matter how many backslashes I had.

Yes, the example below shows the case:

[maumau@myhost ~]$ touch ab\\c/a.pgc
[maumau@myhost ~]$ ecpg ab\\c/a.pgc
[maumau@myhost ~]$ gcc -c -I/tuna/pgsql/include ab\\c/a.c
ab\c/a.c:8:9: warning: unknown escape sequence '\c'
[maumau@myhost ~]$

> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Thank you very much for your quick support.

Regards
MauMau
On 07/05/2013 05:16 AM, Michael Meskes wrote:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> not specific to 9.3. Could you commit and backport this?
> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.
>

This looks incomplete. Surely just escaping backslashes alone is not enough. I suspect at least the " char and any chars below 0x20 should be quoted also.

cheers

andrew
On Fri, Jul 05, 2013 at 08:08:06AM -0400, Andrew Dunstan wrote:
> This looks incomplete. Surely just escaping backslashes alone is not
> enough. I suspect at least the " char and any chars below 0x20
> should be quoted also.

Right, this didn't even occur to me, but there are surely more characters that need to be escaped. Gotta dig into this. Thanks for pointing this out.

Michael
Michael Meskes <meskes@postgresql.org> writes:
> On Wed, Jul 03, 2013 at 07:22:48PM +0900, MauMau wrote:
>> not specific to 9.3. Could you commit and backport this?
> Committed to 8.4, 9.0, 9.1, 9.2 and HEAD.

Um ... 9.3 is a separate branch now, please fix it there also.

regards, tom lane
On Fri, Jul 05, 2013 at 09:41:26AM -0400, Tom Lane wrote:
> Um ... 9.3 is a separate branch now, please fix it there also.

Done. Seems I missed a new branch - yet again. Sorry and thanks for pointing it out to me.

michael
On Fri, Jul 05, 2013 at 08:08:06AM -0400, Andrew Dunstan wrote:
> This looks incomplete. Surely just escaping backslashes alone is not
> enough. I suspect at least the " char and any chars below 0x20
> should be quoted also.

I just added the " char; however, my tests did not bring up any problem with chars below 0x20. If anybody sees another character breaking ecpg, please tell me and I'll fix it.

Michael