Thread: Backslashes in string literals

Backslashes in string literals

From
"Kevin Grittner"
Date:
I've just been bitten by the "backslash in string literals" issue.  I
have reviewed the mailing lists and the TODO list.  I see that the
direction PostgreSQL is headed is to drop the nonstandard escapes,
unless an extended literal is explicitly used.  I've attached a patch
which supports this as a configure option, using a
--enable-standard-strings switch.  It's probably somewhat naive, but
maybe it can be helpful.

-Kevin



Attachment

Re: Backslashes in string literals

From
Bruce Momjian
Date:
I think we we will be turning on escape_string_warning in 8.2 and allow
standard_conforming_strings to be optionally turned on in that releaes.
I will keep the patch for us in completing that item.

This has been saved for the 8.2 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Kevin Grittner wrote:
> I've just been bitten by the "backslash in string literals" issue.  I
> have reviewed the mailing lists and the TODO list.  I see that the
> direction PostgreSQL is headed is to drop the nonstandard escapes,
> unless an extended literal is explicitly used.  I've attached a patch
> which supports this as a configure option, using a
> --enable-standard-strings switch.  It's probably somewhat naive, but
> maybe it can be helpful.
> 
> -Kevin
> 
> 

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
Peter Eisentraut
Date:
Kevin Grittner wrote:
> direction PostgreSQL is headed is to drop the nonstandard escapes,
> unless an extended literal is explicitly used.  I've attached a patch
> which supports this as a configure option, using a
> --enable-standard-strings switch.

There is already a run-time configuration option 
standard_conforming_strings which does what you seem to have in mind.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Backslashes in string literals

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Kevin Grittner wrote:
> > direction PostgreSQL is headed is to drop the nonstandard escapes,
> > unless an extended literal is explicitly used.  I've attached a patch
> > which supports this as a configure option, using a
> > --enable-standard-strings switch.
> 
> There is already a run-time configuration option 
> standard_conforming_strings which does what you seem to have in mind.

Yes, but right now it is read-only.  We didn't have time to allow it to
be set to true in this release.  I think it has to wait for 8.2.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
>>> On Fri, Dec 9, 2005 at 11:24 am, in message
<200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
<peter_e@gmx.net> wrote:

> Kevin Grittner wrote:
>> direction PostgreSQL is headed is to drop the nonstandard escapes,
>> unless an extended literal is explicitly used.  I've attached a
patch
>> which supports this as a configure option, using a
>> -- enable- standard- strings switch.
> 
> There is already a run- time configuration option 
> standard_conforming_strings which does what you seem to have in
mind.

As Bruce has mentioned, this is currently read-only, set to off.

I needed something fast, and I could see a way to do it quickly with a
configure switch, to compile it for standard behavior.  Since the
non-standard behavior is in the lexer, I couldn't see any reasonable way
to base it on a runtime switch.  I'm curious what is intended here.  Can
anyone give a one-paragraph explanation of how this configuration option
will work?

-Kevin




Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> >>> On Fri, Dec 9, 2005 at 11:24 am, in message
> <200512091824.28760.peter_e@gmx.net>, Peter Eisentraut
> <peter_e@gmx.net> wrote:
> 
> > Kevin Grittner wrote:
> >> direction PostgreSQL is headed is to drop the nonstandard escapes,
> >> unless an extended literal is explicitly used.  I've attached a
> patch
> >> which supports this as a configure option, using a
> >> -- enable- standard- strings switch.
> > 
> > There is already a run- time configuration option 
> > standard_conforming_strings which does what you seem to have in
> mind.
> 
> As Bruce has mentioned, this is currently read-only, set to off.
> 
> I needed something fast, and I could see a way to do it quickly with a
> configure switch, to compile it for standard behavior.  Since the
> non-standard behavior is in the lexer, I couldn't see any reasonable way
> to base it on a runtime switch.  I'm curious what is intended here.  Can
> anyone give a one-paragraph explanation of how this configuration option
> will work?

Have you read our documentation?

http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-SYNTAX-CONSTANTShttp://www.postgresql.org/docs/8.1/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION

Between those and the release notes, I don't know what additional
information you want.  In the future you will set
standard_conforming_strings to on and backslashes will be treated
literally.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
>>> On Sat, Dec 10, 2005 at  8:01 pm, in message
<200512110201.jBB21PE16562@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote: 
> Kevin Grittner wrote:
>> Since the
>> non- standard behavior is in the lexer, I couldn't see any
reasonable way
>> to base it on a runtime switch.  I'm curious what is intended here. 
Can
>> anyone give a one- paragraph explanation of how this configuration
option
>> will work?
> 
> Have you read our documentation?
>     http://www.postgresql.org/docs/8.1/static/sql- syntax.html#SQL-
SYNTAX- CONSTANTS
>     http://www.postgresql.org/docs/8.1/static/runtime- config-
compatible.html#RUNTI
> ME- CONFIG- COMPATIBLE- VERSION

Yes.

> Between those and the release notes, I don't know what additional
> information you want.  In the future you will set
> standard_conforming_strings to on and backslashes will be treated
> literally.

Perhaps my language was ambiguous.  I'm not curious about the intended
behavior from a user perspective, but what I might have missed in the
source code which would have allowed me to write my patch to better
comply with the documentation you cited.  Since the problem is in the
lexer, the only way I could see to implement it as a run-time
configuration option, rather than a compile-time option, would be to
duplicate the lexer and maintain two sets of rules in parallel.  I
generally try to avoid maintaining two parallel copies of code.  I'm
curious whether I missed some other programming approach.

-Kevin



Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> > Between those and the release notes, I don't know what additional
> > information you want.  In the future you will set
> > standard_conforming_strings to on and backslashes will be treated
> > literally.
> 
> Perhaps my language was ambiguous.  I'm not curious about the intended
> behavior from a user perspective, but what I might have missed in the
> source code which would have allowed me to write my patch to better
> comply with the documentation you cited.  Since the problem is in the
> lexer, the only way I could see to implement it as a run-time
> configuration option, rather than a compile-time option, would be to
> duplicate the lexer and maintain two sets of rules in parallel.  I
> generally try to avoid maintaining two parallel copies of code.  I'm
> curious whether I missed some other programming approach.

Oh, that question.  :-)  We haven't looked yet at what it will require
to do this in the lexer, but I think what we will eventually do is to
add a more generalized filter to the lexer, and have the actions behave
differntly based on the boolean of whether we are using sql-standard
strings.

If you keep you eye on hackers or the committers messages you will see
when we commit the change for 8.2.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
We found a bug in the code from my first patch.  Since it was a low
frequency, non-destructive type of problem for us, I was able to take my
time and look over the task a little more closely.  Attached is a patch
which should come close to implementing the TODO.  In particular, it is
now implemented as a configurable option, which can be set in the
postgresql.conf file or at run time.  There are some remaining issues:

(1)  I couldn't figure out the best way to obtain a value for
standard_conforming_strings in the psql version of the scanner.  For our
needs, could just assume it is always on, so I left it that way.
Someone with a better handle on this issue can hopefully finish that
part.  Alternatively, if you give me some direction, I might have time
to generalize it.  As far as I can tell from some testing today,
everything works fine issuing statements through a connection, but psql
isn't settled down.

(2)  There should probably be some tests added to exercise these
options.

(3)  I took a quick shot at the documentation, but I'm sure I didn't
cover everything.

(4)  I made the changes on REL8_1_STABLE, since we need it now.  This
patch is relative to that branch, not the trunk.

I hope this is helpful.  If there's anything I should have done
differently, please let me know, so I can try to do better next time.

-Kevin


>>> On Fri, Dec 9, 2005 at 11:23 am, in message
<200512091723.jB9HNjB16785@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:

> I think we we will be turning on escape_string_warning in 8.2 and
allow
> standard_conforming_strings to be optionally turned on in that
releaes.
> I will keep the patch for us in completing that item.
>
> This has been saved for the 8.2 release:
>
>     http://momjian.postgresql.org/cgi- bin/pgpatches_hold


Attachment

Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
>>> On Wed, Jan 25, 2006 at  4:46 pm, in message
<43D7AB6B.EE98.0025.0@wicourts.gov>, "Kevin Grittner"
<Kevin.Grittner@wicourts.gov> wrote:
>
> (2)  There should probably be some tests added to exercise these
> options.

Attached is a patch to address this one.  Note that until psql is
fixed, this test will fail.  I manually generated a portion of the text
to match what I expect to get once psql is fixed, so there could be
typos.

-Kevin




Attachment

Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> We found a bug in the code from my first patch.  Since it was a low
> frequency, non-destructive type of problem for us, I was able to take my
> time and look over the task a little more closely.  Attached is a patch
> which should come close to implementing the TODO.  In particular, it is
> now implemented as a configurable option, which can be set in the
> postgresql.conf file or at run time.  There are some remaining issues:
> 
> (1)  I couldn't figure out the best way to obtain a value for
> standard_conforming_strings in the psql version of the scanner.  For our
> needs, could just assume it is always on, so I left it that way. 
> Someone with a better handle on this issue can hopefully finish that
> part.  Alternatively, if you give me some direction, I might have time
> to generalize it.  As far as I can tell from some testing today,
> everything works fine issuing statements through a connection, but psql
> isn't settled down.

Sounds like you made great progress!

The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
psql.  Take a look for psql's session_username().  It is called
everytime the prompt is printed if the username is required.  One great
feature of using pqSaveParameterStatus() is that it reads server packets
and keeps the tracked value updated for you without query overhead.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
>>> On Wed, Feb 1, 2006 at 10:50 am, in message
<200602011650.k11GoiU23147@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote: 
> Kevin Grittner wrote:
>> We found a bug in the code from my first patch.  Since it was a low
>> frequency, non- destructive type of problem for us, I was able to
take my
>> time and look over the task a little more closely.  Attached is a
patch
>> which should come close to implementing the TODO.  In particular, it
is
>> now implemented as a configurable option, which can be set in the
>> postgresql.conf file or at run time.  There are some remaining
issues:
>> 
>> (1)  I couldn't figure out the best way to obtain a value for
>> standard_conforming_strings in the psql version of the scanner.  For
our
>> needs, could just assume it is always on, so I left it that way. 
>> Someone with a better handle on this issue can hopefully finish
that
>> part.  Alternatively, if you give me some direction, I might have
time
>> to generalize it.  As far as I can tell from some testing today,
>> everything works fine issuing statements through a connection, but
psql
>> isn't settled down.
> 
> Sounds like you made great progress!

Thanks.  It was actually pretty easy once I took the time to learn
flex.  I'd kinda winged it in my emergency compile-time version.  I'm
pretty sure that what I've done works; my biggest concern is over what
I've missed.  For example, I was using pg_dump and pg_restore today and
realized that these, and other applications, likely need some kind of
work to support the feature.

> The proper way to do (1) is to call libpq's pqSaveParameterStatus()
from
> psql.  Take a look for psql's session_username().  It is called
> everytime the prompt is printed if the username is required.  One
great
> feature of using pqSaveParameterStatus() is that it reads server
packets
> and keeps the tracked value updated for you without query overhead.

I'll take a look at it.  If I feel confident that I "get it", I'll do
the work and post another patch.  Would you prefer that I resend the
whole works, or just the delta?

Also, since we're doing this out of need to fix the issue on our
production system, I'm compelled to work on the stable branch.  Is it OK
to post patches from the tip of that branch, or should I really check
out the trunk (HEAD), too, and replicate it there for my patch posts?

-Kevin



Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> >> to generalize it.  As far as I can tell from some testing today,
> >> everything works fine issuing statements through a connection, but
> psql
> >> isn't settled down.
> > 
> > Sounds like you made great progress!
> 
> Thanks.  It was actually pretty easy once I took the time to learn
> flex.  I'd kinda winged it in my emergency compile-time version.  I'm
> pretty sure that what I've done works; my biggest concern is over what
> I've missed.  For example, I was using pg_dump and pg_restore today and
> realized that these, and other applications, likely need some kind of
> work to support the feature.

Right, I will look over the rest of the code and fix any places you
missed.  I think most of it centers around ESCAPE_STRING_SYNTAX usage.

> > The proper way to do (1) is to call libpq's pqSaveParameterStatus()
> from
> > psql.  Take a look for psql's session_username().  It is called
> > everytime the prompt is printed if the username is required.  One
> great
> > feature of using pqSaveParameterStatus() is that it reads server
> packets
> > and keeps the tracked value updated for you without query overhead.
> 
> I'll take a look at it.  If I feel confident that I "get it", I'll do
> the work and post another patch.  Would you prefer that I resend the
> whole works, or just the delta?

I would like the whole patch rather than just an additional one.  Again,
I will review it and polish whatever you don't do.

> Also, since we're doing this out of need to fix the issue on our
> production system, I'm compelled to work on the stable branch.  Is it OK
> to post patches from the tip of that branch, or should I really check
> out the trunk (HEAD), too, and replicate it there for my patch posts?

The branch is fine.  I will merge any changes in to HEAD.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> >>> On Wed, Feb 1, 2006 at 10:50 am, in message
> <200602011650.k11GoiU23147@candle.pha.pa.us>, Bruce Momjian
> <pgman@candle.pha.pa.us> wrote: 
> >> 
> >> (1)  I couldn't figure out the best way to obtain a value for
> >> standard_conforming_strings in the psql version of the scanner.
> > 
> > The proper way to do (1) is to call libpq's pqSaveParameterStatus()
> from
> > psql.  Take a look for psql's session_username().  It is called
> > everytime the prompt is printed if the username is required.  One
> great
> > feature of using pqSaveParameterStatus() is that it reads server
> packets
> > and keeps the tracked value updated for you without query overhead.
> 
> My attempt to do as you suggest isn't working.  It behaves as though
> the standard_strings() function I added to common.c is always returning
> false.  (If I comment out the reference the function, and the else
> clause, I can get psql to work with the "on" state; otherwise, no joy. 
> The back end is working fine in all my tests.)  I tried to mimic the
> technique in the existing functions.  Can you give me a clue where I'm
> going wrong?

OK, I got it working.  The fix is to add GUC_REPORT to guc.c for
standard_conforming_strings.  See the same flag on
session_authorization.  That will cause libpq to see any changes made to
that variable.  Sorry I didn't know that detail before.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
>>> On Thu, Feb 9, 2006 at 10:31 pm, in message
<200602100431.k1A4VlY21635@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote:
>
> OK, I got it working.  The fix is to add GUC_REPORT to guc.c for
> standard_conforming_strings.  See the same flag on
> session_authorization.  That will cause libpq to see any changes made
to
> that variable.  Sorry I didn't know that detail before.

Fantastic!  I added that flag on my end, and everything I've tried is
working perfectly, except: I found that I didn't get my "expected" file
100% right with my hand-crafted attempt.  You're probably already on
that, but just in case it might save you a few minutes -- attached is a
good patch for the "expected" file to go with the new regression test
script for strings.

-Kevin


Attachment

Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> >>> On Thu, Feb 9, 2006 at 10:31 pm, in message
> <200602100431.k1A4VlY21635@candle.pha.pa.us>, Bruce Momjian
> <pgman@candle.pha.pa.us> wrote: 
> > 
> > OK, I got it working.  The fix is to add GUC_REPORT to guc.c for
> > standard_conforming_strings.  See the same flag on
> > session_authorization.  That will cause libpq to see any changes made
> to
> > that variable.  Sorry I didn't know that detail before.
> 
> Fantastic!  I added that flag on my end, and everything I've tried is
> working perfectly, except: I found that I didn't get my "expected" file
> 100% right with my hand-crafted attempt.  You're probably already on
> that, but just in case it might save you a few minutes -- attached is a
> good patch for the "expected" file to go with the new regression test
> script for strings.
> 

Oh, what I normally do is to look at regression.diff, and if that looks
OK, I just apply it to the expected file like this:
cd expectedpatch < ../regression.diff

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Oh, what I normally do is to look at regression.diff, and if that looks
> OK, I just apply it to the expected file like this:

>     cd expected
>     patch < ../regression.diff

Oh, that explains a few things ...

It's much better to just copy the result file over the expected file
once you've decided it's OK.  The regression.diff file is inexact
because of the diff switches that are used.

If you need to update expected variants your machine doesn't generate,
make a fresh regular diff off the expected and actual, and apply that
to the other variants.
        regards, tom lane


Re: Backslashes in string literals

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, what I normally do is to look at regression.diff, and if that looks
> > OK, I just apply it to the expected file like this:
> 
> >     cd expected
> >     patch < ../regression.diff
> 
> Oh, that explains a few things ...
> 
> It's much better to just copy the result file over the expected file
> once you've decided it's OK.  The regression.diff file is inexact
> because of the diff switches that are used.

I am confused.  patch dosen't make an indentical file?  Example?

> If you need to update expected variants your machine doesn't generate,
> make a fresh regular diff off the expected and actual, and apply that
> to the other variants.

Ah, good point.  I often forget about those.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> It's much better to just copy the result file over the expected file
>> once you've decided it's OK.  The regression.diff file is inexact
>> because of the diff switches that are used.

> I am confused.  patch dosen't make an indentical file?  Example?

Not when the diff it's given to work from ignores spaces ...
        regards, tom lane


Re: Backslashes in string literals

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> It's much better to just copy the result file over the expected file
> >> once you've decided it's OK.  The regression.diff file is inexact
> >> because of the diff switches that are used.
> 
> > I am confused.  patch dosen't make an indentical file?  Example?
> 
> Not when the diff it's given to work from ignores spaces ...

Ah, interesting.  I had not realized that.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Backslashes in string literals

From
"Kevin Grittner"
Date:
This patch doesn't leave the standard_conforming_strings entry in guc.c
with the GUC_REPORT flag, which it needs for psql to work right.  Should
I submit one last patch with this fix and the proper "expected"
regression file?  If so, where should I send it?  (The hackers list
won't take a file as big as that patch.)

-Kevin


>>> On Sun, Feb 12, 2006 at  3:17 pm, in message
<200602122117.k1CLHuU22118@candle.pha.pa.us>, Bruce Momjian
<pgman@candle.pha.pa.us> wrote: 

> Your patch has been added to the PostgreSQL unapplied patches list
at:
> 
>     http://momjian.postgresql.org/cgi- bin/pgpatches
> 
> It will be applied as soon as one of the PostgreSQL committers
reviews
> and approves it.



Re: Backslashes in string literals

From
Bruce Momjian
Date:
Kevin Grittner wrote:
> This patch doesn't leave the standard_conforming_strings entry in guc.c
> with the GUC_REPORT flag, which it needs for psql to work right.  Should
> I submit one last patch with this fix and the proper "expected"
> regression file?  If so, where should I send it?  (The hackers list
> won't take a file as big as that patch.)

Oh, I was just going to add the GUC_REPORT when I applied the patch.  I
put that email in the patch queue so I would not forget.

I you want, send a mega patch to the patches list,
pgsql-patches@postgresql.org.  One large patch is usually safest to
apply.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073