Thread: Escape handling in COPY, strings, psql
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Here is an updated version of the COPY \x patch. It is the first patch > > attached. > > Also, I realized that if we support \x in COPY, we should also support > > \x in strings to the backend. This is the second patch. > > Do we really want to do any of these things? We've been getting beaten > up recently about the fact that we have non-SQL-spec string escapes > (ie, all the backslash stuff) so I'm a bit dubious about adding more, > especially when there's little if any demand for it. I thought about that, but adding additional escape letters isn't our problem --- it is the escape mechanism itself that is the issue. I have wanted to post on this issue so now is a good time. I think we have been validly beaten up in that we pride ourselves on standards compliance but have escape requirement on all strings. Our string escapes are a major problem --- not the number of them but the requirement to double backslashes on input, like 'C:\\tmp'. I am thinking the only clean solution is to add a special keyword like ESCAPE before strings that contain escape information. I think a GUC is too general. You know if the string is a constant if it contains escapes just by looking at it, and if it is a variable, hopefully you know if it has escapes. Basically, I think we have to deal with this somehow. I think it could be implemented by looking for the ESCAPE keyword in parser/scan.l and handling it all in there by ignoring backslash escapes if ESCAPE preceeds the string. By the time you are in gram.y, it is too late. > I don't object too much to the COPY addition, since that's outside any > spec anyway, but I do think we ought to think twice about adding this > to SQL literal handling. > > > Third, I found out that psql has some unusual handling of escaped > > numbers. Instead of using \ddd as octal, it has \ddd is decimal, \0ddd > > is octal, and \0xddd is decimal. It is basically following the strtol() > > rules for an escaped value. This seems confusing and contradicts how > > the rest of our system works. > > I agree, that's just going to confuse people. > > > ! xqescape [\\][^0-7x] > > If you are going to insist on this, at least make it case-insensitive. The submitted COPY patch also was case-insensitive, \x and \X, but I changed that because we are case-sensitive for all backslashes in COPY, and C is the same (\n and \N are different too, so we actually use the case-sensitivity). Should we allow \X just so it is case-insensitive like the SQL specification X'4f'? That is the only logic I can think of for it to be case-insensitive, but we have to then do that at all levels, and I am not sure it makes sense. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > I thought about that, but adding additional escape letters isn't our > problem --- it is the escape mechanism itself that is the issue. In a random-encoding environment, the option to specify byte values directly -- at any level -- is of limited value anyway and is a potential source of errors. So let's stay away from that. I did not find the original posts that your quotations came from, but it has to be considered that COPY is also subject to encoding processing. Overall, I find this proposal to be a dangerous option. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > Here is an updated version of the COPY \x patch. It is the first patch > > > attached. > > > Also, I realized that if we support \x in COPY, we should also support > > > \x in strings to the backend. This is the second patch. > > > > Do we really want to do any of these things? We've been getting beaten > > up recently about the fact that we have non-SQL-spec string escapes > > (ie, all the backslash stuff) so I'm a bit dubious about adding more, > > especially when there's little if any demand for it. > > I thought about that, but adding additional escape letters isn't our > problem --- it is the escape mechanism itself that is the issue. > > I have wanted to post on this issue so now is a good time. I think we > have been validly beaten up in that we pride ourselves on standards > compliance but have escape requirement on all strings. Our string > escapes are a major problem --- not the number of them but the > requirement to double backslashes on input, like 'C:\\tmp'. I am > thinking the only clean solution is to add a special keyword like ESCAPE > before strings that contain escape information. I think a GUC is too > general. You know if the string is a constant if it contains escapes > just by looking at it, and if it is a variable, hopefully you know if it > has escapes. > > Basically, I think we have to deal with this somehow. I think it could > be implemented by looking for the ESCAPE keyword in parser/scan.l and > handling it all in there by ignoring backslash escapes if ESCAPE > preceeds the string. By the time you are in gram.y, it is too late. One other idea would be to remove escape processing for single-quoted strings but keep it for our $$ strings, becuase they are not ANSI standard. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > One other idea would be to remove escape processing for single-quoted > strings but keep it for our $$ strings, becuase they are not ANSI > standard. There is *no* escape processing within $$, and never will be, because that would destroy the entire point. You'd be right back having to double backslashes. regards, tom lane
Bruce Momjian wrote: > > I am thinking the only clean solution is to add a special keyword > > like ESCAPE before strings that contain escape information. I > > think a GUC is too general. You know if the string is a constant > > if it contains escapes just by looking at it, and if it is a > > variable, hopefully you know if it has escapes. I do support gradually phasing out backslash escapes in standard string literals in the interest of portability. Most of the current escape sequences are of limited value anyway. Let's think about ways to get there: Enabling escape sequences in string literals controls the formatting of input (and output?) data, so it is akin to, say, the client encoding and the date style, so a GUC variable isn't out of the question in my mind. It makes most sense, though, if we want to eventually make users switch it off all the time, that is, as a transition aid. But before that can happen, we need to come up with an alternative mechanism to enter weird characters. One such way may be to provide functions (say, chr(), tab(), etc.) to give access to unprintable characters, but that will result in terrible performance for long strings and it also won't help with COPY or places where only literals are allowed. Another way would be to allow escape sequences only in specially marked strings. The proposal above doing 'foo' ESCAPE 'x' seems fairly elegant for SQL linguists but would be pretty weird to implement in the lexer. It won't help with COPY either, but that is really the case for all solutions. A more compact represenation may be using a prefix letter, like E'foo'. This fits the SQL syntax, is familiar with Python programmers (although in the other direction), and can be implemented efficiently in the lexer. I like that the best, personally. For COPY, we would probably have to use a flag in the COPY command itself either way (like already done for NULL AS). Comments? Other ideas? Keep the escapes? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > I do support gradually phasing out backslash escapes in standard string > literals in the interest of portability. Most of the current escape > sequences are of limited value anyway. Let's think about ways to get > there: I really don't think there is any way to get there without creating gaping security holes in all kinds of client code :-(. If we change the escaping rules, then a client that is expecting some other rule than happens to be in force will be subject to trivial SQL-injection attacks. This will make the autocommit fiasco pale by comparison ... > For COPY, we would probably have to use a flag in the COPY command > itself either way (like already done for NULL AS). The spec-compatibility argument for removing escapes does not apply to COPY at all, so I see no need to fool with the COPY definition in any case. regards, tom lane
Peter Eisentraut wrote: > Bruce Momjian wrote: > > > I am thinking the only clean solution is to add a special keyword > > > like ESCAPE before strings that contain escape information. I > > > think a GUC is too general. You know if the string is a constant > > > if it contains escapes just by looking at it, and if it is a > > > variable, hopefully you know if it has escapes. > > I do support gradually phasing out backslash escapes in standard string > literals in the interest of portability. Most of the current escape > sequences are of limited value anyway. Let's think about ways to get > there: > > Enabling escape sequences in string literals controls the formatting of > input (and output?) data, so it is akin to, say, the client encoding > and the date style, so a GUC variable isn't out of the question in my > mind. It makes most sense, though, if we want to eventually make users > switch it off all the time, that is, as a transition aid. But before > that can happen, we need to come up with an alternative mechanism to > enter weird characters. > > One such way may be to provide functions (say, chr(), tab(), etc.) to > give access to unprintable characters, but that will result in terrible > performance for long strings and it also won't help with COPY or places > where only literals are allowed. > > Another way would be to allow escape sequences only in specially marked > strings. The proposal above doing 'foo' ESCAPE 'x' seems fairly > elegant for SQL linguists but would be pretty weird to implement in the > lexer. It won't help with COPY either, but that is really the case for > all solutions. I was suggesting ESCAPE 'string' or ESC 'string'. The marker has to be before the string so scan.l can alter its processing of the string --- after the string is too late --- there is no way to undo any escaping that has happened, and it might already be used by gram.y. I could probably hack up a sample implementation if people are interested. > A more compact representation may be using a prefix letter, like E'foo'. > This fits the SQL syntax, is familiar with Python programmers (although > in the other direction), and can be implemented efficiently in the > lexer. I like that the best, personally. > > For COPY, we would probably have to use a flag in the COPY command > itself either way (like already done for NULL AS). I agree with Tom that COPY has to be left unchanged. The fundamental problem is the representation of NULL values, that I don't think we can do without some escape mechanism. Single-quote escapes works by doubling them, but once you need to represent something more like null's, I can't think of a solution without escapes. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > I do support gradually phasing out backslash escapes in standard string > > literals in the interest of portability. Most of the current escape > > sequences are of limited value anyway. Let's think about ways to get > > there: > > I really don't think there is any way to get there without creating > gaping security holes in all kinds of client code :-(. If we change > the escaping rules, then a client that is expecting some other rule > than happens to be in force will be subject to trivial SQL-injection > attacks. This will make the autocommit fiasco pale by comparison ... I looked at PQescapeString() and fortunately it escapes single quotes by doing double-single quotes, not by using a backslash. This was probably chosen for standards compliance. Basically, I think our current behavior is not sustainable. I think we are going to need to do something, and I think we should consider a solution now rather than later. I don't think we can be as serious a contender for portability without some kind of solution. I am thinking we should first tell people in 8.1 that they should start using only double-single quotes, and perhaps support the ESCAPE phrase as a no-op, and then consider some kind of solution in 8.2 or later. I don't think fixing this is going to be a huge security problem, but it might be a small one. The good thing is that double-single quotes work, so if people use only that for quote escaping, if you forget the ESCAPE clause, you just get literal backslashes, not a security problem. I ran the following test:test=> select $$\$$; ?column?---------- \(1 row)test=> create table test (x TEXT);CREATE TABLEtest=>INSERT INTO test VALUES ($$\$$);INSERT 0 1test=> SELECT * FROM test; x--- \(1 row) and the good news is that output of backslashes is fine --- it is just input that is the issue, and the security problem is only using \', which we would have to tell people to avoid and start using only ''. I think we can tell people in 8.1 that they should modify their applications to only use '', and that \' might be a security problem in the future. If we get to that then using ESC or not only affects input of values and literal backslashes being entered, and my guess is that 90% of the backslash entries that want escaping are literal in the application and not supplied by program variables. In fact, if we disable backslash by default then strings coming in only have to deal with single quotes (like other databases) and the system is more secure because there is no special backslash handling by default. > > For COPY, we would probably have to use a flag in the COPY command > > itself either way (like already done for NULL AS). > > The spec-compatibility argument for removing escapes does not apply to > COPY at all, so I see no need to fool with the COPY definition in any > case. Agreed. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> I think we can tell people in 8.1 that they should modify their > applications to only use '', and that \' might be a security problem in > the future. If we get to that then using ESC or not only affects input > of values and literal backslashes being entered, and my guess is that > 90% of the backslash entries that want escaping are literal in the > application and not supplied by program variables. In fact, if we > disable backslash by default then strings coming in only have to deal > with single quotes (like other databases) and the system is more secure > because there is no special backslash handling by default. I can tell you right now this will be a problem :) There are loads of PHP ppl who use addslashes() instead of pg_escape_string() to escape data. Chris
Bruce Momjian wrote: > I was suggesting ESCAPE 'string' or ESC 'string'. The marker has to > be before the string so scan.l can alter its processing of the string > --- after the string is too late --- there is no way to undo any > escaping that has happened, and it might already be used by gram.y. That pretty much corresponds to my E'string' proposal. Both are probably equally trivial to implement. > I agree with Tom that COPY has to be left unchanged. The fundamental > problem is the representation of NULL values, that I don't think we > can do without some escape mechanism. Single-quote escapes works by > doubling them, but once you need to represent something more like > null's, I can't think of a solution without escapes. Yes, I now realize that COPY has a whole set of different rules anyway, so we can leave that out of this discussion. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Christopher Kings-Lynne wrote: > > I think we can tell people in 8.1 that they should modify their > > applications to only use '', and that \' might be a security problem in > > the future. If we get to that then using ESC or not only affects input > > of values and literal backslashes being entered, and my guess is that > > 90% of the backslash entries that want escaping are literal in the > > application and not supplied by program variables. In fact, if we > > disable backslash by default then strings coming in only have to deal > > with single quotes (like other databases) and the system is more secure > > because there is no special backslash handling by default. > > I can tell you right now this will be a problem :) There are loads of > PHP ppl who use addslashes() instead of pg_escape_string() to escape data. I read the PHP addslashes() manual page: http://us3.php.net/addslashes First, I see what people mean about PHP having most of the complex content in comments, rather than in the actual manual text, and this tendency is certainly something we want to avoid --- you end up having to digest all the comments to find the details that should be in the manual already. On to the case at hand, the comments mention that addslashes() isn't safe for all databases, and in fact isn't the prefered method. I do think it could be a problem we have to have people avoid. One idea for 8.1 is to throw a warning if \' appears in a string, thereby helping people find the places they are using the incorrect non-standard escaping. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> I read the PHP addslashes() manual page: > > http://us3.php.net/addslashes > > First, I see what people mean about PHP having most of the complex > content in comments, rather than in the actual manual text, and this > tendency is certainly something we want to avoid --- you end up having > to digest all the comments to find the details that should be in the > manual already. Actually, all the comments are posted on the php-doc list, with automatic urls in them for 'fixed in cvs', 'rejected', etc. Each comment is supposed to be acted upon (ie. fixed in source), then deleted. There's still a lot of old comment around that hasn't had that treatment though... Chris
Christopher Kings-Lynne wrote: > > I read the PHP addslashes() manual page: > > > > http://us3.php.net/addslashes > > > > First, I see what people mean about PHP having most of the complex > > content in comments, rather than in the actual manual text, and this > > tendency is certainly something we want to avoid --- you end up having > > to digest all the comments to find the details that should be in the > > manual already. > > Actually, all the comments are posted on the php-doc list, with > automatic urls in them for 'fixed in cvs', 'rejected', etc. > > Each comment is supposed to be acted upon (ie. fixed in source), then > deleted. > > There's still a lot of old comment around that hasn't had that treatment > though... Right, they are more _usage_ comments, but still I think they could be consolidated into manual text. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut wrote: > Bruce Momjian wrote: > > I was suggesting ESCAPE 'string' or ESC 'string'. The marker has to > > be before the string so scan.l can alter its processing of the string > > --- after the string is too late --- there is no way to undo any > > escaping that has happened, and it might already be used by gram.y. > > That pretty much corresponds to my E'string' proposal. Both are > probably equally trivial to implement. Right. I think your E'' idea has the benefit of fitting with our existing X'' and B'' modifiers. It is also simpler and cleaner to do in scan.l, so I think your idea is best. > > I agree with Tom that COPY has to be left unchanged. The fundamental > > problem is the representation of NULL values, that I don't think we > > can do without some escape mechanism. Single-quote escapes works by > > doubling them, but once you need to represent something more like > > null's, I can't think of a solution without escapes. > > Yes, I now realize that COPY has a whole set of different rules anyway, > so we can leave that out of this discussion. Cool. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Christopher Kings-Lynne wrote: > > > > Each comment is supposed to be acted upon (ie. fixed in source), then > > deleted. > > Right, they are more _usage_ comments, but still I think they could be > consolidated into manual text. If that's "supposed" to happen it certainly hasn't been the de facto procedure. I think they have things partly right here though. A lot of those comments aren't actually the kinds of things that belong in the canonical reference. They include things like "watch out for this common error" or "here's a handy use for this function". Often the "common error" or "handy use" are pretty bogus but every now and then there's a genuinely useful one. These kinds of things would just clutter up a reference. A reference should just state unambiguously this function does XYZ and give examples that help explain XYZ. The PHP Docs do have a bit of a problem in that often the comments include things like "In case X, what happens is Y" which really ought to be covered by the canonical reference. That's a problem. -- greg
Bruce Momjian wrote: > Peter Eisentraut wrote: > > Bruce Momjian wrote: > > > I was suggesting ESCAPE 'string' or ESC 'string'. The marker has to > > > be before the string so scan.l can alter its processing of the string > > > --- after the string is too late --- there is no way to undo any > > > escaping that has happened, and it might already be used by gram.y. > > > > That pretty much corresponds to my E'string' proposal. Both are > > probably equally trivial to implement. > > Right. I think your E'' idea has the benefit of fitting with our > existing X'' and B'' modifiers. It is also simpler and cleaner to do in > scan.l, so I think your idea is best. [ CC list trimmed.] OK, I talked to Tom and Peter and I have come up with a tentative plan. The goal, at some point, is that we would have two types of strings, '' strings and E'' strings. '' strings don't have any special backslash handling for compatibility with with the ANSI spec and all other databases except MySQL (and in MySQL it is now optional). E'' strings behave just like our strings do now, with backslash handling. In 8.0.X, we add support for E'' strings, but it is a noop. This is done just for portability with future releases. We also state that users should stop using \' to escape quotes in strings, and instead use '', and that we will throw a warning in 8.1 if we see a \' in a non-E string. (We could probably throw a warning for E'' use of \' too, but I want to give users the ability to avoid the warning if they can't change from using \' to ''.) In 8.1, we start issuing the warning for \' in non-E strings, plus we tell users who want escape processing that they will have to use E'' strings for this starting in release 8.2, and they should start migrating their escaped strings over to E''. Tom also suggested a readonly GUC variable that is sent to clients that indicates if simple strings have backslash handling, for use by applications that are doing escapes themselves, perhaps 'escape_all_strings'. PQescapeString() and PQescapeBytea() can still be used, but only with E'' strings in 8.2. We could create PQquoteString() for 8.1 and later to allow for just single-quote doubling for non-E strings. Tom asked about how to handle pg_dump contents that have strings, like function bodies. We could start using E'' for those in 8.0 but it does break backward movement of dumps, and someone upgrading from 7.1 to 8.2 would be in trouble. :-( Perhaps we will have another round of subrelease fixes and we can bundle this into that and tell people they have to upgrade to the newest subrelease before going to 8.2. I think we have had that requirement in the past when we had broken pg_dump processing. The good news is that once everyone uses only '' to quote string, we will not have any data security issues with this change. The only potential problem is the mishandling of backslash characters if there is a mismatch between what the client expects and the server uses. By backpatching E'' perhaps even to 7.4 and earlier (as a noop), we could minimize this problem. Is this whole thing ugly? Yes. Can we just close our eyes and hope we can continue with our current behavior while growing a larger userbase --- probabably not. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > The goal, at some point, is that we would have two types of strings, '' > strings and E'' strings. '' strings don't have any special backslash > handling for compatibility with with the ANSI spec and all other > databases except MySQL (and in MySQL it is now optional). E'' strings > behave just like our strings do now, with backslash handling. The only thing I'm not clear on is what exactly is the use case for E'' strings. That is, who do you expect to actually use them? Any new applications are recommended to be using '' strings. And any existing applications obviously won't be using them since they don't currently exist. The only potential candidates are existing applications being ported forward. And that only makes sense if they're currently using some function like addslash. Is it really easier to change all the SQL queries to use E'' (and still have a bug) than it is to replace addslash with PQquoteString() ? Also, I'm really confused why you would make PQescapeString require E'' strings and introduce a new function. That means existing non-buggy applications would suddenly be buggy? And it would be impossible to write a properly functioning application that interpolates a constant into a query that would be portable to 8.2 and 8.0? -- greg
Greg Stark wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > The goal, at some point, is that we would have two types of strings, '' > > strings and E'' strings. '' strings don't have any special backslash > > handling for compatibility with with the ANSI spec and all other > > databases except MySQL (and in MySQL it is now optional). E'' strings > > behave just like our strings do now, with backslash handling. > > > The only thing I'm not clear on is what exactly is the use case for E'' > strings. That is, who do you expect to actually use them? > > Any new applications are recommended to be using '' strings. And any existing > applications obviously won't be using them since they don't currently exist. We are saying to use '' to escape single quotes in all strings. E'' is still useful if you want to use backslash escapes in your strings. Does that answer your questions? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Greg Stark <gsstark@mit.edu> writes: > The only thing I'm not clear on is what exactly is the use case for E'' > strings. That is, who do you expect to actually use them? The case that convinced me we need to keep some sort of backslash capability is this: suppose you want to put a string including a tab into your database. Try to do it with psql:t=> insert into foo values ('<TAB> Guess what: you won't get anywhere, at least not unless you disable readline. So it's nice to be able to use \t. There are related issues involving \r and \n depending on your platform. And this doesn't even scratch the surface of encoding-related funnies. So there's definitely a use-case for keeping the existing backslash behavior, and E'string' seems like a reasonable proposal for doing that without conflicting with the SQL spec. What I do not see at the moment is how we get there from here (ie, dropping backslashing in regular literals) without incurring tremendous pain --- including breaking all existing pg_dump files, introducing security holes and/or data corruption into many existing apps that are not presently broken, and probably some other ways of ruining your day. I'm quite unconvinced that this particular letter of the SQL spec is worth complying with ... regards, tom lane
On Tue, 31 May 2005, Tom Lane wrote: > The case that convinced me we need to keep some sort of backslash > capability is this: suppose you want to put a string including a tab > into your database. Try to do it with psql: > t=> insert into foo values ('<TAB> > Guess what: you won't get anywhere, at least not unless you disable > readline. So it's nice to be able to use \t. To insert a tab using readline you can press ESC followed by TAB. This works as least in readline as it is setup in redhat/fedora (and readline can be setup in 1000 different ways so who knows how portable this is). -- /Dennis Björklund
Dennis Bjorklund <db@zigo.dhs.org> writes: > To insert a tab using readline you can press ESC followed by TAB. ...or ^V followed by TAB, as per age-old tradition. :-) -tih -- Don't ascribe to stupidity what can be adequately explained by ignorance.
On Tue, 31 May 2005, Tom Ivar Helbekkmo wrote: > ...or ^V followed by TAB, as per age-old tradition. :-) Right, I forgot about that one. One can also do other control characters instead of TAB by pressing CTRL-J and similar. Well, I just wanted to point out that it's possible. The main problem is still to make sure that old dumps work and can be imported. I don't see how that can work without a GUC variable in addition to the E'foo' stuff (but that's not so bad as it can be phased in to support old pg_dumps and phased out again in pg 10 or something). -- /Dennis Björklund
Tom Lane wrote: > Greg Stark <gsstark@mit.edu> writes: > > The only thing I'm not clear on is what exactly is the use case for E'' > > strings. That is, who do you expect to actually use them? > > The case that convinced me we need to keep some sort of backslash > capability is this: suppose you want to put a string including a tab > into your database. Try to do it with psql: > t=> insert into foo values ('<TAB> > Guess what: you won't get anywhere, at least not unless you disable > readline. So it's nice to be able to use \t. > > There are related issues involving \r and \n depending on your platform. > And this doesn't even scratch the surface of encoding-related funnies. > > So there's definitely a use-case for keeping the existing backslash > behavior, and E'string' seems like a reasonable proposal for doing that > without conflicting with the SQL spec. > > What I do not see at the moment is how we get there from here (ie, > dropping backslashing in regular literals) without incurring tremendous > pain --- including breaking all existing pg_dump files, introducing > security holes and/or data corruption into many existing apps that are > not presently broken, and probably some other ways of ruining your day. > I'm quite unconvinced that this particular letter of the SQL spec is > worth complying with ... I think this is going to be like the Win32 port, where there is little excitement from our existing users, but it is needed to grow our user base. I think the E'' is useful becuase it gives people a migration path for the escapes they are already using, and the escape mechanism itself it something useful to keep. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Tue, May 31, 2005 at 11:49:20 +0200, Dennis Bjorklund <db@zigo.dhs.org> wrote: > On Tue, 31 May 2005, Tom Lane wrote: > > > The case that convinced me we need to keep some sort of backslash > > capability is this: suppose you want to put a string including a tab > > into your database. Try to do it with psql: > > t=> insert into foo values ('<TAB> > > Guess what: you won't get anywhere, at least not unless you disable > > readline. So it's nice to be able to use \t. > > To insert a tab using readline you can press ESC followed by TAB. This > works as least in readline as it is setup in redhat/fedora (and readline > can be setup in 1000 different ways so who knows how portable this is). There are still advantages to having printable backslashed escaped characters in strings that are saved to files. It makes it easier to see what is really in the string and they are less likely to get accidentally munged when editing the file or moving it between systems with different line termination conventions.
Here is a summary of the issues with moving to no escapes for non-E strings: http://candle.pha.pa.us/cgi-bin/pgescape --------------------------------------------------------------------------- Bruno Wolff III wrote: > On Tue, May 31, 2005 at 11:49:20 +0200, > Dennis Bjorklund <db@zigo.dhs.org> wrote: > > On Tue, 31 May 2005, Tom Lane wrote: > > > > > The case that convinced me we need to keep some sort of backslash > > > capability is this: suppose you want to put a string including a tab > > > into your database. Try to do it with psql: > > > t=> insert into foo values ('<TAB> > > > Guess what: you won't get anywhere, at least not unless you disable > > > readline. So it's nice to be able to use \t. > > > > To insert a tab using readline you can press ESC followed by TAB. This > > works as least in readline as it is setup in redhat/fedora (and readline > > can be setup in 1000 different ways so who knows how portable this is). > > There are still advantages to having printable backslashed escaped characters > in strings that are saved to files. It makes it easier to see what is really > in the string and they are less likely to get accidentally munged when > editing the file or moving it between systems with different line termination > conventions. > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073