Re: Unicode string literals versus the world - Mailing list pgsql-hackers

From Marko Kreen
Subject Re: Unicode string literals versus the world
Date
Msg-id e51f66da0904111147xd206355h49bc143eb853bb65@mail.gmail.com
In response to Unicode string literals versus the world  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unicode string literals versus the world  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 4/11/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  It gets worse though: I have seldom seen such a badly designed piece of
>  syntax as the Unicode string syntax --- see
>  http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
>
>  You scan the string, and then after that they tell you what the escape
>  character is!?  Not to mention the obvious ambiguity with & as an
>  operator.
>
>  If we let this go into 8.4, our previous rounds with security holes
>  caused by careless string parsing will look like a day at the beach.
>  No frontend that isn't fully cognizant of the Unicode string syntax is
>  going to parse such things correctly --- it's going to be trivial for
>  a bad guy to confuse a quoting mechanism as to what's an escape and what
>  isn't.
>
>  I think we need to give very serious consideration to ripping out that
>  "feature".
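
To illustrate the point about the escape character, here is a rough Python
sketch. It is not PostgreSQL's lexer: decode_unicode_body is a made-up
helper, and it assumes only the documented forms (escape + 4 hex digits,
escape + '+' + 6 hex digits, doubled escape for a literal one). The same
literal body decodes differently depending on an escape character that is
only announced after the closing quote:

```python
import re


def decode_unicode_body(body: str, escape: str = "\\") -> str:
    """Decode the body of a U&'...' literal for a given escape character.

    Hypothetical helper for illustration only. Malformed escapes are left
    untouched here, whereas a real lexer would reject them.
    """
    esc = re.escape(escape)
    pattern = re.compile(
        f"{esc}{esc}"                      # doubled escape: literal escape char
        f"|{esc}\\+([0-9A-Fa-f]{{6}})"     # escape, '+', six hex digits
        f"|{esc}([0-9A-Fa-f]{{4}})"        # escape, four hex digits
    )

    def repl(match: re.Match) -> str:
        if match.group(1):
            return chr(int(match.group(1), 16))
        if match.group(2):
            return chr(int(match.group(2), 16))
        return escape

    return pattern.sub(repl, body)


# One body, two readings: the caller cannot know which is right until it
# has seen whether a UESCAPE clause follows the closing quote.
body = "d!0061t\\0061"
as_default = decode_unicode_body(body)        # U&'d!0061t\0061'
as_bang = decode_unicode_body(body, "!")      # U&'d!0061t\0061' UESCAPE '!'
```

A quoting layer that assumes the default escape and one that honours the
trailing clause will disagree about what this string contains.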

Ugh, it's rather dubious indeed.  Especially when we are already in
the middle of the seriously confusing conversion from
standard_conforming_strings=off -> on.  Is it really OK to add even
more complexity to the mix?

Alternative proposal: maybe it would be saner to add a \uXXXX
escape to E'' strings as a non-standard way of quoting Unicode.
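
A minimal Python sketch of what I mean, assuming a single fixed backslash
escape and only a handful of the existing E'' escapes. decode_e_body and
the escape table are hypothetical and not taken from the PostgreSQL
source:

```python
import re

# Small subset of backslash escapes, for illustration only.
_SIMPLE = {"n": "\n", "t": "\t", "\\": "\\", "'": "'"}

# A backslash followed by either 'u' plus four hex digits, or any one char.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|.)", re.DOTALL)


def decode_e_body(body: str) -> str:
    """Decode the body of an E'...' literal extended with \\uXXXX.

    Hypothetical helper. A '\\u' not followed by four hex digits falls
    back to a plain 'u' in this sketch; a real lexer would more likely
    raise an error.
    """

    def repl(match: re.Match) -> str:
        token = match.group(1)
        if len(token) == 5:  # 'u' followed by four hex digits
            return chr(int(token[1:], 16))
        return _SIMPLE.get(token, token)

    return _ESCAPE.sub(repl, body)


decoded = decode_e_body(r"d\u0061t\u0061")
```

The escape character is known before the first byte of the string is
read, so the scan is strictly left to right and an escaped backslash in
front of u0061 stays a literal backslash.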

Later, when standard quoting is our only quoting method, we can play
with the standard's extensions?

-- 
marko

