Re: Unicode string literals versus the world - Mailing list pgsql-hackers

From Marko Kreen
Subject Re: Unicode string literals versus the world
Msg-id e51f66da0904141251i52fb42d3t6a7f4bed43807ac@mail.gmail.com
In response to Re: Unicode string literals versus the world  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4/14/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>  > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote:
>  >> The other proposal that seemed
>  >> attractive to me was a decode-like function:
>  >>
>  >> uescape('foo\00e9bar')
>  >> uescape('foo\00e9bar', '\')
>
>  > This was discussed previously, but rejected with the following argument:
>
>  > There are some other disadvantages for making a function call.  You
>  > couldn't use that kind of literal in any other place where the parser
>  > calls for a string constant: role names, tablespace locations,
>  > passwords, copy delimiters, enum values, function body, file names.
>
>
> I'm less than convinced that those are really plausible use-cases for
>  characters that one is unable to type directly.  However, I'll grant the
>  point.  So that narrows us down to considering the \u extension to E''
>  strings as a saner and safer alternative to the spec's syntax.

My vote would go to \u.  U& may be "SQL standard", but it is different
from any established practical standard.


An alternative would be to make U& follow the stdstr
(standard_conforming_strings) setting:

stdstr=on -> you get fully standard-conforming syntax:
 U&'\xxx' UESCAPE '\'

stdstr=off -> you need to follow old quoting rules:
 U&'\\xxx' UESCAPE '\\'

This would give syntax that is always safe and, when stdstr=on, fully
standard-compliant.  The only downside is that in practice (that is,
with stdstr=off) it would be unusable.
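
To make the two cases concrete, here is a rough Python sketch of the
decoding that "U& follows stdstr" implies.  It is only an illustration,
not PostgreSQL's lexer: the function names are made up, it handles only
4-hex-digit escapes, and for stdstr=off it models only the rule that a
doubled backslash means one backslash.

```python
import re


def unquote_literal(body: str, stdstr: bool) -> str:
    """Undo the quoting layer for the text between the single quotes.

    Hypothetical helper: with stdstr=off only the old-style rule
    "two backslashes mean one backslash" is modelled here.
    """
    body = body.replace("''", "'")
    if not stdstr:
        body = body.replace("\\\\", "\\")
    return body


def decode_unicode_escapes(text: str, esc: str) -> str:
    """Replace <esc>XXXX (exactly four hex digits) with that code point."""
    pattern = re.compile(re.escape(esc) + r"([0-9A-Fa-f]{4})")
    return pattern.sub(lambda m: chr(int(m.group(1), 16)), text)


def safe_u_literal(body: str, uescape: str, stdstr: bool) -> str:
    """Decode U&'body' UESCAPE 'uescape' under the given stdstr setting."""
    esc = unquote_literal(uescape, stdstr)
    return decode_unicode_escapes(unquote_literal(body, stdstr), esc)


# stdstr=on:  U&'foo\00e9bar' UESCAPE '\'
standard = safe_u_literal("foo\\00e9bar", "\\", stdstr=True)

# stdstr=off: U&'foo\\00e9bar' UESCAPE '\\'
old_style = safe_u_literal("foo\\\\00e9bar", "\\\\", stdstr=False)
```

Both calls should produce the same string; the only difference is how
many backslashes the user has to type, which is where the stdstr=off
form becomes awkward.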


A third alternative would be to do both: \u as the usable method, and
safe-U& to tick the checkbox for SQL-standard compliance.
If we do want U&, I would prefer that to U&-only syntax.

-- 
marko

