Re: Lexing with different charsets - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Lexing with different charsets
Msg-id 20040414.101855.108739877.t-ishii@sra.co.jp
In response to Lexing with different charsets  (Dennis Bjorklund <db@zigo.dhs.org>)
Responses Re: Lexing with different charsets  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List pgsql-hackers
> I've spent some more time reading specs today. Together with Peter E's
> explanation (Thanks!) I think I've got a fairly good understanding of the
> parts talking about locales now.
>
> My next question is about lexing. The spec says that one can use strings
> of different charsets in the queries, like:
>
>   ... WHERE field1 = _latin1'FooBar' and field2 = _utf8'Åäö'

In my understanding this was removed as of SQL:1999. I'm not sure
about SQL:2003 though.
--
Tatsuo Ishii

> I can see that the lexer either needs to be taught about all the
> different charsets or this is not going to work very well.
>
> What if one wants to include a string in utf-16 in the query, the lexer
> can not handle that without understanding utf-16. The query can also be in
> different charsets. If it's in utf-8 for example, then we can not embed
> latin1 strings and still have a validating utf-8 query. With the above we
> can not think of the query as being in a single charset anymore. That's
> strange but okay I guess.
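
The byte-level problem described above can be illustrated with a minimal
sketch. The query text and the helper name is_valid_utf8 are hypothetical,
chosen only for illustration; this is not how the PostgreSQL lexer works.

```python
# Minimal sketch (hypothetical names, not PostgreSQL code).

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

text = "Åäö"

# A query whose literals are all encoded in UTF-8 validates as UTF-8.
utf8_only = b"WHERE field2 = _utf8'" + text.encode("utf-8") + b"'"

# Embedding the same characters as raw latin1 bytes breaks validation:
# 0xC5 starts a two-byte UTF-8 sequence, but 0xE4 is not a continuation byte.
mixed = b"WHERE field1 = _latin1'" + text.encode("latin-1") + b"'"

# A UTF-16 literal is worse for a byte-oriented lexer: U+2700 encodes to the
# bytes 0x00 0x27, and 0x27 is the ASCII apostrophe that ends a string.
utf16_bytes = "\u2700".encode("utf-16-le")

print(is_valid_utf8(utf8_only))   # validity of the all-UTF-8 query
print(is_valid_utf8(mixed))       # validity of the mixed-encoding query
print(b"'" in utf16_bytes)        # does the UTF-16 literal contain a quote byte?
```

Under these assumptions, a lexer scanning bytes for the closing quote would
need to know the charset of each literal before it could find its end.
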
>
> The new wire protocol allows us to send data separately from the query
> which is nice, but the standard talked about strings as above so it's not
> a solution to the problem.
>
> Maybe I should have addressed this to Peter directly :-)
>
> --
> /Dennis Björklund
>
>

