Thread: Table name with umlauts

From
Thomas Kellerer
Date:
Hi,

I'm curious why the following is not working:

c:\psql postgres postgres
psql (9.0.1)
Type "help" for help.
postgres=# select version();
                            version
-------------------------------------------------------------
  PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 32-bit
(1 row)


postgres=# select pg_encoding_to_char(encoding) from pg_database where datname = 'postgres';
  pg_encoding_to_char
---------------------
  UTF8
(1 row)


postgres=# show client_encoding;
  client_encoding
-----------------
  UTF8
(1 row)


postgres=# create table umlaut_test_ö (id integer);
ERROR:  invalid byte sequence for encoding "UTF8": 0xf6202869
postgres=#

(it doesn't work either when I quote the table name using "umlaut_test_ö")

When I run the same CREATE TABLE using a JDBC-based tool, the table *is* created, but the table name does not show up
correctly when I use DatabaseMetaData.getTables().

pgAdmin does not show this table correctly, and after creating it through JDBC, psql doesn't show the table name
correctly either:

postgres=> \d umlaut*
  Table "public.umlaut_test_ã¶"
  Column |  Type   | Modifiers
--------+---------+-----------
  id     | integer |


I initially posted this on the JDBC mailing list because I noticed this with Java, but it seems that it's not a JDBC
problem.

Could this be a Windows problem?

Note: I don't really want to use such a table name, I'm just wondering if this _should_ work.

Regards
Thomas
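
The garbled name in the \d output above has the typical shape of UTF-8 bytes being rendered one byte per character. A minimal Python sketch of that effect follows; it assumes the display side used Windows-1252, which the thread does not confirm, and it is only an illustration, not a diagnosis.

```python
# The table name as typed in the CREATE TABLE statement above.
name = "umlaut_test_ö"

# UTF-8 stores "ö" as two bytes, 0xC3 0xB6.
utf8_bytes = name.encode("utf-8")

# Assumption: the display interprets each byte with a single-byte
# code page (Windows-1252 here), so two bytes become two characters.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes[-2:].hex())  # c3b6
print(misread)                # umlaut_test_Ã¶
```

The sketch yields an uppercase "Ã", while the psql output shows a lowercase "ã"; the sketch does not account for that difference.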



Re: Table name with umlauts

From
Tom Lane
Date:
Thomas Kellerer <spam_eater@gmx.net> writes:
> I'm curious why the following is not working:

> postgres=# show client_encoding;
>   client_encoding
> -----------------
>   UTF8
> (1 row)


> postgres=# create table umlaut_test_ö (id integer);
> ERROR:  invalid byte sequence for encoding "UTF8": 0xf6202869

It looks to me like your console is not in fact producing UTF8;
it's representing ö as 0xf6, which I think is right for Latin1.
Select the proper client_encoding.

            regards, tom lane
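
The four bytes quoted in the error message can be checked by hand. The following Python sketch decodes them as Latin-1 and then tries UTF-8; Python's codecs are used here only as a stand-in for the server's validation, which is an assumption of the example.

```python
# Bytes quoted in the error message: 0xf6202869.
reported = bytes.fromhex("f6202869")

# Read as Latin-1 they are the end of "umlaut_test_ö (id integer)".
as_latin1 = reported.decode("latin-1")

# Read as UTF-8 they are invalid: 0xF6 has the bit pattern of a
# four-byte lead byte, but the next bytes are plain ASCII rather
# than continuation bytes.
try:
    reported.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(as_latin1))  # 'ö (i'
print(valid_utf8)       # False
```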

Re: Table name with umlauts

From
Thomas Kellerer
Date:
Tom Lane wrote on 22.11.2010 19:25:
> Thomas Kellerer<spam_eater@gmx.net>  writes:
>> I'm curious why the following is not working:
>
>> postgres=# show client_encoding;
>>    client_encoding
>> -----------------
>>    UTF8
>> (1 row)
>
>
>> postgres=# create table umlaut_test_ö (id integer);
>> ERROR:  invalid byte sequence for encoding "UTF8": 0xf6202869
>
> It looks to me like your console is not in fact producing UTF8;
> it's representing ö as 0xf6, which I think is right for Latin1.
> Select the proper client_encoding.
>

I assume you mean the encoding in the console?

I changed to "chcp 1252" before running psql (I tried several other encodings as well)

And why does the JDBC driver return this incorrectly as well?
Creating and dropping the table works through JDBC, but displaying the table names does not.


Regards
Thomas


Re: Table name with umlauts

From
Raymond O'Donnell
Date:
On 22/11/2010 19:01, Thomas Kellerer wrote:
> Tom Lane wrote on 22.11.2010 19:25:
>> Thomas Kellerer<spam_eater@gmx.net> writes:
>>> I'm curious why the following is not working:
>>
>>> postgres=# show client_encoding;
>>> client_encoding
>>> -----------------
>>> UTF8
>>> (1 row)
>>
>>
>>> postgres=# create table umlaut_test_ö (id integer);
>>> ERROR: invalid byte sequence for encoding "UTF8": 0xf6202869
>>
>> It looks to me like your console is not in fact producing UTF8;
>> it's representing ö as 0xf6, which I think is right for Latin1.
>> Select the proper client_encoding.
>>
>
> I assume you mean the encoding in the console?

No, he means the encoding on the connection:

   http://www.postgresql.org/docs/9.0/static/multibyte.html#AEN30728

...so that the server returns the correct characters for your console.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

Re: Table name with umlauts

From
Tom Lane
Date:
"Raymond O'Donnell" <rod@iol.ie> writes:
> On 22/11/2010 19:01, Thomas Kellerer wrote:
>> Tom Lane wrote on 22.11.2010 19:25:
>>> It looks to me like your console is not in fact producing UTF8;
>>> it's representing ö as 0xf6, which I think is right for Latin1.
>>> Select the proper client_encoding.

>> I assume you mean the encoding in the console?

> No, he means the encoding on the connection:
>    http://www.postgresql.org/docs/9.0/static/multibyte.html#AEN30728
> ...so that the server returns the correct characters for your console.

For this problem it's actually more the other way round: the characters
being *sent* to the server have to be in the encoding you said they'd be
in, namely client_encoding.

I had the idea that the Windows version of psql was smart enough to
set client_encoding based on the console encoding it finds itself
running under, but I might be wrong about that.  Or maybe you did
something that overrode its default?

>> I changed to "chcp 1252" before running psql (I tried several other encodings as well)

Try "set client_encoding = win1252", then.

            regards, tom lane

Re: Table name with umlauts

From
Peter Geoghegan
Date:
On 22 November 2010 19:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I had the idea that the Windows version of psql was smart enough to
> set client_encoding based on the console encoding it finds itself
> running under, but I might be wrong about that.  Or maybe you did
> something that overrode its default?

Apparently not:

Server [localhost]:
Database [postgres]:
Port [5432]:
Username [postgres]:
psql (8.4.5)
WARNING: Console code page (850) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)


postgres=#

--
Regards,
Peter Geoghegan
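
The warning above names two code pages, and client_encoding is still UTF8. A short Python sketch shows why that combination matters: the same character becomes different bytes under each code page, and neither single-byte form is valid UTF-8. The helper names console_bytes and accepted_as are hypothetical, and Python's codecs only approximate what the console and the server actually do.

```python
def console_bytes(text: str, code_page: str) -> bytes:
    """Bytes a console using `code_page` would hand to the client."""
    return text.encode(code_page)


def accepted_as(data: bytes, declared: str) -> bool:
    """True if `data` is valid in the encoding the client declared."""
    try:
        data.decode(declared)
        return True
    except UnicodeDecodeError:
        return False


for cp in ("cp850", "cp1252", "utf-8"):
    raw = console_bytes("ö", cp)
    print(cp, raw.hex(), accepted_as(raw, "utf-8"))
# cp850 94 False
# cp1252 f6 False
# utf-8 c3b6 True
```

Under these assumptions, input typed in a console using code page 850 or 1252 would only be interpreted correctly if client_encoding named that same code page.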

Re: Table name with umlauts

From
Thomas Kellerer
Date:
Tom Lane wrote on 22.11.2010 20:36:
> I had the idea that the Windows version of psql was smart enough to
> set client_encoding based on the console encoding it finds itself
> running under, but I might be wrong about that.  Or maybe you did
> something that overrode its default?
>
>>> I changed to "chcp 1252" before running psql (I tried several other encodings as well)
>
> Try "set client_encoding = win1252", then.
>

Thanks for the hint, unfortunately psql still shows the same behaviour.

Regards
Thomas