UTF-8 encoding problem w/ libpq - Mailing list pgsql-hackers

From Martin Schäfer
Subject UTF-8 encoding problem w/ libpq
Date
Msg-id 11A8567A97B15648846060F5CD818EB8CAC2253F5E@DEV001EX.Dev.cadcorp.net
Responses Re: UTF-8 encoding problem w/ libpq
List pgsql-hackers

I am trying to create database columns whose names contain umlauts, using the UTF8 client encoding. However, the server seems to mess up the column names. In particular, it seems to apply a lowercase operation to each individual byte of the UTF-8 multi-byte sequence.

 

Here is my code:

 

    const wchar_t *strName = L"id_äß";
    wstring strCreate = wstring(L"create table test_umlaut(") + strName + L" integer primary key)";

    PGconn *pConn = PQsetdbLogin("", "", NULL, NULL, "dev503", "postgres", "******");
    if (!pConn) FAIL;
    if (PQsetClientEncoding(pConn, "UTF-8")) FAIL;

    PGresult *pResult = PQexec(pConn, "drop table test_umlaut");
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, ToUtf8(strCreate.c_str()).c_str());
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, "select * from test_umlaut");
    if (!pResult) FAIL;
    if (PQresultStatus(pResult)!=PGRES_TUPLES_OK) FAIL;
    if (PQnfields(pResult)!=1) FAIL;
    const char *fName = PQfname(pResult,0);

    ShowW("Name:     ", strName);
    ShowA("in UTF8:  ", ToUtf8(strName).c_str());
    ShowA("from DB:  ", fName);
    ShowW("in UTF16: ", ToWide(fName).c_str());

    PQclear(pResult);
    PQreset(pConn);

 

(ShowA/W call OutputDebugStringA/W, and ToUtf8/ToWide use WideCharToMultiByte/MultiByteToWideChar with CP_UTF8.)

 

And this is the output generated:

 

Name:     id_äß
in UTF8:  id_äß
from DB:  id_ã¤ãÿ
in UTF16: id_???

 

It seems like the backend thinks the name is in ANSI encoding, not in UTF-8.

If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work without quoting the column name.

Am I missing some setup in either the database or in the use of libpq?

 

I’m using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit.

 

The database uses:

ENCODING = 'UTF8'
LC_COLLATE = 'English_United Kingdom.1252'
LC_CTYPE = 'English_United Kingdom.1252'

 

Thanks for any help,

 

Martin

 
