UTF-8 encoding problem w/ libpq - Mailing list pgsql-hackers

From Martin Schäfer
Subject UTF-8 encoding problem w/ libpq
Date
Msg-id 11A8567A97B15648846060F5CD818EB8CAC2253F5E@DEV001EX.Dev.cadcorp.net
Responses Re: UTF-8 encoding problem w/ libpq
List pgsql-hackers
I try to create database columns with umlauts, using the UTF8 client encoding. However, the server seems to mess up the column names. In particular, it seems to perform a lowercase operation on each byte of the UTF-8 multi-byte sequence.

Here is my code:

    const wchar_t *strName = L"id_äß";
    wstring strCreate = wstring(L"create table test_umlaut(") + strName + L" integer primary key)";

    PGconn *pConn = PQsetdbLogin("", "", NULL, NULL, "dev503", "postgres", "******");
    if (!pConn) FAIL;
    if (PQsetClientEncoding(pConn, "UTF-8")) FAIL;

    PGresult *pResult = PQexec(pConn, "drop table test_umlaut");
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, ToUtf8(strCreate.c_str()).c_str());
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, "select * from test_umlaut");
    if (!pResult) FAIL;
    if (PQresultStatus(pResult) != PGRES_TUPLES_OK) FAIL;
    if (PQnfields(pResult) != 1) FAIL;
    const char *fName = PQfname(pResult, 0);

    ShowW("Name:     ", strName);
    ShowA("in UTF8:  ", ToUtf8(strName).c_str());
    ShowA("from DB:  ", fName);
    ShowW("in UTF16: ", ToWide(fName).c_str());

    PQclear(pResult);
    PQreset(pConn);

(ShowA/W call OutputDebugStringA/W, and ToUtf8/ToWide use WideCharToMultiByte/MultiByteToWideChar with CP_UTF8.)

And this is the output generated:

    Name:     id_äß
    in UTF8:  id_äß
    from DB:  id_ã¤ãÿ
    in UTF16: id_???

It seems like the backend thinks the name is in ANSI encoding, not in UTF-8.
If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work without quoting the column name.
Am I missing some setup in either the database or in the use of libpq?

I’m using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit

The database uses:
ENCODING = 'UTF8'
LC_COLLATE = 'English_United Kingdom.1252'
LC_CTYPE = 'English_United Kingdom.1252'

Thanks for any help,

Martin
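
The suspected byte-wise lowercasing can be reproduced without a server. The following is only a sketch under stated assumptions: Cp1252ToLower and BytewiseLower are hypothetical helpers that imitate what a single-byte lowercase under a Windows-1252 LC_CTYPE would do to each byte. They are not taken from the PostgreSQL source, and whether the backend really takes this path is the open question of this message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: lowercase ONE byte the way a Windows-1252
// single-byte locale would, with no knowledge of UTF-8 sequences.
static unsigned char Cp1252ToLower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')                        // ASCII letters
        return static_cast<unsigned char>(ch + 0x20);
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7)        // À..Þ, skipping ×
        return static_cast<unsigned char>(ch + 0x20);
    if (ch == 0x8A || ch == 0x8C || ch == 0x8E)        // Š, Œ, Ž
        return static_cast<unsigned char>(ch + 0x10);
    if (ch == 0x9F)                                    // Ÿ -> ÿ
        return 0xFF;
    return ch;
}

// Hypothetical helper: apply the single-byte lowercase to every byte
// of a string, even when those bytes are part of a UTF-8 sequence.
static std::string BytewiseLower(const std::string &s)
{
    std::string out = s;
    for (char &c : out)
        c = static_cast<char>(Cp1252ToLower(static_cast<unsigned char>(c)));
    return out;
}

// Self-check using the column name from this message.
static void CheckUmlautExample()
{
    // "id_äß" in UTF-8 is: i d _ C3 A4 C3 9F
    const std::string utf8Name = "id_\xC3\xA4\xC3\x9F";
    // C3 (Ã) becomes E3 (ã), and 9F (Ÿ) becomes FF (ÿ); A4 is unchanged.
    const std::string mangled = BytewiseLower(utf8Name);
    assert(mangled == "id_\xE3\xA4\xE3\xFF");
}
```

Read as Windows-1252, the bytes 69 64 5F E3 A4 E3 FF display as id_ã¤ãÿ, which matches the "from DB" line above. They are no longer valid UTF-8, which would also explain why the conversion to UTF-16 yields id_???.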
