UTF-8 encoding problem w/ libpq - Mailing list pgsql-hackers
From | Martin Schäfer |
---|---|
Subject | UTF-8 encoding problem w/ libpq |
Date | |
Msg-id | 11A8567A97B15648846060F5CD818EB8CAC2253F5E@DEV001EX.Dev.cadcorp.net |
Responses |
Re: UTF-8 encoding problem w/ libpq
|
List | pgsql-hackers |
I am trying to create database columns with umlauts, using the UTF8 client encoding. However, the server seems to mess up the column names. In particular, it seems to perform a lowercase operation on each byte of the UTF-8 multi-byte sequence.

Here is my code:

    const wchar_t *strName = L"id_äß";
    wstring strCreate = wstring(L"create table test_umlaut(") + strName + L" integer primary key)";

    PGconn *pConn = PQsetdbLogin("", "", NULL, NULL, "dev503", "postgres", "******");
    if (!pConn) FAIL;
    if (PQsetClientEncoding(pConn, "UTF-8")) FAIL;

    PGresult *pResult = PQexec(pConn, "drop table test_umlaut");
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, ToUtf8(strCreate.c_str()).c_str());
    if (pResult) PQclear(pResult);

    pResult = PQexec(pConn, "select * from test_umlaut");
    if (!pResult) FAIL;
    if (PQresultStatus(pResult) != PGRES_TUPLES_OK) FAIL;
    if (PQnfields(pResult) != 1) FAIL;
    const char *fName = PQfname(pResult, 0);

    ShowW("Name: ", strName);
    ShowA("in UTF8: ", ToUtf8(strName).c_str());
    ShowA("from DB: ", fName);
    ShowW("in UTF16: ", ToWide(fName).c_str());

    PQclear(pResult);
    PQreset(pConn);

(ShowA/W call OutputDebugStringA/W, and ToUtf8/ToWide use WideCharToMultiByte/MultiByteToWideChar with CP_UTF8.)

And this is the output generated:

    Name: id_äß
    in UTF8: id_äß
    from DB: id_ã¤ãÿ
    in UTF16: id_???

It seems like the backend thinks the name is in ANSI encoding, not in UTF-8.

If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work without quoting the column name.

Am I missing some setup in either the database or in the use of libpq?

I'm using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit.

The database uses:

    ENCODING = 'UTF8'
    LC_COLLATE = 'English_United Kingdom.1252'
    LC_CTYPE = 'English_United Kingdom.1252'

Thanks for any help,

Martin