Re: unicode questions - Mailing list pgsql-hackers

From - -
Subject Re: unicode questions
Date
Msg-id 1842a500912241537y7d7cf845i8c6e1f74a19f43d1@mail.gmail.com
In response to Re: unicode questions  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Thu, Dec 24, 2009 at 5:40 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> 1) If I set my database and connection encoding to UTF-8, does pg (and
>> future versions of it) guarantee that Unicode code points are stored
>> unmodified? Or could it be that pg does some Unicode
>> normalization/manipulation with them before storing a string, or when
>> retrieving a string?
>>
>> The reason I'm asking is that I've built a little program that reads
>> in and stores text and explicitly analyzes the text at a later point
>> in time, including whether the text is in NFC, NFD or
>> neither. Since I want to store the text in the database, it is very
>> important for PG not to fiddle around with the normalization unless my
>> program explicitly tells PG to do that.
>
> We don't do any normalization. If the client gives us UTF8 then we store
> exactly what it gives us, and return exactly that.

OK.
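
For context, here is a minimal sketch of the kind of normalization check described above, assuming Python and its standard unicodedata module. The helper name normalization_form and the sample strings are hypothetical, chosen only for illustration; this is not the actual program mentioned in the question.

```python
import unicodedata


def normalization_form(text: str) -> str:
    """Classify text as 'NFC', 'NFD', 'both', or 'neither'.

    Hypothetical helper: a string is in a given form if normalizing
    it to that form leaves it unchanged.
    """
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "both"  # e.g. plain ASCII text
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"


composed = "\u00e9"      # e-acute as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent
mixed = composed + decomposed
```

If the database stores and returns the code points unmodified, as stated above, such a check should give the same answer before and after a round trip through a UTF-8 database.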

>
> (This question is not really a -hackers question. The correct forum is
> pgsql-general. Please make sure you use the correct forum in future.)

Are you sure? The description for -hackers says: "Discussion of
current development issues, problems and bugs, and proposed new
features.", which seems to be exactly where you'd ask my 2nd question,
which is still unanswered.

>>
>> 2) How far along is normalization support in PG? When I checked a long
>> time ago, there was no such support. Now that the SQL standard mandates
>> a NORMALIZE function, that may have changed. Any updates?
>>

Kind regards.

