On 3/15/19 11:59 AM, Gunther wrote:
> This is not an issue for "hackers" nor "performance" in fact even for
> "general" it isn't really an issue.
As long as it's already been posted, may as well make it something
helpful to find in the archive.
> Understand charsets -- character set, code point, and encoding. Then
> understand how encoding and string literals and "escape sequences" in
> string literals might work.
Good advice for sure.
> Know that UNICODE today is the one standard, and there is no more need
I wasn't sure from the question whether the original poster was in
a position to choose the encoding of the database. Lots of things are
easier if it can be set to UTF-8 these days, but perhaps it's a legacy
situation.
Maybe a good start would be to run

  SHOW server_encoding;
  SHOW client_encoding;

and then look up what that encoding (or those encodings, if
different) can and can't represent, and go from there.
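For a quick offline look at what a given encoding can and can't
represent, here is a minimal Python sketch. The helper name
representable and the PG_TO_PY mapping are mine, for illustration only,
and Python's codecs are not guaranteed to match PostgreSQL's conversion
tables in every corner, so treat the answer as a first approximation.

```python
# Hypothetical mapping from a few PostgreSQL encoding names (as reported
# by SHOW server_encoding) to the closest Python codec names. This is an
# assumption for illustration, not an official correspondence.
PG_TO_PY = {
    "UTF8": "utf-8",
    "LATIN1": "latin-1",
    "WIN1252": "cp1252",
}


def representable(text: str, codec: str) -> bool:
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    samples = ["café", "€", "日本語"]
    for pg_name, codec in PG_TO_PY.items():
        for s in samples:
            verdict = "ok" if representable(s, codec) else "NOT representable"
            print(f"{pg_name:8} {s!r:10} {verdict}")
```

Swap in whatever SHOW server_encoding reported and a few strings from
the actual data to see where the trouble is likely to be.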
It's worth knowing that, when the server encoding isn't UTF-8,
PostgreSQL will have the obvious limitations entailed by that,
but also some non-obvious ones that may be surprising, e.g. [1].
-Chap
[1]
https://www.postgresql.org/message-id/CA%2BTgmobUp8Q-wcjaKvV%3DsbDcziJoUUvBCB8m%2B_xhgOV4DjiA1A%40mail.gmail.com