Re: Question regarding UTF-8 data and "C" collation on definition of field of table - Mailing list pgsql-general

From	Dionisis Kontominas
Subject	Re: Question regarding UTF-8 data and "C" collation on definition of field of table
Date	February 6, 2023 01:07:01
Msg-id	CAB4Evu0OqyENXDKbxJDiUSXdWya9-X3Djzt76P1Uuk8ZgiJJKg@mail.gmail.com
In response to	Re: Question regarding UTF-8 data and "C" collation on definition of field of table (Ron <ronljohnsonjr@gmail.com>)
List	pgsql-general

Because if I don't specify the collation and ctype, the database seems to take its defaults from the OS, which in my case is English_Netherlands.1252 (database encoding UTF8). That might not be best for columns holding truly Unicode content, so I investigated the "C" option, which also does not seem to work and might be worse.

To reframe my question: when you expect multilingual data in a column and the database encoding is UTF8, which seems to cover the storage need, what could be considered best practice (if one really exists) for collation and ctype?
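
For illustration, here is a rough sketch of what a "C" collation means for ordering multilingual UTF-8 text. It assumes that "C" compares strings byte by byte, which for UTF-8 data amounts to code point order. It is a Python model of that comparison, not PostgreSQL itself; the helper name `c_collation_sort` and the sample strings are made up for the example.

```python
def c_collation_sort(values):
    """Sort strings by their UTF-8 bytes, modelling a byte-wise ("C") comparison."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


# Hypothetical sample data mixing English, Greek and Chinese.
names = ["apple", "Zebra", "αlpha", "Ωmega", "中文"]

c_sorted = c_collation_sort(names)

# Uppercase ASCII sorts before lowercase ASCII, and uppercase Greek before
# lowercase Greek, because only the byte values are compared.
print(c_sorted)
```

Under these assumptions the data is stored and compared without loss, but the order follows encoding values rather than any language's alphabet ("Zebra" comes before "apple"). If linguistic ordering matters for a particular query or column, a language-aware collation would be needed there instead.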

On Mon, 6 Feb 2023 at 01:57, Ron <ronljohnsonjr@gmail.com> wrote:

Why are you specifying the collation to be "C" when the default db encoding
is UTF8, and UTF-8 can encode Greek, Chinese and English text?

On 2/5/23 17:08, Dionisis Kontominas wrote:
> Hello all,
>
> I have a question regarding the definition of the type of a character
> field in a table and more specifically about its collation and UTF-8
> characters and strings.
>
> Let's say that the definition is for example as follows:
>
> name character varying(8) COLLATE pg_catalog."C" NOT NULL
>
> and also assume that the database default encoding is UTF8 and that the
> Collate and Ctype are "C". I plan to store strings of various languages in
> this field.
>
> Are these the correct settings that I should have used on creation of
> the database?
>
> Thank you in advance!
>
> Kindest regards,
>
> Dionisis Kontominas

--
Born in Arizona, moved to Babylonia.
