Laurenz Albe wrote:
> > Then choose UTF8.
>
> Right! And I recommend "C" for the collation.
Yet the "C" collation is unsuitable for character classification and
case conversion beyond ASCII.
For instance, it does not treat accented letters as letters,
so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.
C.UTF-8 solves that, and since Postgres 17 it's available on all operating
systems through the builtin provider.
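A quick way to see the difference is sketched below. It assumes a
UTF8-encoded database on Postgres 17 or later, where the predefined
pg_c_utf8 collation (builtin provider, C.UTF-8 locale) should exist:

```sql
-- Case conversion: "C" only maps ASCII letters.
SELECT upper('été' COLLATE "C")       AS upper_c,        -- 'éTé'
       upper('été' COLLATE pg_c_utf8) AS upper_c_utf8;   -- 'ÉTÉ'

-- Character classification in regular expressions:
-- under "C", 'é' is not a word character.
SELECT ('é' COLLATE "C") ~ '\w'       AS word_c,         -- false
       ('é' COLLATE pg_c_utf8) ~ '\w' AS word_c_utf8;    -- true
```

Both collations sort by code point, so the difference shows up only in
classification and case mapping, not in ordering.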
So if you target Postgres 17+, C.UTF-8 from the builtin provider is
a better choice for UTF-8 databases than "C".
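As a minimal sketch of what that looks like when creating a database
(the name mydb is just a placeholder, and this assumes Postgres 17 or
later):

```sql
-- Hypothetical database name; requires Postgres 17+.
-- template0 is used because the locale settings may differ
-- from those of template1.
CREATE DATABASE mydb
    TEMPLATE template0
    ENCODING 'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE 'C.UTF-8';
```

LC_COLLATE and LC_CTYPE are not given here, so they are inherited from
the template and must be compatible with UTF8. For a whole cluster,
initdb should accept the equivalent --locale-provider=builtin and
--builtin-locale=C.UTF-8 options.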
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/