Re: unicode match normal forms - Mailing list pgsql-general

From goldgraeber-werbetechnik@t-online.de
Subject Re: unicode match normal forms
Date
Msg-id wolfgang-1210518075009.A027656@linux-tuxedo
In response to Re: unicode match normal forms  (Matthias Apitz <guru@unixarea.de>)
List pgsql-general
>> On Monday, May 17, 2021 at 01:27:40 p.m. -0000, hamann.w@t-online.de wrote:
>>
>> > Hi,
>> >
>> > in Unicode the letter ä exists in two versions - Linux and Windows use the composed form whereas macOS prefers
>> > the decomposed form. Is there any way to make a semi-exact match that accepts both variants?
>> > This question is not about fulltext but about matching filenames across a network - I wish to avoid two
>> > identical-looking filenames.
>>
>> There is only *one* codepoint for the German letter a Umlaut:
>> LATIN SMALL LETTER A WITH DIAERESIS U+00E4
>>
Hi Matthias,

unfortunately there is also the letter a followed by a combining diaeresis (U+0061 U+0308), and that is what macOS uses.
The Mac seems to prefer decomposed characters in other contexts as well, so in my
everyday job I used to have fun with product catalogues from a few companies.
Depending on the computer used for adding / editing a product, the relevant field could be
ISO Latin-1, composed UTF-8, or decomposed UTF-8.
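
To illustrate the three cases, here is a minimal Python sketch of how such a field could be brought to one form before comparing. It is only a sketch under assumptions: the helper names to_nfc and same_name are made up for this example, and the fallback assumes that any bytes that are not valid UTF-8 are ISO Latin-1, which will not hold for every catalogue.

```python
import unicodedata


def to_nfc(raw: bytes) -> str:
    """Decode a field of unknown origin and return it in composed (NFC) form."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: whatever is not valid UTF-8 is treated as ISO Latin-1.
        text = raw.decode("iso-8859-1")
    return unicodedata.normalize("NFC", text)


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e4"      # ä as a single code point (U+00E4)
decomposed = "a\u0308"   # a followed by COMBINING DIAERESIS (U+0308)

print(composed == decomposed)           # plain comparison sees two different strings
print(same_name(composed, decomposed))  # comparison after normalization
```

Normalizing both sides to NFC (or both to NFD) before comparing is what makes the two spellings of ä match; which of the two forms is chosen matters less than using the same one everywhere.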

>> That said, I count having such (non-ASCII) chars in file names as a bad
>> idea.
I usually try to avoid whitespace and accented characters in filenames, to be able to use ssh and scp
without much hassle, but I am not the user in this case.

Now, if I look at a music collection (stored as folders with mp3 files for the tracks), I would really prefer
"Einstürzende Neubauten" over Einstuerzende_Neubauten

Regards
Wolfgang
