Re: unicode match normal forms - Mailing list pgsql-general

From goldgraeber-werbetechnik@t-online.de
Subject Re: unicode match normal forms
Msg-id wolfgang-1210518071808.A017656@linux-tuxedo
In response to Re: unicode match normal forms  (Gianni Ceccarelli <dakkar@thenautilus.net>)
List pgsql-general
Hi Gianni,

many thanks for your detailed response.
It turned out that my PostgreSQL installation is too old for normalize(), so I will probably
a) use an external script to normalize existing data
b) change application code to normalize data before inserting or searching
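
For option b), a minimal sketch of what the application-side step could look like, assuming the application is written in Python (the thread does not say which language is used). The helper name `to_nfc` and the sample filename are hypothetical; `unicodedata.normalize` is part of the Python standard library. NFC is used because it is the form suggested in the quoted reply below.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)


# The same visible letter "ä" in its two encodings.
composed = "\u00e4"      # single code point, typical on Linux and Windows
decomposed = "a\u0308"   # "a" followed by a combining diaeresis, typical on macOS

# The raw strings differ, so an exact match fails without normalization.
assert composed != decomposed

# After normalization both variants compare equal.
assert to_nfc(composed) == to_nfc(decomposed)

# Hypothetical filename: normalize it before inserting or searching.
filename = to_nfc("Ma\u0308rz.txt")
```

The same helper could be run over every stored value in a one-off script for option a), and applied to search terms before they are passed to the query.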

Regards
Wolfgang
>> On 17 May 2021 13:27:40 -0000
>> hamann.w@t-online.de wrote:
>> > in unicode letter ä exists in two versions - linux and windows use a
>> > composite whereas macos prefers the decomposed form. Is there any way
>> > to make a semi-exact match that accepts both variants?
>>
>> You should probably normalise the strings in whatever application code
>> handles the inserting. NFC is the "usually sensible" normal form to
>> use.
>>
>> If you can't change the application code, you may use a trigger and
>> apply the `normalize(text[,form])→text` function to the values
>>
>> https://www.postgresql.org/docs/13/functions-string.html#id-1.5.8.10.5.2.2.7.1.1.2
>>
>> something vaguely like (totally untested!)::
>>
>>   create function normalize_filename() returns trigger as $$
>>   begin
>>     new.filename := normalize(new.filename);
>>     return new;
>>   end;
>>   $$ language plpgsql;
>>
>>   create trigger normalize_filename
>>   before insert or update
>>   on that_table
>>   for each row
>>   execute function normalize_filename();
>>
>> --
>>     Dakkar - <Mobilis in mobile>
>>     GPG public key fingerprint = A071 E618 DD2C 5901 9574
>>                                  6FE2 40EA 9883 7519 3F88
>>                         key id = 0x75193F88
>>






