Re: unicode match normal forms - Mailing list pgsql-general

From Gianni Ceccarelli
Subject Re: unicode match normal forms
Msg-id 20210517150450.50e499f1@exelion
In response to Re: unicode match normal forms  (Matthias Apitz <guru@unixarea.de>)
List pgsql-general
On Mon, 17 May 2021 15:45:00 +0200
Matthias Apitz <guru@unixarea.de> wrote:
> There is only *one* codepoint for the German letter a Umlaut:
> LATIN SMALL LETTER A WITH DIAERESIS U+00E4

True. On the other hand, the sequence:

* U+0061 LATIN SMALL LETTER A
* U+0308 COMBINING DIAERESIS

will render exactly the same glyph. The two forms are canonically
equivalent: U+00E4 is the NFC form (Normalization Form C, canonical
composition), and U+0061 U+0308 is the NFD form (Normalization Form D,
canonical decomposition).

See https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization
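
To make the difference concrete, here is a minimal sketch in Python using
the standard library's unicodedata module. The helper name
canonically_equal is my own, purely for illustration; it is not something
PostgreSQL or the original poster's application provides.

```python
import unicodedata

# The two spellings of the German a-umlaut discussed above.
composed = "\u00e4"      # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # U+0061 followed by U+0308 COMBINING DIAERESIS


def canonically_equal(left: str, right: str) -> bool:
    """Compare two strings after bringing both to NFC.

    Hypothetical helper: normalizing both sides to the same form
    (NFC here, NFD would work equally well) makes canonically
    equivalent sequences compare equal.
    """
    return unicodedata.normalize("NFC", left) == unicodedata.normalize("NFC", right)


# A plain comparison works on code points, so the two spellings differ.
print(composed == decomposed)                   # False
print(len(composed), len(decomposed))           # 1 2

# Normalizing converts one spelling into the other.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True

# After normalization the two spellings match.
print(canonically_equal(composed, decomposed))  # True
```

The same idea applies to matching in a database: unless both the stored
text and the search term are in the same normal form, a code point
comparison will treat them as different strings.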

-- 
    Dakkar - <Mobilis in mobile>
    GPG public key fingerprint = A071 E618 DD2C 5901 9574
                                 6FE2 40EA 9883 7519 3F88
                        key id = 0x75193F88



