Re: Hostnames, IDNs, Punycode and Unicode Case Folding - Mailing list pgsql-general

From Mike Cardwell
Subject Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Msg-id 20141230001858.GA24297@glue.grepular.com
In response to Re: Hostnames, IDNs, Punycode and Unicode Case Folding  (Andrew Sullivan <ajs@crankycanuck.ca>)
Responses Re: Hostnames, IDNs, Punycode and Unicode Case Folding  (Andrew Sullivan <ajs@crankycanuck.ca>)
List pgsql-general
* on the Mon, Dec 29, 2014 at 07:00:05PM -0500, Andrew Sullivan wrote:

>> CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname)));
>
> This wouldn't work to get the original back if you have any IDNA2003
> data, because punycode-encoding the UTF-8 under IDNA2003 and then
> punycode-decoding it doesn't always result in the same label.  See my
> other message.

The original is the thing that is stored in the database. I wouldn't need to
do any conversion to get the original back. In my example I am storing
the original and creating an index on the punycode version.

This is exactly the same method that we commonly use for performing
case-insensitive text searches using lower() indexes.
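
To make the pattern concrete, here is a minimal, self-contained sketch rather than a definitive implementation. It assumes Python's sqlite3 module as a stand-in for PostgreSQL, and it registers a hypothetical punycode_encode() function backed by Python's built-in "idna" codec, which follows IDNA2003. Neither choice comes from the thread. The table and index names are taken from the example above. The stored value is never converted back: the original is read straight from the column, and the encoded form exists only inside the index expression.

```python
import sqlite3


def punycode_encode(hostname):
    # Hypothetical stand-in for the punycode_encode() SQL function.
    # Python's "idna" codec applies IDNA2003 rules; ASCII-only labels are
    # passed through unchanged, which is why lower() is still needed.
    return hostname.encode("idna").decode("ascii")


conn = sqlite3.connect(":memory:")
# An index expression may only call deterministic functions.
conn.create_function("punycode_encode", 1, punycode_encode, deterministic=True)

conn.execute("CREATE TABLE hostnames (hostname TEXT NOT NULL)")
conn.execute(
    "CREATE UNIQUE INDEX hostnames_hostname_key "
    "ON hostnames (lower(punycode_encode(hostname)))"
)

# The original spelling is what gets stored.
conn.execute("INSERT INTO hostnames VALUES (?)", ("Bücher.example",))

# A differently cased spelling maps to the same index key and is rejected.
try:
    conn.execute("INSERT INTO hostnames VALUES (?)", ("BÜCHER.EXAMPLE",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Look up by the derived key; the column returns the original untouched.
stored = conn.execute(
    "SELECT hostname FROM hostnames "
    "WHERE lower(punycode_encode(hostname)) = lower(punycode_encode(?))",
    ("bücher.EXAMPLE",),
).fetchone()[0]
```

The lookup has to repeat the exact expression used in the index, just as a lower() index is only used when the query also compares against lower(column).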

--
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key    35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4
