Home > mailing lists

Re: Hostnames, IDNs, Punycode and Unicode Case Folding - Mailing list pgsql-general

From	Mike Cardwell
Subject	Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Date	December 30, 2014 00:26:17
Msg-id	20141230002611.GB24297@glue.grepular.com Whole thread Raw
In response to	Re: Hostnames, IDNs, Punycode and Unicode Case Folding (David G Johnston <david.g.johnston@gmail.com>)
List	pgsql-general

Tree view

* on the Mon, Dec 29, 2014 at 05:01:44PM -0700, David G Johnston wrote:

>> CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames
>> (lower(punycode_encode(hostname)));
>>
>> That would prevent adding more than one representation for the same
>> hostname
>> to the column.
>
> Except two different hostname can resolve to the same
> punycode_encode(hostname) value

That's exactly what I'm taking advantage of...

> so the unique index won't work.

I think you misunderstand what I'm suggesing. I'm suggesting storing the
unicode version in the database, and then creating a unique index on the
result of the punycode function to prevent more than one representation
of the same hostname being inserted.

> It was also mentioned that using the Perl encoding function was
> non-performant; which is why caching the data into a memoization table has
> value.

It was non-performant because it involved unnecessarily starting a Perl
interpreter. Not because the algorithm its self is slow.

> WHERE lower(punycode_encode(hostname)) =
> > lower(punycode_encode('any-representation'))
> >
>
> I'm not for knowing the rules of punycode but I'm not seeing what value
> lower() provides here...

Case insensitive matching. So that "EXAMPLE.COM" = "example.com"

--
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key    35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4

Attachment

signature.asc

pgsql-general by date:

From: Andrew Sullivan
Date: 30 December 2014, 00:26:04
Subject: Re: Hostnames, IDNs, Punycode and Unicode Case Folding

From: David Johnston
Date: 30 December 2014, 00:26:54
Subject: Re: Rollback on include error in psql

Re: Hostnames, IDNs, Punycode and Unicode Case Folding - Mailing list pgsql-general

Attachment

Previous

Next