Re: ICU integration - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: ICU integration
Date
Msg-id CAM3SWZTLHsgFryPcOFFTCe-TTP42y7kwfrErxM1Fk6uTC=KCfw@mail.gmail.com
In response to Re: ICU integration  (Doug Doole <ddoole@salesforce.com>)
Responses Re: ICU integration
List pgsql-hackers
On Tue, Sep 6, 2016 at 10:40 AM, Doug Doole <ddoole@salesforce.com> wrote:
> - Suppose in ICU X.X, AA = Å but in ICU Y.Y AA != Å. Further suppose there
> was an RI constraint where the primary key used AA and the foreign key used
> Å. If ICU was updated, the RI constraint between the rows would break,
> leaving an orphaned foreign key.

This isn't a problem for Postgres, or at least wouldn't be right now,
because we don't have case-insensitive collations. We use a
strcmp()/memcmp() tie-breaker when strcoll() indicates equality, and
we also make the general notion of text equality mean binary
equality. So two strings with different bytes, such as AA and Å, never
compare as equal, whatever the collation library says. In short, we
are aware that cases like this exist. IIRC Unicode Technical Standard
#10 independently suggests this tie-breaker strategy as one way of
dealing with problems like this, in a pinch, though I think we came up
with the idea on our own, in response to a bug report over 10 years
ago.
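
To make the tie-breaker concrete, here is a minimal sketch in C. It is
not the actual Postgres code: the function names
collate_with_tiebreak() and text_equal() are made up for illustration,
and it assumes NUL-terminated strings and a process locale that has
already been set up.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: order two NUL-terminated
 * strings using the current LC_COLLATE locale, and fall back to a
 * bytewise comparison whenever the locale reports them as equal.
 */
static int
collate_with_tiebreak(const char *a, const char *b)
{
    int result;

    assert(a != NULL && b != NULL);

    result = strcoll(a, b);

    /* Tie-breaker: only strings with identical bytes may compare equal. */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}

/*
 * Hypothetical helper: text equality defined as binary equality, so it
 * never depends on the collation library at all.
 */
static int
text_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    return strcmp(a, b) == 0;
}
```

With this shape, collate_with_tiebreak() returns 0 only for
byte-identical inputs, so a change in how the collation library treats
AA versus Å can change sort order but cannot change which strings
count as equal.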

I would like to get case-insensitive collations some day, and was
really hoping that ICU would help. That being said, the need for a
strcmp() tie-breaker makes that hard. Oh well.

--
Peter Geoghegan


