Re: Collation versioning - Mailing list pgsql-hackers

From Douglas Doole
Subject Re: Collation versioning
Date
Msg-id CADE5jYKy9YxWucVENdpPMRj3yy97oN8Nvxo9ZsrLdDyxWykasQ@mail.gmail.com
In response to Re: Collation versioning  (Greg Stark <stark@mit.edu>)
Responses Re: Collation versioning  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, Sep 17, 2018 at 12:32 PM Greg Stark <stark@mit.edu> wrote:
> This seems like a terrible idea in the open source world. Surely collation versioning means new ICU libraries can still provide the old collation rules so even if you update the library you can request the old version? We shouldn't need that actual old code with all its security holes and bugs just to get the old collation version.

We asked the ICU team long and hard for this feature, but they kept arguing that it was too hard to do. There are apparently some tight couplings between the code and each version of CLDR, so the only way to support old collations is to ship the entire old library. (They even added make rules to allow the entire API to be version-extended to accommodate this requirement.)

Even bug fixes are potentially problematic because the fix may alter how some code points collate. The ICU team won't (or at least wouldn't - been a few years since I dealt with them) guarantee any sort of backwards compatibility between code drops.
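
To make the risk concrete, here is a minimal sketch in Python of why a change in how a single code point collates matters for anything that stores data in sorted order. Everything in it is hypothetical: key_v1 and key_v2 are made-up stand-ins for two library versions, not real ICU or CLDR rules, and the "index" is just a sorted list searched by bisection.

```python
from bisect import bisect_left

# Two hypothetical collation "versions", modeled as sort-key functions.
# They are illustrative only and do not reflect real ICU/CLDR behavior.
def key_v1(s):
    # v1: plain code point order
    return [ord(c) for c in s]

def key_v2(s):
    # v2: an imagined "bug fix" that makes '-' sort after all letters
    return [ord(c) + 1000 if c == "-" else ord(c) for c in s]

def build_index(words, key):
    # An "index" here is simply the words sorted under one collation.
    return sorted(words, key=key)

def lookup(index, word, key):
    # Binary search that assumes the index is sorted under `key`.
    keys = [key(w) for w in index]
    i = bisect_left(keys, key(word))
    return i < len(index) and index[i] == word

words = ["a-b", "aab", "ab", "abc"]
index_v1 = build_index(words, key_v1)

found_with_old_rules = lookup(index_v1, "a-b", key_v1)
found_with_new_rules = lookup(index_v1, "a-b", key_v2)
print(found_with_old_rules, found_with_new_rules)
```

The entry is still physically in the list, but the search built on the new rules walks past it, because the list is no longer sorted from the new version's point of view. That is the failure mode that makes even a well-intentioned fix a compatibility problem.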

As an aside, they did look at making the CLDR data a separate data file that could be read by any version of the code (before finding that there were too many dependencies). One thing they discovered is that this approach didn't save much disk space, since the CLDR data is something like 90-95% of the library. So while it would have made the calling code a lot cleaner, it wasn't the huge footprint win we'd been hoping for.
