On Mon, Jul 1, 2019 at 10:13 PM Daniel Verite <daniel@manitou-mail.org> wrote:
>
> > > I think you'd be better off to define and document this as "reindex
> > > only collation-sensitive indexes", without any particular reference
> > > to a reason why somebody might want to do that.
> >
> > We should still document that indexes based on ICU would be excluded?
>
> But why exclude them?
> As a data point, in the last 5 years, the en_US collation in ICU
> had 7 different versions (across 11 major versions of ICU):
>
> ICU    Unicode   en_US
>
> 54.1   7.0       137.56
> 55.1   7.0       153.56
> 56.1   8.0       153.64
> 57.1   8.0       153.64
> 58.2   9.0       153.72
> 59.1   9.0       153.72
> 60.2   10.0      153.80
> 61.1   10.0      153.80
> 62.1   11.0      153.88
> 63.2   11.0      153.88
> 64.2   12.1      153.97
>
> The rightmost column corresponds to pg_collation.collversion
> in Postgres.
> Each time there's a new Unicode version, it seems
> all collation versions are bumped. And there's a new Unicode
> version pretty much every year these days.
> Based on this, most ICU upgrades in practice would require reindexing
> affected indexes.
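
As a quick sanity check on the figures quoted above, here is a minimal
Python sketch that recounts them. The rows are hard-coded from the quoted
table, and the helper name summarize is only illustrative; nothing here
talks to ICU or Postgres.

```python
# Rows copied from the quoted table: (ICU version, Unicode version, en_US collversion).
ROWS = [
    ("54.1", "7.0", "137.56"),
    ("55.1", "7.0", "153.56"),
    ("56.1", "8.0", "153.64"),
    ("57.1", "8.0", "153.64"),
    ("58.2", "9.0", "153.72"),
    ("59.1", "9.0", "153.72"),
    ("60.2", "10.0", "153.80"),
    ("61.1", "10.0", "153.80"),
    ("62.1", "11.0", "153.88"),
    ("63.2", "11.0", "153.88"),
    ("64.2", "12.1", "153.97"),
]


def summarize(rows):
    """Return (distinct collversions, single-step upgrades that change it, total steps)."""
    versions = [collversion for _icu, _unicode, collversion in rows]
    distinct = len(set(versions))
    # Compare each ICU release with the next one in the table.
    changed = sum(1 for prev, cur in zip(versions, versions[1:]) if prev != cur)
    return distinct, changed, len(versions) - 1


distinct, changed, steps = summarize(ROWS)
print(f"{distinct} distinct en_US versions across {len(ROWS)} ICU releases")
print(f"{changed} of {steps} single-release upgrades change the collation version")
```

By this count, the table has 7 distinct en_US versions, and 6 of the 10
single-release upgrades change the version.
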

I have a vague recollection that ICU provides some backward
compatibility, so that even after upgrading the library you can still get
the sort order that was in effect when you built your indexes, though
maybe only for a limited number of versions.
Even if that's just me being delusional, I'd still prefer Alvaro's
approach of having distinct switches for each collation system.