On Fri, Sep 2, 2022 at 3:19 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 2 Sep 2022 13:43:50 +0700, John Naylor <john.naylor@enterprisedb.com> wrote in
> > If there is any doubt about including all of Cf, we could also just
> > add a branch in wchar.c to hard-code the 200B-200F range.
>
> If every way has defect to the similar extent, I think we will choose
> to use authoritative data at least for the first step. We might want
> to have additional filtering on it but it would be another issue,
> maybe.
>
> Attached is the first cut of that. (The commit messages is not great,
> though.)

Okay, the patch looks good to me overall. Comparing releases, some
other ranges were in v11 but left out in v12 with the transition to
using a script:

0x070F
{0x200B, 0x200F}
{0x202A, 0x202E}
{0x206A, 0x206F}
0xFEFF
{0xFFF9, 0xFFFB}

Does anyone want to advocate for backpatching these missing ranges to
v12 and up? v12 still has the table in-line, so that is trivial to
remedy, but v13 and up use a script, so these exceptions would likely
have to use hard-coded branches to avoid bringing in new changes.
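
For illustration only, hard-coded branches for the ranges listed above
might look roughly like the sketch below. The function name
ucs_in_dropped_range is hypothetical and does not come from wchar.c or
the attached patch; the only assumption is that the width code would
treat a code point as zero-width when this check returns true.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of wchar.c): returns true if the code
 * point falls in one of the ranges that were in the v11 table but are
 * absent from the script-generated table.
 */
static bool
ucs_in_dropped_range(unsigned int ucs)
{
	/* single code points: 0x070F and 0xFEFF */
	if (ucs == 0x070F || ucs == 0xFEFF)
		return true;

	/* {0x200B, 0x200F} */
	if (ucs >= 0x200B && ucs <= 0x200F)
		return true;

	/* {0x202A, 0x202E} */
	if (ucs >= 0x202A && ucs <= 0x202E)
		return true;

	/* {0x206A, 0x206F} */
	if (ucs >= 0x206A && ucs <= 0x206F)
		return true;

	/* {0xFFF9, 0xFFFB} */
	if (ucs >= 0xFFF9 && ucs <= 0xFFFB)
		return true;

	return false;
}
```

A check like this would sit in front of the existing table lookup, so
the generated table itself stays untouched in the back branches.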

If so, does anyone want to advocate for including this patch in v15?
It claims Unicode 14.0.0, and this would make that claim more
technically correct as well as avoiding additional branches.

--
John Naylor
EDB: http://www.enterprisedb.com