At Thu, 1 Sep 2022 15:00:38 +0700, John Naylor <john.naylor@enterprisedb.com> wrote in
> On Thu, Sep 1, 2022 at 2:13 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> > problem is in bad width of invisible char 200E
>
> I removed this comment in bab982161e since it didn't match the code.
> I'd be interested to see what happened after v12.
>
> - * - Other format characters (general category code Cf in the Unicode
> - *   database) and ZERO WIDTH SPACE (U+200B) have a column width of 0.
>
> UnicodeData.txt has this:
>
> 200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
> 200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
> 200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
> 200E;LEFT-TO-RIGHT MARK;Cf;0;L;;;;;N;;;;;
> 200F;RIGHT-TO-LEFT MARK;Cf;0;R;;;;;N;;;;;
>
> So maybe we need to take Cf characters in this file into account, in
> addition to Me and Mn (combining characters).

Including them in unicode_combining_table.h actually worked, but I'm
not sure it is valid to lump Cf characters in with Mn/Me.

> diff --git a/src/common/unicode/generate-unicode_combining_table.pl b/src/common/unicode/generate-unicode_combining_table.pl
> index 8177c20260..7030bc637b 100644
> --- a/src/common/unicode/generate-unicode_combining_table.pl
> +++ b/src/common/unicode/generate-unicode_combining_table.pl
> @@ -25,7 +25,7 @@ foreach my $line (<ARGV>)
>      my @fields = split ';', $line;
>      $codepoint = hex $fields[0];
>
> -    if ($fields[2] eq 'Me' || $fields[2] eq 'Mn')
> +    if ($fields[2] eq 'Me' || $fields[2] eq 'Mn' || $fields[2] eq 'Cf')
>      {
>          # combining character, save for start of range
>          if (!defined($range_start))
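
As a rough illustration of what that one-line change does to the
generated ranges, here is a small Python sketch. It is not the actual
Perl script, only an approximation of its range-collecting idea, and
collect_ranges and SAMPLE are made-up names. The input is just the
five UnicodeData.txt lines quoted above.

```python
# Sketch only: approximates the range-collecting idea of the generator
# script. collect_ranges and SAMPLE are hypothetical names.

SAMPLE = """\
200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
200E;LEFT-TO-RIGHT MARK;Cf;0;L;;;;;N;;;;;
200F;RIGHT-TO-LEFT MARK;Cf;0;R;;;;;N;;;;;
"""


def collect_ranges(lines, categories):
    """Return (first, last) codepoint ranges whose general category
    (third field of each line) is in `categories`."""
    ranges = []
    range_start = None
    prev_codepoint = None
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        codepoint = int(fields[0], 16)
        if fields[2] in categories:
            # matching character: remember where the range started
            if range_start is None:
                range_start = codepoint
        elif range_start is not None:
            # non-matching character: close the open range
            ranges.append((range_start, prev_codepoint))
            range_start = None
        prev_codepoint = codepoint
    if range_start is not None:
        # flush a range that runs to the end of the input
        ranges.append((range_start, prev_codepoint))
    return ranges


# With only Me/Mn, none of the sample lines are picked up.
print(collect_ranges(SAMPLE.splitlines(), {"Me", "Mn"}))

# With Cf added, U+200B..U+200F collapse into a single range.
for first, last in collect_ranges(SAMPLE.splitlines(), {"Me", "Mn", "Cf"}):
    print(f"{{0x{first:04X}, 0x{last:04X}}},")
```

With Cf included, U+200E ends up inside one zero-width range together
with its neighbours, which matches the behaviour seen with the patch.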

By the way, I was quite annoyed at how hard it was to get changes
under src/common reflected in the final binary. There are two hops of
missing dependencies, and on top of that ccache got in my way. I see
that Andres once meant to try that using --dependency-files, but I
hope we can make that propagation automatic even if we have to define
the dependencies manually.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center