pgsql: Update display widths as part of updating Unicode - Mailing list pgsql-committers

From John Naylor
Subject pgsql: Update display widths as part of updating Unicode
Date
Msg-id E1mJGx6-0002Xn-4v@gemulon.postgresql.org
List pgsql-committers
Update display widths as part of updating Unicode

The hardcoded "wide character" set in ucs_wcwidth() was last updated
around the Unicode 5.0 era.  This led to misalignment when printing
emojis and other codepoints that have since been designated
wide or full-width.
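
As a rough illustration of what a table-driven width lookup looks like, here is
a minimal sketch in C. The type name, the function names, and the five-entry
table are hypothetical and chosen for illustration; they are not the contents
of wchar.c or of the generated header. The sketch also ignores control and
combining characters, which a real width function must handle separately.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interval type; the real table layout may differ. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
} codepoint_range;

/* Tiny illustrative excerpt; must be sorted and non-overlapping. */
static const codepoint_range wide_ranges[] = {
	{0x1100, 0x115F},			/* Hangul Jamo initial consonants */
	{0x3041, 0x3096},			/* Hiragana */
	{0x4E00, 0x9FFF},			/* CJK Unified Ideographs */
	{0xFF01, 0xFF60},			/* Fullwidth ASCII variants */
	{0x1F600, 0x1F64F}			/* Emoticons */
};

/* Binary search over sorted ranges; returns 1 if ucs falls inside one. */
static int
in_ranges(uint32_t ucs, const codepoint_range *table, size_t n)
{
	size_t		lo = 0;
	size_t		hi = n;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (ucs < table[mid].first)
			hi = mid;
		else if (ucs > table[mid].last)
			lo = mid + 1;
		else
			return 1;
	}
	return 0;
}

/* Simplified: 2 columns for listed wide/full-width codepoints, else 1. */
int
sketch_display_width(uint32_t ucs)
{
	size_t		n = sizeof(wide_ranges) / sizeof(wide_ranges[0]);

	return in_ranges(ucs, wide_ranges, n) ? 2 : 1;
}
```

With a table that stops at an old Unicode version, a codepoint such as U+1F600
would fall through to width 1, which is the kind of misalignment described
above.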

To fix and keep up to date, extend update-unicode to download the list
of wide and full-width codepoints from the official sources.
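
The extraction step can be pictured with the following sketch. It assumes the
input is the Unicode Character Database file EastAsianWidth.txt, whose data
lines have the form "1100..115F;W  # comment" or "3000;F  # comment". The
generator added by this commit is a Perl script; the C function below, and its
name parse_wide_line, are hypothetical and only illustrate selecting the W
(wide) and F (full-width) entries.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one data line.  Returns 1 and sets *first and *last if the line
 * describes a codepoint or range whose property is W or F; returns 0 for
 * comments, blank lines, and every other property value.
 */
static int
parse_wide_line(const char *line, uint32_t *first, uint32_t *last)
{
	unsigned int lo;
	unsigned int hi;
	char		prop[3];

	if (sscanf(line, "%x..%x ; %2[A-Za-z]", &lo, &hi, prop) == 3)
	{
		/* range form, e.g. "1100..115F;W" */
	}
	else if (sscanf(line, "%x ; %2[A-Za-z]", &lo, prop) == 2)
	{
		/* single codepoint form, e.g. "3000;F" */
		hi = lo;
	}
	else
		return 0;

	if (strcmp(prop, "W") != 0 && strcmp(prop, "F") != 0)
		return 0;

	*first = (uint32_t) lo;
	*last = (uint32_t) hi;
	return 1;
}
```

A generator along these lines would call the parser on each line of the file
and print the accepted ranges, in order, as entries of a C array.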

In passing, remove some comments about non-spacing characters that
haven't been accurate since we removed the former hardcoded logic.

Jacob Champion

Reported and reviewed by Pavel Stehule
Discussion:
https://www.postgresql.org/message-id/flat/CAFj8pRCeX21O69YHxmykYySYyprZAqrKWWg0KoGKdjgqcGyygg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bab982161e0590746a2fd2a03043b27108b23ac6

Modified Files
--------------
src/common/unicode/.gitignore                      |   1 +
src/common/unicode/Makefile                        |   9 +-
.../generate-unicode_east_asian_fw_table.pl        |  76 +++++++++++++
src/common/wchar.c                                 |  41 +++----
src/include/common/unicode_east_asian_fw_table.h   | 120 +++++++++++++++++++++
5 files changed, 220 insertions(+), 27 deletions(-)

