Re: BUG #19354: JOHAB rejects valid byte sequences - Mailing list pgsql-bugs

From Henson Choi
Subject Re: BUG #19354: JOHAB rejects valid byte sequences
Msg-id CAAAe_zAFz1v-3b7Je4L+=wZM3UGAczXV47YVZfZi9wbJxspxeA@mail.gmail.com
In response to Re: BUG #19354: JOHAB rejects valid byte sequences  (Henson Choi <assam258@gmail.com>)
Responses Re: BUG #19354: JOHAB rejects valid byte sequences
List pgsql-bugs
Subject: Fix and expand comments for Korean encodings in encnames.c

Hi hackers,

While reading through the encoding alias table in src/common/encnames.c,
I noticed a few long-standing inaccuracies and omissions in the per-entry
comments for the three Korean encodings.

The most visible issue is the JOHAB entry, whose comment describes it as
"Extended Unix Code for simplified Chinese", apparently a copy/paste
slip from a neighboring EUC entry.  JOHAB is in fact the Korean
combining (Johab) encoding defined in KS X 1001 annex 3; it is neither
an EUC variant nor Chinese.

The attached 0002 patch makes comment-only adjustments to the three
Korean encodings:

  * JOHAB: replace the incorrect "simplified Chinese" description with
    a correct one that identifies it as the Korean combining (Johab)
    encoding standardized in KS X 1001 annex 3.

  * EUC_KR: drop a stray space before the comma in the existing
    comment, and note that the encoding covers the KS X 1001
    precomposed (Wansung) form.

  * UHC: spell out "Unified Hangul Code", clarify that it is
    Microsoft Windows CodePage 949, and describe its relationship to
    EUC-KR (superset covering all 11,172 precomposed Hangul syllables).
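
For readers wondering where the 11,172 figure in the UHC item comes
from, here is a minimal sketch of the arithmetic.  It is not part of
the patch and touches nothing in PostgreSQL; the function and constant
names are made up for illustration.  It assumes the usual Unicode
composition rule for modern Hangul: 19 initial consonants, 21 medial
vowels, and 28 final slots (27 consonants plus "no final"), laid out
contiguously from U+AC00.

```c
#include <assert.h>

/* Jamo counts for modern Hangul (illustrative names, not PostgreSQL's). */
enum
{
	N_INITIAL = 19,				/* leading consonants */
	N_MEDIAL = 21,				/* vowels */
	N_FINAL = 28				/* trailing consonants, incl. "no final" */
};

/* First code point of the Unicode Hangul Syllables block. */
#define HANGUL_BASE 0xAC00u

/* Number of precomposed modern Hangul syllables: 19 * 21 * 28. */
unsigned
hangul_syllable_count(void)
{
	return N_INITIAL * N_MEDIAL * N_FINAL;
}

/*
 * Unicode code point of the syllable built from zero-based jamo indexes
 * (initial, medial, final).  A final index of 0 means "no final".
 */
unsigned
hangul_compose(unsigned initial, unsigned medial, unsigned final)
{
	assert(initial < N_INITIAL);
	assert(medial < N_MEDIAL);
	assert(final < N_FINAL);

	return HANGUL_BASE + (initial * N_MEDIAL + medial) * N_FINAL + final;
}
```

With these definitions, hangul_syllable_count() yields 11172, and the
first and last index triples map to U+AC00 and U+D7A3, the two ends of
the Unicode Hangul Syllables block.  KS X 1001 itself assigns code
points to only 2,350 of these syllables, which is the gap UHC fills.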

No behavior change, no catalog change, no pg_wchar.h change -- this
touches comments in src/common/encnames.c only.  pgindent is clean.

Thanks,
Henson Choi