
Re: BUG #19354: JOHAB rejects valid byte sequences - Mailing list pgsql-bugs

From	Tom Lane
Subject	Re: BUG #19354: JOHAB rejects valid byte sequences
Date	April 15 05:06:18
Msg-id	1910469.1776218778@sss.pgh.pa.us
In response to	Re: BUG #19354: JOHAB rejects valid byte sequences (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-bugs


Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Apr 15, 2026 at 1:20 PM Henson Choi <assam258@gmail.com> wrote:
>> I understand the appeal of simply deleting a dead-looking encoding,
>> and Thomas' removal patch is clean work.  However, Korean archival
>> data from the 1990s (government records, academic repositories, early
>> online corpora) does exist as JOHAB bytes; as a client encoding, JOHAB
>> in PostgreSQL provides a straightforward ingest path
>> (client_encoding=JOHAB, convert_from, then store as UTF-8).  Once
>> removed, that path closes with no obvious alternative short of
>> preprocessing outside PostgreSQL.  Fixing the verifier preserves the
>> capability at the cost of a ~30-line correction plus tests.

> The counterargument would be that you could use iconv
> --from-code=JOHAB ..., or libiconv, or the codecs available in Python,
> Java, etc. for dealing with historical archived data, something that
> data archivists must be very aware of.

Sure.  But it's not comfortable to remove a user-visible feature
we've had for decades.  My own primary concern about it was that a
correct fix could require non-backwards-compatible behavior changes.
Henson's analysis says that that's not a problem.  So assuming this
patch withstands review, I'd be much happier to see it applied than
to remove JOHAB.

No opinion at the moment about whether to back-patch.

            regards, tom lane
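
For reference, the out-of-database route Thomas mentions (the codecs
available in Python) might look roughly like the sketch below.  It
assumes Python's standard-library "johab" codec; the function name
johab_to_utf8 is hypothetical and does not come from this thread, and
that codec's idea of a valid byte sequence may not match PostgreSQL's
verifier exactly.

```python
def johab_to_utf8(raw: bytes) -> bytes:
    """Re-encode JOHAB bytes as UTF-8 (hypothetical helper, not from the thread).

    Uses Python's built-in "johab" codec.  With the default strict error
    handling, a byte sequence the codec rejects raises UnicodeDecodeError
    instead of being silently dropped or replaced.
    """
    text = raw.decode("johab")   # JOHAB bytes -> Python str
    return text.encode("utf-8")  # Python str -> UTF-8 bytes


if __name__ == "__main__":
    # Round-trip a short Hangul string to show the conversion path.
    sample = "한글".encode("johab")
    print(johab_to_utf8(sample).decode("utf-8"))
```

The resulting UTF-8 bytes could then be loaded into a UTF-8 database
with no server-side JOHAB support, which is the preprocessing step
Henson describes as the alternative to client_encoding=JOHAB.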

