pgsql: Change type "char"'s I/O format for non-ASCII characters. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Change type "char"'s I/O format for non-ASCII characters.
Date
Msg-id E1oIsu3-002Pg0-Ms@gemulon.postgresql.org
List pgsql-committers
Change type "char"'s I/O format for non-ASCII characters.

Previously, a byte with the high bit set was just transmitted
as-is by charin() and charout().  This is problematic if the
database encoding is multibyte, because the result of charout()
won't be validly encoded, which breaks various code that
expects all text strings to be validly encoded.  We've
previously decided to enforce encoding validity rather than try
to individually harden each place that might have a problem with
such strings, so it's time to do something about "char".

To fix, represent high-bit-set characters as \ooo (backslash
and three octal digits), following the ancient "escape" format
for bytea.  charin() will continue to accept the old way as well,
though that is only reachable in single-byte encodings.
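
A minimal standalone sketch of the escaping scheme described above may help make it concrete. This is not the code committed to char.c: the function names sketch_char_out() and sketch_char_in() are hypothetical, there is no fmgr glue, and limiting the first octal digit to 0-3 (so the value fits in one byte) is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not the code in char.c. */

int
is_octal_digit(char c)
{
    return c >= '0' && c <= '7';
}

/*
 * Output side: a byte with the high bit set becomes backslash plus three
 * octal digits; any other byte is emitted as-is.  "out" must have room
 * for 5 bytes (4 characters plus the terminating NUL).
 */
void
sketch_char_out(char ch, char out[5])
{
    unsigned char uc = (unsigned char) ch;

    assert(out != NULL);
    if (uc & 0x80)
    {
        out[0] = '\\';
        out[1] = (char) ('0' + (uc >> 6));
        out[2] = (char) ('0' + ((uc >> 3) & 7));
        out[3] = (char) ('0' + (uc & 7));
        out[4] = '\0';
    }
    else
    {
        out[0] = ch;
        out[1] = '\0';
    }
}

/*
 * Input side: accept the \ooo form, otherwise fall back to the old
 * behavior of taking the first byte (an empty string yields 0).
 * Assumption of this sketch: the first digit must be 0-3.
 */
char
sketch_char_in(const char *in)
{
    assert(in != NULL);
    if (strlen(in) == 4 && in[0] == '\\' &&
        in[1] >= '0' && in[1] <= '3' &&
        is_octal_digit(in[2]) && is_octal_digit(in[3]))
    {
        return (char) (((in[1] - '0') << 6) |
                       ((in[2] - '0') << 3) |
                       (in[3] - '0'));
    }
    return in[0];
}
```

In this sketch the byte 0xE9 is written out as the four ASCII characters \351 and read back to the same byte, while a raw 0xE9 byte on input is still accepted through the fallback path.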

Add some test cases just so there is coverage for this code.
We'll otherwise leave this question undocumented as it was before,
because we don't really want to encourage end-user use of "char".

For the moment, back-patch into v15 so that this change appears
in 15beta3.  If there's no significant pushback, we should consider
absorbing this change into the older branches.

Discussion: https://postgr.es/m/2318797.1638558730@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/ec62ce55a813db5c925d89a53b5b22baa509abb6

Modified Files
--------------
doc/src/sgml/datatype.sgml           | 10 +++--
src/backend/utils/adt/char.c         | 72 ++++++++++++++++++++++++++++--------
src/test/regress/expected/char.out   | 63 ++++++++++++++++++++++++++++++-
src/test/regress/expected/char_1.out | 63 ++++++++++++++++++++++++++++++-
src/test/regress/expected/char_2.out | 63 ++++++++++++++++++++++++++++++-
src/test/regress/sql/char.sql        | 20 +++++++++-
6 files changed, 263 insertions(+), 28 deletions(-)

