Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Thu, 22 Apr 2021 23:17:19 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
>> Doesn't seem like a good idea, because that locks us into an assumption
>> that the downcasing conversion doesn't change the string's physical
>> length. There are a lot of counterexamples to that :-(. I'm not sure
> Mmm. I didn't know of that.
The two examples I know of offhand are in German (eszett "ß" downcases to
"ss") and Turkish (dotted "İ" downcases to "i", likewise dotless "I"
downcases to "ı"; one of each of those pairs is an ASCII letter, the
other is not). Depending on which encoding is in use, these
transformations *could* be the same number of bytes, but they could
equally well not be. There are probably other examples.
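
To make the byte-count point concrete, here is a rough sketch in Python (not
anything from the PostgreSQL tree). The pair table and the helper name
byte_lengths are made up for illustration; the pairs are hard-coded from the
examples above rather than produced by a case-folding call, since Python's
str.lower() is not locale-aware and would not apply the Turkish rules.

```python
# Illustrative only: compare the encoded size of each side of a case pair.
# PAIRS and byte_lengths are hypothetical names, not part of any library.
PAIRS = [
    ("\u0130", "i"),   # Turkish dotted capital I paired with ASCII i
    ("I", "\u0131"),   # ASCII I paired with Turkish dotless i
    ("\u00df", "ss"),  # eszett paired with "ss", as in the text above
]


def byte_lengths(src, dst, encoding):
    """Return (bytes in src, bytes in dst) when both are encoded."""
    return len(src.encode(encoding)), len(dst.encode(encoding))


if __name__ == "__main__":
    # ISO-8859-9 is the single-byte Latin-5 (Turkish) character set.
    for encoding in ("utf-8", "iso8859_9"):
        for src, dst in PAIRS:
            before, after = byte_lengths(src, dst, encoding)
            print(encoding, repr(src), "->", repr(dst), before, after)
```

Under these assumptions the Turkish pairs differ in length in UTF-8 but are one
byte each in ISO-8859-9, while the eszett pair is two bytes on both sides in
UTF-8 and grows from one byte to two in the single-byte encoding.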
regards, tom lane