Re: Support tab completion for upper character inputs in psql - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Support tab completion for upper character inputs in psql
Msg-id 20210423.144443.2058612313278551429.horikyota.ntt@gmail.com
In response to Re: Support tab completion for upper character inputs in psql  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
FWIW...

At Fri, 23 Apr 2021 00:17:35 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > At Thu, 22 Apr 2021 23:17:19 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> >> Doesn't seem like a good idea, because that locks us into an assumption
> >> that the downcasing conversion doesn't change the string's physical
> >> length.  There are a lot of counterexamples to that :-(.  I'm not sure
> 
> > Mmm. I didn't know of that.
> 
> The two examples I know of offhand are in German (eszett "ß" downcases to
> "ss") and Turkish (dotted "İ" downcases to "i", likewise dotless "I"

According to Wikipedia, "ss" is equivalent to "ß", and their upper-case
forms are "SS" and "ẞ" respectively. (I didn't even know that "ẞ"
existed. AFAIK no word begins with eszett, but it seems that "ẞ" can
appear when a word is spelled entirely in capital letters.)
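
As a quick check of the eszett case, here is a minimal Python sketch. It
relies on Python's built-in str.upper()/str.lower(), which apply the
locale-independent Unicode default mappings; that is an assumption of
this illustration and says nothing about what psql or the server's
locale-aware functions would do. The helper names utf8_len and
case_report are made up for the example.

```python
def utf8_len(s: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def case_report(src: str, dst: str) -> str:
    """Describe a case conversion together with both UTF-8 byte lengths."""
    return f"{src!r} ({utf8_len(src)} bytes) -> {dst!r} ({utf8_len(dst)} bytes)"


# Lower-case eszett (U+00DF) upcased with the Unicode default mapping.
upper_of_eszett = "\u00df".upper()
# Capital eszett (U+1E9E) downcased with the Unicode default mapping.
lower_of_capital_eszett = "\u1e9e".lower()

print(case_report("\u00df", upper_of_eszett))
print(case_report("\u1e9e", lower_of_capital_eszett))
```

With these mappings the capital eszett takes three bytes in UTF-8 while
its lower-case form takes two, so even this single-character conversion
changes the physical length.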

> downcases to "ı"; one of each of those pairs is an ASCII letter, the
> other is not).  Depending on which encoding is in use, these

Upper-case dotless "I" and lower-case dotted "i" are the ones in ASCII
(or the English alphabet?).  That's interesting.

> transformations *could* be the same number of bytes, but they could
> equally well not be.  There are probably other examples.

Yeah. Agreed.
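
To make the encoding dependence concrete, here is a small Python sketch
that compares the byte lengths of the two Turkish pairs in UTF-8 and in
ISO-8859-9. The pairs are hard-coded rather than produced by a case
conversion call, because Python's str.lower() is not Turkish-aware; the
names TURKISH_PAIRS and byte_lengths are made up for the example, and
the choice of the two encodings is mine.

```python
# Turkish case pairs from the discussion, written as (upper, lower).
# U+0130 is dotted capital I; U+0131 is dotless small i.
TURKISH_PAIRS = [("\u0130", "i"), ("I", "\u0131")]


def byte_lengths(upper: str, lower: str, encoding: str) -> tuple[int, int]:
    """Return the encoded byte lengths of the upper and lower forms."""
    return len(upper.encode(encoding)), len(lower.encode(encoding))


for encoding in ("utf-8", "iso8859_9"):
    for upper, lower in TURKISH_PAIRS:
        up_len, low_len = byte_lengths(upper, lower, encoding)
        print(f"{encoding}: {upper!r} = {up_len} byte(s), {lower!r} = {low_len} byte(s)")
```

In UTF-8 the two halves of each pair differ in length, while in the
single-byte ISO-8859-9 encoding they do not, which matches the point
that the transformations could be the same number of bytes but could
equally well not be.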

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
