On Thu, Nov 21, 2024 at 02:35:50PM +0000, Bertrand Drouvot wrote:
> On Thu, Nov 21, 2024 at 09:21:16AM -0500, Bruce Momjian wrote:
> > I don't understand this logic. Why are two bytes important? If we
> > knew it was UTF8 we could check that non-first bytes always start
> > with bits 10, but we can't know that.
>
> I think it's because this is a reliable way to detect whether the
> truncation happened in the middle of a character, without needing to
> know the specifics of the encoding.
>
> My understanding is that the key insight is that, in any multibyte
> encoding, all bytes within a multibyte character have their high bit
> set.
>
> That's just my understanding from the code and Tom's previous
> explanations; I might be wrong, as I'm not an expert in this area.
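
For reference, a minimal standalone sketch (not PostgreSQL source; the
example string is mine) of the property being described: in UTF-8,
every byte of a multibyte character has its high bit set, so a byte
with the high bit clear is always a complete single-byte ASCII
character:

    #include <stdio.h>

    int
    main(void)
    {
        /* "aé": 'a' (0x61), then U+00E9 encoded as 0xC3 0xA9 */
        const unsigned char s[] = "a\xC3\xA9";

        for (int i = 0; s[i] != '\0'; i++)
            printf("byte 0x%02X: high bit %s\n",
                   s[i], (s[i] & 0x80) ? "set" : "clear");
        return 0;
    }

This prints "clear" only for the ASCII byte, which is why the trailing
high-bit bytes are the suspect region after a blind byte-wise
truncation.
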
But the logic doesn't make sense. Why would two bytes be any different
from one? I assumed you would just remove all trailing high-bit bytes
and stop at the first non-high-bit byte. Also, do we really expect
there to be trailing multi-byte characters and then some ASCII before
them? Isn't it likely the string will be all ASCII or all multi-byte
characters? I guess for Latin1 it would work fine, but I assume for
Asian languages it will be almost all multi-byte characters, with
digits as the only likely ASCII. This all just seems very unfocused.
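
To spell out what I mean, here is a hedged sketch of that simpler
approach (not code from the patch under discussion; the function name
is made up):

    #include <string.h>

    /*
     * After a blind byte-wise truncation, drop every trailing byte
     * whose high bit is set, stopping at the first high-bit-clear
     * byte.  This guarantees we never end mid-character, at the cost
     * of also dropping complete multi-byte characters at the end.
     */
    static void
    strip_trailing_highbit_bytes(char *buf)
    {
        size_t len = strlen(buf);

        while (len > 0 && ((unsigned char) buf[len - 1] & 0x80))
            len--;
        buf[len] = '\0';
    }

The caveat in the comment is exactly my worry: for text that is almost
all multi-byte characters, this strips whole characters, not just a
partial one, so it is unclear what stopping after two bytes buys us.
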
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
When a patient asks the doctor, "Am I going to die?", he means
"Am I going to die soon?"