On Sun, Nov 17, 2024 at 01:00:14PM -0500, Tom Lane wrote:
> As said, the difficulty is that we don't know what encoding the
> incoming name is meant to be in, and with multibyte encodings that
> matters. The name actually stored in the catalog might be less
> than 63 bytes long if it was truncated in a multibyte-aware way,
> so that the former behavior of blindly truncating at 63 bytes
> can still yield unexpected no-such-database results.
>
> I can imagine still performing the truncation if the incoming
> name is all-ASCII, but that seems like a hack. The world isn't
> nearly as ASCII-centric as it was in 2001.

I wonder if we should consider removing the identifier truncation
altogether. Granted, it mostly works (or at least did before v17), but I'm
not sure we'd make the same decision today if we were starting from
scratch. IMHO it'd be better to ERROR so that users are forced to produce
legal identifiers. That being said, I realize this behavior has been
present for over a quarter century now [0] [1] [2], and folks are unlikely
to be happy with even more breakage.
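
To make the mismatch Tom describes concrete, here is a rough Python sketch
(not PostgreSQL's actual C code). It assumes UTF-8 and the default
NAMEDATALEN of 64; truncate_blind and truncate_mb_aware are hypothetical
helper names made up for illustration.

```python
NAMEDATALEN = 64              # PostgreSQL's default; assumed here
LIMIT = NAMEDATALEN - 1       # identifiers hold at most 63 bytes


def truncate_blind(raw: bytes, limit: int = LIMIT) -> bytes:
    """Cut at a fixed byte count, ignoring character boundaries."""
    return raw[:limit]


def truncate_mb_aware(name: str, encoding: str = "utf-8",
                      limit: int = LIMIT) -> bytes:
    """Cut at the last whole character that fits within the byte limit."""
    raw = name.encode(encoding)
    cut = min(len(raw), limit)
    while cut > 0:
        try:
            raw[:cut].decode(encoding)   # succeeds only on a clean boundary
            return raw[:cut]
        except UnicodeDecodeError:
            cut -= 1                     # back off one byte and retry
    return b""


# 62 ASCII bytes, then a 2-byte character straddling the 63-byte limit.
name = "a" * 62 + "\u00e9" + "b"
stored = truncate_mb_aware(name)              # what a multibyte-aware cut keeps
lookup = truncate_blind(name.encode("utf-8")) # what a blind 63-byte cut keeps
print(len(stored), len(lookup), stored == lookup)
```

In this example the multibyte-aware cut keeps 62 bytes while the blind cut
keeps 63 (ending in half of a character), so a lookup using the blindly
truncated name would not match the stored one.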

[0] https://postgr.es/c/d15c37c
[1] https://postgr.es/c/0672a3c
[2] https://postgr.es/c/49581f9
--
nathan