On Tue, Nov 19, 2024 at 06:09:44PM -0500, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
> > On Tue, Nov 19, 2024 at 02:33:27PM -0500, Tom Lane wrote:
> >> I did think of a way that we could approximate encoding-correct
> >> truncation here, relying on the fact that what's in pg_database
> >> is encoding-correct according to somebody:
> >> ...
>
> > That's an interesting idea. That code would probably need to live in
> > GetDatabaseTuple(), but it seems doable. We might be able to avoid the
> > "mighty improbable" case by always truncating up to
> > MAX_MULTIBYTE_CHAR_LEN-1 times and failing if there are multiple matches,
> > too.
>
> Hmm ... but with short characters (e.g. LATIN1) there might be
> legitimately-different names that that rule would complain about.
Could we rely on pg_encoding_max_length() instead of MAX_MULTIBYTE_CHAR_LEN?
That would then work for short characters too, IIUC.
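
To illustrate, here is a rough sketch (not a patch) of how the set of
truncation lengths to try could depend on the encoding's maximum character
length. The helper name and SKETCH_NAMEDATALEN are made up for this example;
64 is only the default build value of NAMEDATALEN, and max_char_len stands in
for the result of pg_encoding_max_length():

```c
#include <stddef.h>

/* Assumption: default build value of NAMEDATALEN, redefined for the sketch. */
#define SKETCH_NAMEDATALEN 64

/*
 * Hypothetical helper: fill "lens" with the byte lengths to try when looking
 * up an overlong name.  A character of at most max_char_len bytes that is cut
 * at the NAMEDATALEN-1 limit leaves at most max_char_len-1 stray bytes, so we
 * try dropping 0 .. max_char_len-1 extra bytes.
 *
 * max_char_len stands in for pg_encoding_max_length(encoding).
 * Returns the number of candidate lengths written.
 */
static int
candidate_trunc_lengths(int max_char_len, int *lens, int lens_size)
{
	int			n = 0;

	for (int drop = 0; drop < max_char_len && n < lens_size; drop++)
		lens[n++] = SKETCH_NAMEDATALEN - 1 - drop;

	return n;
}
```

With a single-byte encoding such as LATIN1 (max_char_len = 1) this yields
only the plain NAMEDATALEN-1 truncation, so two legitimately different names
would not be flagged as ambiguous.
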
> Also, we could bypass the multiple lookups unless both the
> NAMEDATALEN-1'th and NAMEDATALEN-2'th bytes are non-ASCII, which
> should be rare enough to make it not much of a performance issue.
>
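
A minimal sketch of that bypass check, under the assumption that the two
bytes meant are those at 0-based offsets NAMEDATALEN-1 and NAMEDATALEN-2
(the first dropped byte and the last kept byte). The function and macro
names are hypothetical, and 64 is only the default NAMEDATALEN:

```c
#include <stdbool.h>
#include <string.h>

/* Assumption: default build value of NAMEDATALEN, redefined for the sketch. */
#define SKETCH_NAMEDATALEN 64

/* True if the byte is non-ASCII (high bit set). */
#define SKETCH_IS_HIGHBIT_SET(ch) (((unsigned char) (ch) & 0x80) != 0)

/*
 * Hypothetical check: return true only when the name is too long AND the
 * bytes on both sides of the truncation point are non-ASCII, i.e. when the
 * cut might land in the middle of a multibyte character.  In every other
 * case a single lookup of the plainly truncated name suffices.
 */
static bool
needs_multiple_lookups(const char *name)
{
	if (strlen(name) < SKETCH_NAMEDATALEN)
		return false;			/* name fits, nothing is truncated */

	return SKETCH_IS_HIGHBIT_SET(name[SKETCH_NAMEDATALEN - 1]) &&
		SKETCH_IS_HIGHBIT_SET(name[SKETCH_NAMEDATALEN - 2]);
}
```

An overlong all-ASCII name never takes the slow path here, which is what
should keep the extra lookups rare.
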
> One annoying point is that we also need this for role lookup.
Yeah. Role existence is checked before database existence. And even if
that were not the case, it would make little sense to rely on the
pg_encoding_max_length() of the matching database for the role, as roles are
global.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com