Re: [18] Unintentional behavior change in commit e9931bfb75 - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: [18] Unintentional behavior change in commit e9931bfb75
Date
Msg-id a57ab007676c2764799eb3cfff18d78578298674.camel@j-davis.com
In response to Re: [18] Unintentional behavior change in commit e9931bfb75  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, 2024-12-04 at 12:21 -0500, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
> > On 02.12.24 23:25, Tom Lane wrote:
> > > Well, also for compatibility with our SQL parser's understanding
> > > of identifier lowercasing.
>
> > Maybe that was relevant before the "name" type got its own
> > collation?
>
> downcase_identifier doesn't give a fig about name's collation.

I'd like to understand better the relationship between the parser's
casefolding, object names in the catalog, the default collation, and
the default ctype.

The comment in downcase_identifier() says: "SQL99 specifies Unicode-
aware case normalization, which we don't yet have the infrastructure
for". The good news is that, with

https://commitfest.postgresql.org/51/5436/

we hopefully will have the infrastructure in place soon.

  (a) Do we want to use that infrastructure?
  (b) Do we want to use the default collation to provide the case
folding behavior, or do we want to always use the Unicode behavior
that's built into Postgres?

I'm not quite sure where the single-byte encoding special case fits in
or how it helps. It seems like what it's really doing is locking the
database-wide ctype behavior down to either libc "C" or libc with any
non-Turkish-related locale. The reasoning behind this might answer
question (b) above, but I'm not sure which direction we should be
moving in.
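
For readers who don't have the code at hand, here is a rough standalone
sketch of the special case being discussed. It is an approximation for
illustration, not the code in the tree: the name
sketch_downcase_identifier, the enc_is_single_byte parameter (standing in
for a check of the database encoding), and the caller-supplied output
buffer are all assumptions made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical approximation of the identifier downcasing rule.
 * enc_is_single_byte is an assumed stand-in for "the database encoding
 * is single-byte". dst must have room for len + 1 bytes.
 */
static void
sketch_downcase_identifier(const char *src, size_t len,
                           bool enc_is_single_byte, char *dst)
{
    size_t      i;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) src[i];

        if (ch >= 'A' && ch <= 'Z')
        {
            /* 7-bit ASCII: fixed mapping, the locale is never consulted */
            ch += 'a' - 'A';
        }
        else if (enc_is_single_byte && (ch & 0x80) != 0 && isupper(ch))
        {
            /* high-bit byte in a single-byte encoding: libc ctype decides */
            ch = (unsigned char) tolower(ch);
        }
        /* anything else, including bytes of multibyte characters, is kept */

        dst[i] = (char) ch;
    }
    dst[len] = '\0';
}
```

Under these assumptions, ASCII 'I' always becomes 'i' regardless of the
locale, which is why a Turkish locale cannot affect it. High-bit bytes
change only in a single-byte encoding whose libc locale classifies them as
uppercase, and in a multibyte encoding such as UTF-8 nothing outside ASCII
is folded at all.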

Thoughts?

Regards,
    Jeff Davis



