Re: Multi-byte character case-folding - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Multi-byte character case-folding
Date
Msg-id 1479731.1594081942@sss.pgh.pa.us
In response to Re: Multi-byte character case-folding  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Multi-byte character case-folding
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-06, Tom Lane wrote:
>> More generally, I'd be mighty hesitant to change this behavior after
>> it's stood for so many years.  I suspect more people would complain
>> that we broke their application than would be happy about it.

> I think the fact that identifiers fail to follow language-specific case
> folding rules is more a known gotcha than a desired property, but on
> principle I tend to agree that Turkish people would not be happy about
> the prospect of us changing the downcasing rule in a major release -- it
> would mean having to edit any affected application code as part of a
> pg_upgrade process, which is not great.

It's not just the Turks.  As near as I can tell, we'd likely break *every*
app that's using such identifiers.  For example, supposing I do

test=# create table MYÉCLASS (f1 text);
CREATE TABLE
test=# \dt
          List of relations
 Schema |   Name   | Type  |  Owner   
--------+----------+-------+----------
 public | myÉclass | table | postgres
(1 row)

pg_dump will render this as

CREATE TABLE public."myÉclass" (
    f1 text
);

If we start to case-fold É, then the only way to access this table will
be by double-quoting its name, which the application probably is not
expecting (else it would have double-quoted in the original CREATE TABLE).
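To make the failure mode concrete, here is a minimal Python sketch, not PostgreSQL's actual code. It assumes the current rule in a multibyte encoding amounts to downcasing only A-Z, and it models the proposed rule with Python's str.lower(); both function names are hypothetical.

```python
# Illustrative model only: these helpers are hypothetical and do not
# reproduce the server's implementation.

def fold_ascii_only(identifier: str) -> str:
    """Downcase A-Z and leave every non-ASCII character untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in identifier
    )


def fold_unicode(identifier: str) -> str:
    """Assumed alternative rule: downcase using Unicode case mappings."""
    return identifier.lower()


# "\u00c9" is the precomposed E-acute used in the table name above.
unquoted_name = "MY\u00c9CLASS"

# The name the catalog holds after the original CREATE TABLE.
stored_name = fold_ascii_only(unquoted_name)

# The application keeps sending the same unquoted name.
lookup_today = fold_ascii_only(unquoted_name)
lookup_after_change = fold_unicode(unquoted_name)

print(stored_name == lookup_today)         # True: the table is found
print(stored_name == lookup_after_change)  # False: the lookup misses
```

Under the assumed new rule the unquoted name folds to a different string than the one already stored, so only the double-quoted form that pg_dump emits would still match.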

> Now you could say that this can be fixed by adding a GUC that preserves
> the old behavior, but generally we don't like that too much.

Yes, a GUC changing this would be a headache.  It would be just as much of
a compatibility and security hazard as standard_conforming_strings (which,
indeed, I've been thinking of proposing we get rid of; it's hung around
long enough).

> The counter argument is that there are more future users than there are
> current users.

Especially if we drive away the current users :-(.  In practice, we've
heard very few complaints about this, so my gut says to leave it alone.

            regards, tom lane


