Re: encoding affects ICU regex character classification - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: encoding affects ICU regex character classification
Date
Msg-id 3a86ea75efc0a7dd1b040d3358356c901a9c154a.camel@j-davis.com
In response to Re: encoding affects ICU regex character classification  (Jeremy Schneider <schneider@ardentperf.com>)
List pgsql-hackers
On Fri, 2023-12-15 at 16:48 -0800, Jeremy Schneider wrote:
> This goes back to my other thread (which sadly got very little
> discussion): PostgreSQL really needs to be safe by /default/

Doesn't a built-in provider help create a safer option?

The built-in provider's version of Unicode will be consistent with
unicode_assigned(), which is a first step toward rejecting code points
that the provider doesn't understand. And by rejecting unassigned code
points, we get all kinds of Unicode compatibility guarantees that avoid
the kinds of change risks that you are worried about.
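
To make the "reject unassigned code points" idea concrete, here is a rough
sketch in Python. It is not PostgreSQL's implementation: it assumes that
"unassigned" means general category Cn, and it relies on whatever Unicode
version the local Python build ships, which may differ from the one the
built-in provider would use. The helper names is_assigned() and
reject_unassigned() are made up for illustration.

```python
import unicodedata


def is_assigned(text: str) -> bool:
    """Return True if no character in text has general category Cn.

    Assumption: "unassigned" is modeled as category Cn, as reported by the
    Unicode tables bundled with this Python build.
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)


def reject_unassigned(text: str) -> str:
    """Return text unchanged, or raise if it contains an unassigned code point."""
    if not is_assigned(text):
        raise ValueError("text contains unassigned code points")
    return text


if __name__ == "__main__":
    print(unicodedata.unidata_version)  # Unicode version of the local tables
    print(is_assigned("abc"))
    print(is_assigned("\u0378"))  # U+0378 is a long-standing gap in the Greek block
```

The point of checking at input time is that the stored data then only
contains code points whose properties are already fixed in that Unicode
version, so a later version cannot change how they are classified.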

Regards,
    Jeff Davis



