Re: ICU locale validation / canonicalization - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: ICU locale validation / canonicalization
Date
Msg-id 4efdd7bdf9c811e1a32b8d10524bae8040367068.camel@j-davis.com
Whole thread Raw
In response to Re: ICU locale validation / canonicalization  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: ICU locale validation / canonicalization
Re: ICU locale validation / canonicalization
List pgsql-hackers
On Thu, 2023-03-23 at 07:27 +0100, Peter Eisentraut wrote:
> So, does uloc_canonicalize() always convert to ICU locale IDs?  What
> if
> you pass a language tag, does it convert it to ICU locale ID as well?

Yes.

The documentation is not clear on that point, but my testing shows that
it does. And this is only for old versions of the code, so we don't
need to worry about later versions of ICU changing that.

I thought about using uloc_forLanguageTag(), but the documentation for
that is not clear what formats it accepts as an input, so it doesn't
seem like a win. If wanted to be paranoid we could use
uloc_toLanguageTag() followed by uloc_forLanguageTag(), but that seemed
excessive.

> >
> 0002 and 0003 look ok to me now.

Thank you, committed 0002 and 0003.

> In 0002, the error "opening default collator is not supported",
> should
> that be an assert or an elog?  Is it reachable by the user?

It's not reachable by the user, but could catch a bug if we
accidentally read a NULL field from the catalog or something like that.
It seemed a worthwhile check to leave in production builds.

> You might want to check the declarations at the top of
> pg_ucol_open().
> 0003 reformats them after they were just added in 0002.  Maybe check
> that they are pgindent'ed in 0002 properly.

They seem to be pgindented fine in 0002, it was unnecessarily
reindented in 0003 and I fixed that.

I use emacs "align-current" and generally that does the right thing,
but I'll rely more on pgindent in the future.

> I don't understand patch 0004.  It seems to do two things, handle
> C/POSIX locale specifications and add an SQL-callable function.  Are
> those connected?

It's hard to test (or even exercise) the former without the latter.

I could get rid of the SQL-callable function and move the rest of the
changes into 0006. I'll see if that arrangement works better, and that
way we can add the SQL-callable function later (or perhaps not at all
if it's not desired).

Regards,
    Jeff Davis




pgsql-hackers by date:

Previous
From: Onur Tirtir
Date:
Subject: RE: [EXTERNAL] Re: [PATCH] Report the query string that caused a memory error under Valgrind
Next
From: Bharath Rupireddy
Date:
Subject: Re: Add pg_walinspect function with block info columns