Thread: check_strxfrm_bug()

check_strxfrm_bug()

From
Thomas Munro
Date:
Hi

While studying Jeff's new crop of collation patches I noticed in
passing that check_strxfrm_bug() must surely by now be unnecessary.
The buffer overrun bugs were fixed a decade ago, and the relevant
systems are way out of support.  If you're worried that the bugs might
come back, then the test is insufficient: modern versions of both OSes
have strxfrm_l(), which we aren't checking.  In any case, we also
completely disable this stuff because of bugs and quality problems in
every other known implementation, via TRUST_STRXFRM (or rather the
lack of it).  So I think it's time to remove that function; please see
attached.

Just by the way, if you like slow motion domino runs, check this out:

* Original pgsql-bugs investigation into strxfrm() inconsistencies
  https://www.postgresql.org/message-id/flat/111D0E27-A8F3-4A84-A4E0-B0FB703863DF@s24.com

* I happened to test that on bleeding-edge FreeBSD 11 (wasn't released
yet), because at that time FreeBSD was in the process of adopting
illumos's new collation code, and reported teething problems:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208266

* FreeBSD, DragonFly and illumos's trees were then partially fixed by
the authors, but our strcolltest.c still showed some remaining
problems in some locales (and it still does on my local FreeBSD
battlestation):
  https://github.com/freebsd/freebsd-src/commit/c48dc2a193b9befceda8dfc6f894d73251cc00a4
  https://www.illumos.org/rb/r/402/

* The authors traced the remaining problem to flaws in the Unicode
project's CLDR/POSIX data, and the report was accepted:
  https://www.illumos.org/issues/7962
  https://unicode-org.atlassian.net/browse/CLDR-10394

Eventually that'll be fixed, and (I guess) trigger at least a CLDR
minor version bump affecting all downstream consumers (ICU, ...).
Then... maybe... at least FreeBSD will finally pass that test.  I do
wonder whether other consumer libraries are also confused by that
problem source data, and if not, why not; are glibc's problems related
or just random code or data quality problems in different areas?  (I
also don't know why a problem in that data should affect strxfrm() and
strcoll() differently, but I don't plan to find out owing to an acute
shortage of round tuits).

But in the meantime, I still can't recommend turning on TRUST_STRXFRM
on any OS that I know of!  The strcolltest.c program certainly still
finds fault with glibc 2.36 despite the last update on that redhat
bugzilla ticket that suggested that the big resync back in 2.28 was
going to fix it.

To be fair, macOS does actually pass that test for all locales, but
the strxfrm() result is too narrow to be useful, according to comments
in our tree.  I would guess that a couple of other OSes with the old
Berkeley locale code are similar.

Attachment

Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Thu, Dec 15, 2022 at 3:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> ... If you're worried that the bugs might
> come back, then the test is insufficient: modern versions of both OSes
> have strxfrm_l(), which we aren't checking.

With my garbage collector hat on, that made me wonder if there was
some more potential cleanup here: could we require locale_t yet?  The
last straggler systems on our target OS list to add the POSIX locale_t
stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
it's still too soon: we have two EOL'd OSes in the farm that are older
than that.  But here's an interesting fact about wrasse, assuming its
host is gcc211: it looks like it can't even apply further OS updates
because the hardware[1] is so old that Solaris doesn't support it
anymore[2].

[1] https://cfarm.tetaneutral.net/machines/list/
[2] https://support.oracle.com/knowledge/Sun%20Microsystems/2382427_1.html



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sun, Dec 18, 2022 at 10:27 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> With my garbage collector hat on, that made me wonder if there was
> some more potential cleanup here: could we require locale_t yet?  The
> last straggler systems on our target OS list to add the POSIX locale_t
> stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
> it's still too soon: we have two EOL'd OSes in the farm that are older
> than that.  But here's an interesting fact about wrasse, assuming its
> host is gcc211: it looks like it can't even apply further OS updates
> because the hardware[1] is so old that Solaris doesn't support it
> anymore[2].

For the record, now the OpenBSD machines have been upgraded, so now
"wrasse" is the last relevant computer on earth with no POSIX
locale_t.  Unfortunately there is no reason to think it's going to go
away soon, so I'm just noting this fact here as a reminder for when it
eventually does...



Re: check_strxfrm_bug()

From
Nathan Bossart
Date:
On Thu, Dec 15, 2022 at 03:22:59PM +1300, Thomas Munro wrote:
> While studying Jeff's new crop of collation patches I noticed in
> passing that check_strxfrm_bug() must surely by now be unnecessary.
> The buffer overrun bugs were fixed a decade ago, and the relevant
> systems are way out of support.  If you're worried that the bugs might
> come back, then the test is insufficient: modern versions of both OSes
> have strxfrm_l(), which we aren't checking.  In any case, we also
> completely disable this stuff because of bugs and quality problems in
> every other known implementation, via TRUST_STRXFRM (or rather the
> lack of it).  So I think it's time to remove that function; please see
> attached.

Seems reasonable to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: check_strxfrm_bug()

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Thu, Dec 15, 2022 at 03:22:59PM +1300, Thomas Munro wrote:
>> While studying Jeff's new crop of collation patches I noticed in
>> passing that check_strxfrm_bug() must surely by now be unnecessary.
>> The buffer overrun bugs were fixed a decade ago, and the relevant
>> systems are way out of support.  If you're worried that the bugs might
>> come back, then the test is insufficient: modern versions of both OSes
>> have strxfrm_l(), which we aren't checking.  In any case, we also
>> completely disable this stuff because of bugs and quality problems in
>> every other known implementation, via TRUST_STRXFRM (or rather the
>> lack of it).  So I think it's time to remove that function; please see
>> attached.

> Seems reasonable to me.

+1.  I wonder if we should go further and get rid of TRUST_STRXFRM
and the not-so-trivial amount of code around it (pg_strxfrm_enabled
etc).  Carrying that indefinitely in the probably-vain hope that
the libraries will become trustworthy seems rather pointless.
Besides, if such a miracle does occur, we can dig the code out
of our git history.

            regards, tom lane



Re: check_strxfrm_bug()

From
Peter Geoghegan
Date:
On Mon, Apr 17, 2023 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> +1.  I wonder if we should go further and get rid of TRUST_STRXFRM
> and the not-so-trivial amount of code around it (pg_strxfrm_enabled
> etc).  Carrying that indefinitely in the probably-vain hope that
> the libraries will become trustworthy seems rather pointless.
> Besides, if such a miracle does occur, we can dig the code out
> of our git history.

+1 for getting rid of TRUST_STRXFRM.

ICU-based collations (which aren't affected by TRUST_STRXFRM) are
becoming the de facto standard (possibly even the de jure standard).
So even if we thought that the situation with strxfrm() had improved,
we'd still have little motivation to do anything about it.

--
Peter Geoghegan



Re: check_strxfrm_bug()

From
Michael Paquier
Date:
On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 17, 2023 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> +1.  I wonder if we should go further and get rid of TRUST_STRXFRM
>> and the not-so-trivial amount of code around it (pg_strxfrm_enabled
>> etc).  Carrying that indefinitely in the probably-vain hope that
>> the libraries will become trustworthy seems rather pointless.
>> Besides, if such a miracle does occur, we can dig the code out
>> of our git history.
>
> +1 for getting rid of TRUST_STRXFRM.
>
> ICU-based collations (which aren't affected by TRUST_STRXFRM) are
> becoming the de facto standard (possibly even the de jure standard).
> So even if we thought that the situation with strxfrm() had improved,
> we'd still have little motivation to do anything about it.

Makes sense to do some cleanup now as this is new in the tree.
Perhaps somebody from the RMT would like to comment?

FYI, Jeff has also posted patches to replace this CFLAGS with a GUC:
https://www.postgresql.org/message-id/6ec4ad7f93f255dbb885da0a664d9c77ed4872c4.camel@j-davis.com
--
Michael

Attachment

Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
> > +1 for getting rid of TRUST_STRXFRM.

+1

The situation is not improving fast, and requires hard work to follow
on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
support will always have niche uses, to be compatible with other
software on the box, but trusting strxfrm doesn't seem to be on the
cards any time soon.



Re: check_strxfrm_bug()

From
"Jonathan S. Katz"
Date:
On 4/18/23 9:19 PM, Thomas Munro wrote:
> On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier <michael@paquier.xyz> wrote:
>> On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
>>> +1 for getting rid of TRUST_STRXFRM.
> 
> +1
> 
> The situation is not improving fast, and requires hard work to follow
> on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
> support will always have niche uses, to be compatible with other
> software on the box, but trusting strxfrm doesn't seem to be on the
> cards any time soon.

[RMT hat, personal opinion on RMT]

To be clear, is the proposal to remove both "check_strxfrm_bug" and 
"TRUST_STRXFRM"?

Given a bunch of folks who have expertise in this area of code all agree 
with removing the above as part of the collation cleanups targeted for 
v16, I'm inclined to agree. I don't really see the need for an explicit 
RMT action, but based on the consensus this seems OK to add as an open item.

Thanks,

Jonathan

Attachment

Re: check_strxfrm_bug()

From
Joe Conway
Date:
On 4/18/23 21:19, Thomas Munro wrote:
> On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier <michael@paquier.xyz> wrote:
>> On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
>> > +1 for getting rid of TRUST_STRXFRM.
> 
> +1
> 
> The situation is not improving fast, and requires hard work to follow
> on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
> support will always have niche uses, to be compatible with other
> software on the box, but trusting strxfrm doesn't seem to be on the
> cards any time soon.

I have wondered a few times, given the known issues with strxfrm, how is 
the use in selfuncs.c still ok. Has anyone taken a hard look at that?

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: check_strxfrm_bug()

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> I have wondered a few times, given the known issues with strxfrm, how is 
> the use in selfuncs.c still ok. Has anyone taken a hard look at that?

On the one hand, we only need approximately-correct results in that
code.  On the other, the result is fed to convert_string_to_scalar(),
which has a rather naive idea that it's dealing with ASCII text.
I've seen at least some strxfrm output that isn't even vaguely
textual-looking.

            regards, tom lane



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Wed, Apr 19, 2023 at 2:31 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> To be clear, is the proposal to remove both "check_strxfrm_bug" and
> "TRUST_STRXFRM"?
>
> Given a bunch of folks who have expertise in this area of code all agree
> with removing the above as part of the collation cleanups targeted for
> v16, I'm inclined to agree. I don't really see the need for an explicit
> RMT action, but based on the consensus this seems OK to add as an open item.

Thanks all.  I went ahead and removed check_strxfrm_bug().

I could write a patch to remove the libc strxfrm support, but since
Jeff recently wrote new code in 16 to abstract that stuff, he might
prefer to look at it?



Re: check_strxfrm_bug()

From
Jeff Davis
Date:
On Thu, 2023-04-20 at 13:34 +1200, Thomas Munro wrote:
> I could write a patch to remove the libc strxfrm support, but since
> Jeff recently wrote new code in 16 to abstract that stuff, he might
> prefer to look at it?

+1 to removing it.

As far as how it's removed, we could directly check:

  if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
    abbreviate = false;

as it was before, or we could still try to hide it as a detail behind a
function. I don't have a strong opinion there, though I thought it
might be good for varlena.c to not know those internal details.

Regards,
    Jeff Davis




Re: check_strxfrm_bug()

From
"Jonathan S. Katz"
Date:
On 4/19/23 9:34 PM, Thomas Munro wrote:
> On Wed, Apr 19, 2023 at 2:31 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
>> To be clear, is the proposal to remove both "check_strxfrm_bug" and
>> "TRUST_STRXFRM"?
>>
>> Given a bunch of folks who have expertise in this area of code all agree
>> with removing the above as part of the collation cleanups targeted for
>> v16, I'm inclined to agree. I don't really see the need for an explicit
>> RMT action, but based on the consensus this seems OK to add as an open item.
> 
> Thanks all.  I went ahead and removed check_strxfrm_bug().

Thanks! For housekeeping, I put this into "Open Items" and marked it as 
resolved.

> I could write a patch to remove the libc strxfrm support, but since
> Jeff recently wrote new code in 16 to abstract that stuff, he might
> prefer to look at it?

I believe we'd be qualifying this as an open item too? If so, let's add it.

Thanks,

Jonathan

Attachment

Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Mon, Apr 17, 2023 at 8:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sun, Dec 18, 2022 at 10:27 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > With my garbage collector hat on, that made me wonder if there was
> > some more potential cleanup here: could we require locale_t yet?  The
> > last straggler systems on our target OS list to add the POSIX locale_t
> > stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
> > it's still too soon: we have two EOL'd OSes in the farm that are older
> > than that.  But here's an interesting fact about wrasse, assuming its
> > host is gcc211: it looks like it can't even apply further OS updates
> > because the hardware[1] is so old that Solaris doesn't support it
> > anymore[2].
>
> For the record, now the OpenBSD machines have been upgraded, so now
> "wrasse" is the last relevant computer on earth with no POSIX
> locale_t.  Unfortunately there is no reason to think it's going to go
> away soon, so I'm just noting this fact here as a reminder for when it
> eventually does...

Since talk of server threads erupted again, I just wanted to note that
this system locale API stuff would be on the long list of
micro-obstacles.  You'd *have* to use the locale_t-based interfaces
(or come up with replacements using a big ugly lock to serialise all
access to the process-global locale, or allow only ICU locale support
in that build, but those seem like strange lengths to go to just to
support a dead version of Solaris).  There are still at least a couple
of functions that lack XXX_l variants in the standard: mbstowcs() and
wcstombs() (though we use the non-standard _l variants if we find them
in <xlocale.h>), but that's OK because we use uselocale() and not
setlocale(), because uselocale() is required to be thread-local.  The
use of setlocale() to set up the per-backend/per-database default
locale would have to be replaced with uselocale().  In other words, I
think wrasse would not be coming with us on this hypothetical quest.



Re: check_strxfrm_bug()

From
Heikki Linnakangas
Date:
On 12/06/2023 01:15, Thomas Munro wrote:
> There are still at least a couple
> of functions that lack XXX_l variants in the standard: mbstowcs() and
> wcstombs() (though we use the non-standard _l variants if we find them
> in <xlocale.h>), but that's OK because we use uselocale() and not
> setlocale(), because uselocale() is required to be thread-local.

Right, mbstowcs() and wcstombs() are already thread-safe, that's why 
there are no _l variants of them.

> The use of setlocale() to set up the per-backend/per-database
> default locale would have to be replaced with uselocale().
This recent bug report is also related to that: 
https://www.postgresql.org/message-id/17946-3e84cb577e9551c3%40postgresql.org. 
In a nutshell, libperl calls uselocale(), which overrides the setting we 
try set with setlocale().

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: check_strxfrm_bug()

From
Thomas Munro
Date:
The GCC build farm has just received some SPARC hardware new enough to
run modern Solaris (hostname gcc106), so if wrasse were moved over
there we could finally assume all systems have POSIX 2008 (AKA
SUSv4)'s locale_t.

It's slightly annoying that Windows has locale_t but doesn't have
uselocale().  It does have thread-local locales via another API,
though.  I wonder how hard it would be to get to a point where all
systems have uselocale() too, by supplying a replacement.  I noticed
that some other projects eg older versions of LLVM libcxx do that.  I
see from one of their discussions[1] that it worked, except that
thread-local locales are only available with one of the MinGW C
runtimes and not another.  We'd have to get to the bottom of that.

[1] https://reviews.llvm.org/D40181



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> The GCC build farm has just received some SPARC hardware new enough to
> run modern Solaris (hostname gcc106), so if wrasse were moved over
> there we could finally assume all systems have POSIX 2008 (AKA
> SUSv4)'s locale_t.

That would look something like this.

Attachment

Re: check_strxfrm_bug()

From
Noah Misch
Date:
On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > The GCC build farm has just received some SPARC hardware new enough to
> > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > there we could finally assume all systems have POSIX 2008 (AKA
> > SUSv4)'s locale_t.
> 
> That would look something like this.

This removes thirty-eight ifdefs, most of them located in the middle of
function bodies.  That's far more beneficial than most proposals to raise
minimum requirements.  +1 for revoking support for wrasse's OS version.
(wrasse wouldn't move, but it would stop testing v17+.)



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sun, Jul 2, 2023 at 4:25 AM Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> > On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > The GCC build farm has just received some SPARC hardware new enough to
> > > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > > there we could finally assume all systems have POSIX 2008 (AKA
> > > SUSv4)'s locale_t.
> >
> > That would look something like this.
>
> This removes thirty-eight ifdefs, most of them located in the middle of
> function bodies.  That's far more beneficial than most proposals to raise
> minimum requirements.  +1 for revoking support for wrasse's OS version.
> (wrasse wouldn't move, but it would stop testing v17+.)

Great.  It sounds like I should wait a few days for any other feedback
and then push this patch.  Thanks for looking.



Re: check_strxfrm_bug()

From
"Tristan Partin"
Date:
On Sun Jul 2, 2023 at 8:49 PM CDT, Thomas Munro wrote:
> On Sun, Jul 2, 2023 at 4:25 AM Noah Misch <noah@leadboat.com> wrote:
> > On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> > > On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > > The GCC build farm has just received some SPARC hardware new enough to
> > > > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > > > there we could finally assume all systems have POSIX 2008 (AKA
> > > > SUSv4)'s locale_t.
> > >
> > > That would look something like this.
> >
> > This removes thirty-eight ifdefs, most of them located in the middle of
> > function bodies.  That's far more beneficial than most proposals to raise
> > minimum requirements.  +1 for revoking support for wrasse's OS version.
> > (wrasse wouldn't move, but it would stop testing v17+.)
>
> Great.  It sounds like I should wait a few days for any other feedback
> and then push this patch.  Thanks for looking.

The patch looks good to me as well. Happy to rebase my other patch on
this one.

--
Tristan Partin
Neon (https://neon.tech)



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Tue, Jul 4, 2023 at 2:52 AM Tristan Partin <tristan@neon.tech> wrote:
> The patch looks good to me as well. Happy to rebase my other patch on
> this one.

Thanks.  Here is a slightly tidier version.  It passes on CI[1]
including the optional extra MinGW64/Meson task, and the
MinGW64/autoconf configure+build that is in the SanityCheck task.
There are two questions I'm hoping to get feedback on:  (1) I believe
that defining HAVE_MBSTOWCS_L etc in win32_port.h is the best idea
because that is also where we define mbstowcs_l() etc.  Does that make
sense?  (2) IIRC, ye olde Solution.pm system might break if I were to
completely remove HAVE_MBSTOWCS_L and HAVE_WCSTOMBS_L from Solution.pm
(there must be a check somewhere that compares it with pg_config.h.in
or something like that), but it would also break if I defined them as
1 there (macro redefinition).    Will undef in Solution.pm be
acceptable (ie define nothing to avoid redefinition, but side-step the
sanity check)?  It's a bit of a kludge, but IIRC we're dropping that
3rd build system in 17 so maybe that's OK?  (Not tested as I don't
have Windows and CI doesn't test Solution.pm, so I'd be grateful if
someone who has Windows/Solution.pm setup could try this.)

[1] https://cirrus-ci.com/build/5298278007308288

Attachment

Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Wed, Jul 5, 2023 at 10:15 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> [1] https://cirrus-ci.com/build/5298278007308288

That'll teach me to be impatient.  I only waited for compiling to
finish after triggering the optional extra MinGW task before sending
the above email, figuring that the only risk was there, but then the
pg_upgrade task failed due to mismatched locales.  Apparently there is
something I don't understand yet about locale_t support under MinGW.



Re: check_strxfrm_bug()

From
Peter Eisentraut
Date:
On 05.07.23 00:15, Thomas Munro wrote:
> On Tue, Jul 4, 2023 at 2:52 AM Tristan Partin <tristan@neon.tech> wrote:
>> The patch looks good to me as well. Happy to rebase my other patch on
>> this one.
> 
> Thanks.  Here is a slightly tidier version.  It passes on CI[1]
> including the optional extra MinGW64/Meson task, and the
> MinGW64/autoconf configure+build that is in the SanityCheck task.
> There are two questions I'm hoping to get feedback on:  (1) I believe
> that defining HAVE_MBSTOWCS_L etc in win32_port.h is the best idea
> because that is also where we define mbstowcs_l() etc.  Does that make
> sense?  (2) IIRC, ye olde Solution.pm system might break if I were to
> completely remove HAVE_MBSTOWCS_L and HAVE_WCSTOMBS_L from Solution.pm
> (there must be a check somewhere that compares it with pg_config.h.in
> or something like that), but it would also break if I defined them as
> 1 there (macro redefinition).

I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm. 
Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being 
defined in win32_port.h.




Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Fri, Jul 7, 2023 at 4:20 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm.
> Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being
> defined in win32_port.h.

In this version I have it in there, but set it to undef.  This way
configure (MinGW), meson (MinGW or MSVC), and Solution.pm all agree.
This version passes on CI (not quite sure what I screwed up before).

To restate the problem I am solving with this Solution.pm change +
associated changes in the C: configure+MinGW disabled the libc
provider completely before.  With this patch that is fixed, and that
forced me to address the inconsistency (because if you have the libc
provider but no _l functions, you hit uselocale() code paths that
Windows can't do).  It was a bug, really, but I don't plan to
back-patch anything (nobody really uses configure+MinGW builds AFAIK,
they exist purely as a developer [in]convenience).  But that explains
why I needed to make a change.

Thinking about how to bring this all into "normal" form -- where
HAVE_XXX means "system defines XXX", not "system defines XXX || we
define a replacement" -- leads to the attached.  Do you like this one?

Attachment

Re: check_strxfrm_bug()

From
"Tristan Partin"
Date:
On Fri Jul 7, 2023 at 3:30 PM CDT, Thomas Munro wrote:
> On Fri, Jul 7, 2023 at 4:20 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> > I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm.
> > Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being
> > defined in win32_port.h.
>
> In this version I have it in there, but set it to undef.  This way
> configure (MinGW), meson (MinGW or MSVC), and Solution.pm all agree.
> This version passes on CI (not quite sure what I screwed up before).
>
> To restate the problem I am solving with this Solution.pm change +
> associated changes in the C: configure+MinGW disabled the libc
> provider completely before.  With this patch that is fixed, and that
> forced me to address the inconsistency (because if you have the libc
> provider but no _l functions, you hit uselocale() code paths that
> Windows can't do).  It was a bug, really, but I don't plan to
> back-patch anything (nobody really uses configure+MinGW builds AFAIK,
> they exist purely as a developer [in]convenience).  But that explains
> why I needed to make a change.
>
> Thinking about how to bring this all into "normal" form -- where
> HAVE_XXX means "system defines XXX", not "system defines XXX || we
> define a replacement" -- leads to the attached.  Do you like this one?

Should you wrap the two _l function replacements in HAVE_USELOCALE
instead of WIN32?

> +if not cc.has_type('locale_t', prefix: '#include <locale.h>') and cc.has_type('locale_t', prefix: '#include
<xlocale.h>')

I wouldn't mind a line break after the 'and'.

Other than these comments, the patch looks fine to me.

--
Tristan Partin
Neon (https://neon.tech)



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sat, Jul 8, 2023 at 8:52 AM Tristan Partin <tristan@neon.tech> wrote:
> Should you wrap the two _l function replacements in HAVE_USELOCALE
> instead of WIN32?

I find that more confusing, and I'm also not sure if HAVE_USELOCALE is
even going to survive (based on your nearby thread).  I mean, by the
usual criteria that we applied to a lot of other system library
functions in the 16 cycle, I'd drop it.  It's in the standard and all
relevant systems have it except Windows which we have to handle with
special pain-in-the-neck logic anyway.

> > +if not cc.has_type('locale_t', prefix: '#include <locale.h>') and cc.has_type('locale_t', prefix: '#include
<xlocale.h>')
>
> I wouldn't mind a line break after the 'and'.

Ah, right, I am still learning what is allowed in this syntax...  will do.

> Other than these comments, the patch looks fine to me.

Thanks.  I will wait a bit to see if Peter has any other comments and
then push this.  I haven't actually tested with Solution.pm due to
lack of CI for that, but fingers crossed, since the build systems will
now agree, reducing the screw-up surface.



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sat, Jul 8, 2023 at 10:13 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Thanks.  I will wait a bit to see if Peter has any other comments and
> then push this.  I haven't actually tested with Solution.pm due to
> lack of CI for that, but fingers crossed, since the build systems will
> now agree, reducing the screw-up surface.

Done.  Let's see what the build farm thinks...



Re: check_strxfrm_bug()

From
Peter Eisentraut
Date:
On 07.07.23 22:30, Thomas Munro wrote:
> Thinking about how to bring this all into "normal" form -- where
> HAVE_XXX means "system defines XXX", not "system defines XXX || we
> define a replacement"

HAVE_XXX means "code can use XXX", doesn't matter how it got there (it 
could also be a libpgport replacement).

So I don't think this code is correct.  AFAICT, there is nothing right 
now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that 
the intention?

I think these changes would need to be reverted to make this correct:

-# MSVC has replacements defined in src/include/port/win32_port.h.
-if cc.get_id() == 'msvc'
-  cdata.set('HAVE_WCSTOMBS_L', 1)
-  cdata.set('HAVE_MBSTOWCS_L', 1)
-endif

-       HAVE_MBSTOWCS_L => 1,
+       HAVE_MBSTOWCS_L => undef,

-       HAVE_WCSTOMBS_L => 1,
+       HAVE_WCSTOMBS_L => undef,




Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> So I don't think this code is correct.  AFAICT, there is nothing right
> now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> the intention?

Yes, that was my intention.  Windows actually doesn't have them.  The
autoconf/MinGW test result was telling the truth.  Somehow I had to
make the three build systems agree on this.  Either by strong-arming
all three of them to emit a hard-coded claim that it does, or by
removing the test that produces a different answer in different build
systems.  I will happily do it the other way if you insist, which
would involve restoring the meson.build and Solultion.pm kludges you
quoted, but I'd also have to add a compatible kludge to configure.ac.
It doesn't seem like an improvement to me but I don't feel strongly
about it.  In the end, Solution.pm and configure.ac will be vaporised
by lasers, so we'll be left with 0 or 1 special cases.  I don't care
much, but I like 0, it's nice and round.



Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Sun, Jul 9, 2023 at 6:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> > So I don't think this code is correct.  AFAICT, there is nothing right
> > now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> > the intention?
>
> Yes, that was my intention.  Windows actually doesn't have them.

Thinking about that some more...  Its _XXX implementations don't deal
with UTF-8 the way Unix-based developers would expect, and are
therefore just portability hazards, aren't they?  What would we gain
by restoring the advertisement that they are available?  Perhaps we
should go the other way completely and remove the relevant #defines
from win32_port.h, and fully confine knowledge of them to pg_locale.c?
 It knows how to deal with that.  Here is a patch trying this idea
out, with as slightly longer explanation.

Attachment

Re: check_strxfrm_bug()

From
Peter Eisentraut
Date:
On 10.07.23 04:51, Thomas Munro wrote:
> On Sun, Jul 9, 2023 at 6:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>>> So I don't think this code is correct.  AFAICT, there is nothing right
>>> now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
>>> the intention?
>>
>> Yes, that was my intention.  Windows actually doesn't have them.
> 
> Thinking about that some more...  Its _XXX implementations don't deal
> with UTF-8 the way Unix-based developers would expect, and are
> therefore just portability hazards, aren't they?  What would we gain
> by restoring the advertisement that they are available?  Perhaps we
> should go the other way completely and remove the relevant #defines
> from win32_port.h, and fully confine knowledge of them to pg_locale.c?
>   It knows how to deal with that.  Here is a patch trying this idea
> out, with as slightly longer explanation.

This looks sensible to me.

If we ever need mbstowcs_l() etc. outside of pg_locale.c, then the 
proper way would be to make a mbstowcs_l.c file in src/port/.

But I like your approach for now because it moves us more firmly into 
the direction of having it contained in pg_locale.c, instead of having 
some knowledge global and some local.




Re: check_strxfrm_bug()

From
Thomas Munro
Date:
On Tue, Jul 11, 2023 at 2:28 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> This looks sensible to me.
>
> If we ever need mbstowcs_l() etc. outside of pg_locale.c, then the
> proper way would be to make a mbstowcs_l.c file in src/port/.
>
> But I like your approach for now because it moves us more firmly into
> the direction of having it contained in pg_locale.c, instead of having
> some knowledge global and some local.

Thanks.  Pushed.

That leaves only one further cleanup opportunity from discussion this
thread, for future work: a TRUST_STRXFRM-ectomy.