Thread: Windows locale causes server to send data in an invalid encoding to the client
I set up a PostgreSQL server on Windows 10 and connected to it from Rust, but the Rust client reports invalid UTF-8 data when the password is wrong. My Windows installation uses a French locale, whose messages contain accented characters such as "éèê".
Windows uses WTF-16, but the Rust client asks the server for UTF-8, so we think it is a server bug that it does not send UTF-8, even though Windows does not use UTF-8 for its locale translations.
Details can be found in GitHub issue #803. Here is a Wireshark capture of the problem: output.pcapng.gz.
The version is: `psql (PostgreSQL) 13.3`
I have never used a mailing list before, so I'm not used to it.
--
The Earth is the cradle of humanity, but who would want to spend their whole life in a cradle?
On Wed, Jul 14, 2021 at 01:49:24PM +0200, Antoine wrote:
> I set up a postgresql server on Windows 10 and connected it using Rust, but
> the Rust client reports invalid UTF-8 data when the password is wrong. I
> use a french locale windows that contain some accents "éèê" etc.
>
> Windows use WTF-16 <https://simonsapin.github.io/wtf-8/> but the Rust
> client asks UTF-8 to the serveur so we think it's a bug from the server to
> not send UTF-8 even if Windows doesn't use it for its locale translation.

This is unfortunately working as designed. The client encoding can't be set
during startup (and authentication is part of it), see
https://github.com/postgres/postgres/blob/master/src/backend/utils/mb/mbutils.c#L85-L88
for more details about it:

> /*
>  * During backend startup we can't set client encoding because we (a)
>  * can't look up the conversion functions, and (b) may not know the database
>  * encoding yet either. So SetClientEncoding() just accepts anything and
>  * remembers it for InitializeClientEncoding() to apply later.
>  */

The driver should be prepared to receive non UTF-8 messages until
authentication succeeded.
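For driver authors, the advice above could look roughly like the following. This is only a minimal sketch in Rust, not the code of any existing driver: the helper name `decode_pre_auth_message` and the sample bytes are made up for illustration. It assumes the driver has already extracted the raw bytes of the error text and cannot know which encoding the server used for them before authentication completes. Instead of rejecting the message as invalid UTF-8, it decodes leniently so that at least the ASCII parts stay readable.

```rust
/// Decode the text of a server message received before authentication has
/// completed, when the server-side encoding is not yet known.
/// Hypothetical helper for illustration; not part of any driver's API.
fn decode_pre_auth_message(raw: &[u8]) -> String {
    // from_utf8_lossy keeps every valid UTF-8 sequence (including plain
    // ASCII) and replaces each invalid sequence with U+FFFD instead of
    // returning an error.
    String::from_utf8_lossy(raw).into_owned()
}

fn main() {
    // Illustrative bytes only: "échec" with 'é' encoded as the single byte
    // 0xE9, as a single-byte Windows code page would encode it.
    let raw = b"\xE9chec";

    // Strict decoding fails, which is the error the client reported.
    assert!(std::str::from_utf8(raw).is_err());

    // Lenient decoding succeeds; the accented letter becomes U+FFFD.
    println!("{}", decode_pre_auth_message(raw));
}
```

The accented characters are lost, but the message is still reported to the user instead of failing with a decoding error. A driver that knows the server's encoding through some other channel could decode the bytes properly instead.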
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Jul 14, 2021 at 01:49:24PM +0200, Antoine wrote:
>> I set up a postgresql server on Windows 10 and connected it using Rust, but
>> the Rust client reports invalid UTF-8 data when the password is wrong. I
>> use a french locale windows that contain some accents "éèê" etc.

> This is unfortunately working as designed. The client encoding can't be set
> during startup (and authentication is part of it), see
> https://github.com/postgres/postgres/blob/master/src/backend/utils/mb/mbutils.c#L85-L88
> for more details about it:

It seems like the core problem is that the "authentication failed" error
text may be sent in an unexpected encoding. I wonder if we should decline
to translate any error messages until we've established the requested
client encoding. Sending the message in English isn't ideal either,
but it'd avoid this hazard.

regards, tom lane
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Jul 14, 2021 at 10:19:59AM -0400, Tom Lane wrote:
>> I wonder if we should decline
>> to translate any error messages until we've established the requested
>> client encoding. Sending the message in English isn't ideal either,
>> but it'd avoid this hazard.

> I'm not sure which one is the worst. On the bright side there aren't that
> many messages that can be sent until the client encoding can be set up, so I'm
> +0.5 for this change.

Yeah, it's ugly either way. I think though that the reason we don't hear
more complaints about this is that such messages are currently sent using
the language and encoding derived from the postmaster's environment. In
simple cases that'll be the same as the client's environment and
everything works. So after thinking harder, I'm afraid that breaking that
scenario would make this idea a net loss.

regards, tom lane