Thread: gai_strerror() is not thread-safe on Windows
Hi, Commit 5579388d removed a bunch of dead code, formerly needed for old systems that lacked getaddrinfo() in the early days of IPv6. We already used the system getaddrinfo() via either configure-time tests (Unix) or runtime tests (Windows using attempt-to-find-with-dlsym that always succeeded on modern systems), so no modern system needed the fallback code, except for one small detail: getaddrinfo() has a companion function to spit out human readable error messages, and although Windows has that too, it's not thread safe[1]. libpq shouldn't call it, or else an unlucky multi-threaded program might see an error message messed up by another thread. Here's a patch to put that bit back. It's simpler than before: the original replacement had a bunch of #ifdefs for various historical reasons, but now we can just handle the 8 documented EAI errors on Windows. Noticed while wondering why the list of symbols reported in bug #18219 didn't include gai_strerrorA. That turned out to be because it is static inline in ws2tcpip.h, and its definition set alarm bells ringing. Avoid. [1] https://learn.microsoft.com/en-us/windows/win32/api/ws2tcpip/nf-ws2tcpip-getaddrinfo
Attachment
On second thoughts, I guess it would make more sense to use the exact messages Windows' own implementation would return instead of whatever we had in the past (probably cribbed from some other OS or just made up?). I asked CI to spit those out[1]. Updated patch attached. Will add to CF. [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15
Attachment
At Tue, 5 Dec 2023 08:26:54 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On second thoughts, I guess it would make more sense to use the exact > messages Windows' own implementation would return instead of whatever > we had in the past (probably cribbed from some other OS or just made > up?). I asked CI to spit those out[1]. Updated patch attached. Will > add to CF. > > [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15 Windows' gai_strerror outputs messages that correspond to the language environment. Similarly, I think that the messages that the messages returned by our version should be translatable. These messages may add extra line-end periods to the parent (or cotaining) messages when appended. This looks as follows. (auth.c:517 : errdetail_log() : sub (detail) message) > Could not translate client host name "hoge" to IP address: An address incompatible with the requested protocol was used.. (hba.c:1562 : errmsg() : main message) > invalid IP address "192.0.2.1": This is usually a temporary error during hostname resolution and means that the local serverdid not receive a response from an authoritative server. When I first saw the first version, I thought it would be better to use Windows' own messages, just like you did. However, considering the content of the message above, wouldn't it be better to adhere to Linux-style messages overall? A slightly subtler point is that the second example seems to have a misalignment between the descriptions before and after the colon, but do you think it's not something to be concerned about to this extent? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Dec 5, 2023 at 3:43 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Tue, 5 Dec 2023 08:26:54 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > > On second thoughts, I guess it would make more sense to use the exact > > messages Windows' own implementation would return instead of whatever > > we had in the past (probably cribbed from some other OS or just made > > up?). I asked CI to spit those out[1]. Updated patch attached. Will > > add to CF. > > > > [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15 > > Windows' gai_strerror outputs messages that correspond to the language > environment. Similarly, I think that the messages that the messages > returned by our version should be translatable. Hmm, that is a good point. Wow, POSIX has given us a terrible interface here, in terms of resource management. Let's see what glibc does: https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror.c https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror-strs.h It doesn't look like it knows about locales at all. And a test program seems to confirm: #include <locale.h> #include <netdb.h> #include <stdio.h> int main() { setlocale(LC_MESSAGES, "ja_JP.UTF-8"); printf("%s\n", gai_strerror(EAI_MEMORY)); } That prints: Memory allocation failure FreeBSD tries harder, and prints: メモリ割り当て失敗 We can see that it has a thread-local variable that holds a copy of that localised string until the next call to gai_strerror() in the same thread: https://github.com/freebsd/freebsd-src/blob/main/lib/libc/net/gai_strerror.c https://github.com/freebsd/freebsd-src/blob/main/lib/libc/nls/ja_JP.UTF-8.msg FreeBSD's message catalogues would provide a read-made source of translations, bu... hmm, if glibc doesn't bother and the POSIX interface is unhelpful and Windows' own implementation is so willfully unusable, I don't really feel inclined to build a whole thread-local cache thing on our side just to support this mess. So I think we should just hard-code the error messages in English and move on. However, English is my language so perhaps I should abstain and leave it to others to decide how important that is. > These messages may add extra line-end periods to the parent (or > cotaining) messages when appended. This looks as follows. > > (auth.c:517 : errdetail_log() : sub (detail) message) > > Could not translate client host name "hoge" to IP address: An address incompatible with the requested protocol was used.. > > (hba.c:1562 : errmsg() : main message) > > invalid IP address "192.0.2.1": This is usually a temporary error during hostname resolution and means that the localserver did not receive a response from an authoritative server. > > When I first saw the first version, I thought it would be better to > use Windows' own messages, just like you did. However, considering the > content of the message above, wouldn't it be better to adhere to > Linux-style messages overall? Yeah, I agree that either the glibc or the FreeBSD messages would be better than those now that I've seen them. They are short and sweet. > A slightly subtler point is that the second example seems to have a > misalignment between the descriptions before and after the colon, but > do you think it's not something to be concerned about to this extent? I didn't understand what you meant here.
At Thu, 7 Dec 2023 09:43:37 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in > On Tue, Dec 5, 2023 at 3:43 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > Windows' gai_strerror outputs messages that correspond to the language > > environment. Similarly, I think that the messages that the messages > > returned by our version should be translatable. > > Hmm, that is a good point. Wow, POSIX has given us a terrible > interface here, in terms of resource management. Let's see what glibc > does: > > https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror.c > https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror-strs.h It is quite a sight for sore eyes... > It doesn't look like it knows about locales at all. And a test > program seems to confirm: .. > setlocale(LC_MESSAGES, "ja_JP.UTF-8"); > printf("%s\n", gai_strerror(EAI_MEMORY)); > > That prints: > > Memory allocation failure > > FreeBSD tries harder, and prints: > > メモリ割り当て失敗 > > We can see that it has a thread-local variable that holds a copy of > that localised string until the next call to gai_strerror() in the > same thread: > > https://github.com/freebsd/freebsd-src/blob/main/lib/libc/net/gai_strerror.c > https://github.com/freebsd/freebsd-src/blob/main/lib/libc/nls/ja_JP.UTF-8.msg > > FreeBSD's message catalogues would provide a read-made source of > translations, bu... hmm, if glibc doesn't bother and the POSIX > interface is unhelpful and Windows' own implementation is so willfully > unusable, I don't really feel inclined to build a whole thread-local > cache thing on our side just to support this mess. I agree, I wouldn't want to do it either. > So I think we should just hard-code the error messages in English and > move on. However, English is my language so perhaps I should abstain > and leave it to others to decide how important that is. I also think that would be a good way. > > These messages may add extra line-end periods to the parent (or > > cotaining) messages when appended. This looks as follows. > > > > (auth.c:517 : errdetail_log() : sub (detail) message) > > > Could not translate client host name "hoge" to IP address: An address incompatible with the requested protocol wasused.. > > > > (hba.c:1562 : errmsg() : main message) > > > invalid IP address "192.0.2.1": This is usually a temporary error during hostname resolution and means that the localserver did not receive a response from an authoritative server. > > > > When I first saw the first version, I thought it would be better to > > use Windows' own messages, just like you did. However, considering the > > content of the message above, wouldn't it be better to adhere to > > Linux-style messages overall? > > Yeah, I agree that either the glibc or the FreeBSD messages would be > better than those now that I've seen them. They are short and sweet. > > > A slightly subtler point is that the second example seems to have a > > misalignment between the descriptions before and after the colon, but > > do you think it's not something to be concerned about to this extent? > > I didn't understand what you meant here. If it was just a temporary error that couldn't be resolved, it doesn't mean that the IP address is invalid. If such a cause is possible, then probabyly an error message saying "failed to resolve" would be more appropriate. However, I wrote it meaning that there is no need to go to great length to ensure consistency with this message. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Dec 6, 2023 at 8:45 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > So I think we should just hard-code the error messages in English and > > move on. However, English is my language so perhaps I should abstain > > and leave it to others to decide how important that is. > > I also think that would be a good way. Considering this remark from Kyotaro Horiguchi, I think the previously-posted patch could be committed. Thomas, do you plan to do that, or are there outstanding issues here? -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, 5 Dec 2023 at 00:57, Thomas Munro <thomas.munro@gmail.com> wrote: > > On second thoughts, I guess it would make more sense to use the exact > messages Windows' own implementation would return instead of whatever > we had in the past (probably cribbed from some other OS or just made > up?). I asked CI to spit those out[1]. Updated patch attached. Will > add to CF. CFBot shows that the patch does not apply anymore as in [1]: === Applying patches on top of PostgreSQL commit ID 376c216138c75e161d39767650ea30536f23b482 === === applying patch ./v2-0001-Fix-gai_strerror-thread-safety-on-Windows.patch patching file configure Hunk #1 succeeded at 16388 (offset 34 lines). patching file configure.ac Hunk #1 succeeded at 1885 (offset 7 lines). patching file src/include/port/win32/sys/socket.h patching file src/port/meson.build patching file src/port/win32gai_strerror.c can't find file to patch at input line 134 Perhaps you used the wrong -p or --strip option? The text leading up to this was: -------------------------- |diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm |index 46df01cc8d..c51296bdb6 100644 |--- a/src/tools/msvc/Mkvcbuild.pm |+++ b/src/tools/msvc/Mkvcbuild.pm -------------------------- No file to patch. Skipping patch. 1 out of 1 hunk ignored Please have a look and post an updated version. [1] - http://cfbot.cputube.org/patch_46_4682.log Regards, Vignesh
On Tue, Jan 16, 2024 at 8:52 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Dec 6, 2023 at 8:45 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > So I think we should just hard-code the error messages in English and > > > move on. However, English is my language so perhaps I should abstain > > > and leave it to others to decide how important that is. > > > > I also think that would be a good way. > > Considering this remark from Kyotaro Horiguchi, I think the > previously-posted patch could be committed. > > Thomas, do you plan to do that, or are there outstanding issues here? Pushed. I went with FreeBSD's error messages (I assume it'd be OK to take glibc's too under fair use but I didn't want to think about that).