Thread: thread-safety: gmtime_r(), localtime_r()
Here is a patch for using gmtime_r() and localtime_r() instead of gmtime() and localtime(), for thread-safety. There are a few affected calls in libpq and ecpg's libpgtypes, which are probably effectively bugs, because those libraries already claim to be thread-safe. There is one affected call in the backend. Most of the backend otherwise uses the custom functions pg_gmtime() and pg_localtime(), which are implemented differently. Some portability fun: gmtime_r() and localtime_r() are in POSIX but are not available on Windows. Windows has functions gmtime_s() and localtime_s() that can fulfill the same purpose, so we can add some small wrappers around them. (Note that these *_s() functions are also different from the *_s() functions in the bounds-checking extension of C11. We are not using those here.) MinGW exposes neither *_r() nor *_s() by default. You can get at the POSIX-style *_r() functions by defining _POSIX_C_SOURCE appropriately before including <time.h>. (There is apparently probably also a way to get at the Windows-style *_s() functions by supplying some additional options or defines. But we might as well just use the POSIX ones.)
Attachment
On Thu, Jun 27, 2024 at 1:42 AM Peter Eisentraut <peter@eisentraut.org> wrote:
Here is a patch for using gmtime_r() and localtime_r() instead of
gmtime() and localtime(), for thread-safety.
There are a few affected calls in libpq and ecpg's libpgtypes, which are
probably effectively bugs, because those libraries already claim to be
thread-safe.
There is one affected call in the backend. Most of the backend
otherwise uses the custom functions pg_gmtime() and pg_localtime(),
which are implemented differently.
Some portability fun: gmtime_r() and localtime_r() are in POSIX but are
not available on Windows. Windows has functions gmtime_s() and
localtime_s() that can fulfill the same purpose, so we can add some
small wrappers around them. (Note that these *_s() functions are also
different from the *_s() functions in the bounds-checking extension of
C11. We are not using those here.)
MinGW exposes neither *_r() nor *_s() by default. You can get at the
POSIX-style *_r() functions by defining _POSIX_C_SOURCE appropriately
before including <time.h>. (There is apparently probably also a way to
get at the Windows-style *_s() functions by supplying some additional
options or defines. But we might as well just use the POSIX ones.)
Hi! Looks good to me.
But why you don`t change localtime function at all places?
For example:
src/bin/pg_controldata/pg_controldata.c
src/bin/pg_dump/pg_backup_archiver.c
src/bin/initdb/findtimezone.c
Best regards, Stepan Neretin.
On 27.06.24 06:47, Stepan Neretin wrote: > Hi! Looks good to me. > But why you don`t change localtime function at all places? > For example: > src/bin/pg_controldata/pg_controldata.c > src/bin/pg_dump/pg_backup_archiver.c > src/bin/initdb/findtimezone.c At the moment, I am focusing on the components that are already meant to be thread-safe (libpq, ecpg libs) and the ones we are actively looking at maybe converting (backend). I don't intend at this point to convert all other code to use only thread-safe APIs.
On 26/06/2024 21:42, Peter Eisentraut wrote: > Here is a patch for using gmtime_r() and localtime_r() instead of > gmtime() and localtime(), for thread-safety. > > There are a few affected calls in libpq and ecpg's libpgtypes, which are > probably effectively bugs, because those libraries already claim to be > thread-safe. +1 The Linux man page for localtime_r() says: > According to POSIX.1-2001, localtime() is required to behave as > though tzset(3) was called, while localtime_r() does not have this > requirement. For portable code, tzset(3) should be called before > localtime_r(). It's not clear to me what happens if tzset() has not been called and the localtime_r() implementation does not call it either. I guess some implementation default timezone is used. In the libpq traces, I don't think we care much. In ecpg, I'm not sure what the impact is if the application has not previously called tzset(). I'd suggest that we just document that an ecpg application should call tzset() before calling the functions that are sensitive to local timezone setting. > There is one affected call in the backend. Most of the backend > otherwise uses the custom functions pg_gmtime() and pg_localtime(), > which are implemented differently. Do we need to call tzset(3) somewhere in backend startup? Or could we replace that localtime() call in the backend with pg_localtime()? pg_gmtime() isn't thread-safe, because of the static 'gmtptr' in gmtsub(). But we can handle that in a separate patch. -- Heikki Linnakangas Neon (https://neon.tech)
On 04.07.24 18:36, Heikki Linnakangas wrote: > The Linux man page for localtime_r() says: > >> According to POSIX.1-2001, localtime() is required to behave as >> though tzset(3) was called, while localtime_r() does not have this >> requirement. For portable code, tzset(3) should be called before >> localtime_r(). > > It's not clear to me what happens if tzset() has not been called and the > localtime_r() implementation does not call it either. I guess some > implementation default timezone is used. > > In the libpq traces, I don't think we care much. In ecpg, I'm not sure > what the impact is if the application has not previously called tzset(). > I'd suggest that we just document that an ecpg application should call > tzset() before calling the functions that are sensitive to local > timezone setting. I have been studying this question. It appears that various libc implementers have been equally puzzled by this; there are various comments like "it's unclear what POSIX wants here" in the sources. (I have checked glibc, FreeBSD, and Solaris.) Consider if a program calls localtime() or localtime_r() twice: localtime(...); ... localtime(...); or localtime_r(...); ... localtime_r(...); The question here is, how many times does this effectively (internally) call tzset(). There are three possible answers: 0, 1, or 2. For localtime(), the answer is clear. localtime() is required to call tzset() every time, so the answer is 2. For localtime_r(), it's unclear. What you are wondering, and I have been wondering, is whether the answer is 0 or non-zero (and possibly, if it's 0, will these calls misbehave badly). What the libc implementers are wondering is whether the answer is 1 or 2. The implementations whose source code I have checked think it should be 1. They never consider that it could be 0 and it's okay to misbehave. Where this difference appears it practice would be something like setenv("TZ", "foo"); localtime(...); // uses TZ foo setenv("TZ", "bar"); localtime(...); // uses TZ bar versus setenv("TZ", "foo"); localtime_r(...); // uses TZ foo if first call in program setenv("TZ", "bar"); localtime_r(...); // probably does not use new TZ If you want the second case to pick up the changed TZ setting, you must explicitly call tzset() to be sure. I think, considering this, the proposed patch should be okay. At least, the libraries are not going to misbehave if tzset() hasn't been called explicitly. It will be called internally the first time it's needed. I don't think we need to cater to cases like my example where the application changes the TZ environment variable but neglects to call tzset() itself. >> There is one affected call in the backend. Most of the backend >> otherwise uses the custom functions pg_gmtime() and pg_localtime(), >> which are implemented differently. > > Do we need to call tzset(3) somewhere in backend startup? Or could we > replace that localtime() call in the backend with pg_localtime()? Let's look at what this code actually does. It just takes the current time and then loops through all possible weekdays and months to collect and cache the localized names. The current time or time zone doesn't actually matter for this, we just need to fill in the struct tm a bit for strftime() to be happy. We could probably replace this with gmtime() to avoid the question about time zone state. (We probably don't even need to call time() beforehand, we could just use time zero. But the comment says "We use times close to current time as data for strftime().", which is probably prudent.) (Since we are caching the results for the session, we're already not dealing with the hilarious hypothetical situation where the weekday and month names depend on the actual time, in case that is a concern.)
On Tue, Jul 23, 2024 at 10:52 PM Peter Eisentraut <peter@eisentraut.org> wrote: > Let's look at what this code actually does. It just takes the current > time and then loops through all possible weekdays and months to collect > and cache the localized names. The current time or time zone doesn't > actually matter for this, we just need to fill in the struct tm a bit > for strftime() to be happy. We could probably replace this with > gmtime() to avoid the question about time zone state. (We probably > don't even need to call time() beforehand, we could just use time zero. > But the comment says "We use times close to current time as data for > strftime().", which is probably prudent.) (Since we are caching the > results for the session, we're already not dealing with the hilarious > hypothetical situation where the weekday and month names depend on the > actual time, in case that is a concern.) I think you could even just use a struct tm filled in with an example date? Hmm, but it's annoying to choose one, and annoying that POSIX says there may be other members of the struct, so yeah, I think gmtime_r(0, tm) makes sense. It can't be that important, because we aren't even using dates consistent with tm_wday, so we're assuming that strftime("%a") only looks at tm_wday. This change complements CF #5170's change strftime()->strftime_l(), to make the function fully thread-safe. Someone could also rewrite it to call nl_langinfo_l({ABDAY,ABMON,DAY,MON}_1 + n, locale) directly, or GetLocaleInfoEx(locale_name, LOCALE_S{ABBREVDAY,ABBREVMONTH,DAY,MONTH}NAME1 + n, ...) on Window, but that'd be more code churn.
On Sat, Aug 17, 2024 at 1:09 AM Thomas Munro <thomas.munro@gmail.com> wrote: > This change complements CF #5170's change strftime()->strftime_l(), to > make the function fully thread-safe. (Erm, I meant its standard library... of course it has its own global variables to worry about still.)
On Sat, Aug 17, 2024 at 3:43 AM Peter Eisentraut <peter@eisentraut.org> wrote: > I moved the _POSIX_C_SOURCE definition for MinGW from the header file to > a command-line option (-D_POSIX_C_SOURCE). This matches the treatment > of _GNU_SOURCE and similar. I was trying to figure out what else -D_POSIX_C_SOURCE does to MinGW. Enables __USE_MINGW_ANSI_STDIO, apparently, but I don't know if we were using that already, or if it matters. I suppose if it ever shows up as a problem, we can explicitly disable it. . o O ( MinGW is a strange beast. Do we want to try to keep the code it runs as close as possible to what is used by MSVC? I thought so, but we can't always do that due to missing interfaces (though I suspect that many #ifdef _MSC_VER tests are based on ancient versions and now bogus). But it also offers ways to be more POSIX-y if we want, and then we have to decide whether to take them, and make it more like a separate platform with different quirks... ) > I think this is about as good as it's going to get, and we need it to > be, so I propose to commit this version if there are no further concerns. LGTM.
On 16.08.24 23:01, Thomas Munro wrote: > On Sat, Aug 17, 2024 at 3:43 AM Peter Eisentraut<peter@eisentraut.org> wrote: >> I moved the _POSIX_C_SOURCE definition for MinGW from the header file to >> a command-line option (-D_POSIX_C_SOURCE). This matches the treatment >> of _GNU_SOURCE and similar. > I was trying to figure out what else -D_POSIX_C_SOURCE does to MinGW. > Enables __USE_MINGW_ANSI_STDIO, apparently, but I don't know if we > were using that already, or if it matters. I suppose if it ever shows > up as a problem, we can explicitly disable it. > > . o O ( MinGW is a strange beast. Do we want to try to keep the code > it runs as close as possible to what is used by MSVC? I thought so, > but we can't always do that due to missing interfaces (though I > suspect that many #ifdef _MSC_VER tests are based on ancient versions > and now bogus). But it also offers ways to be more POSIX-y if we > want, and then we have to decide whether to take them, and make it > more like a separate platform with different quirks... ) Yeah, ideally we'd keep it aligned with MSVC. But a problem here is that if _POSIX_C_SOURCE (or _GNU_SOURCE or something like that) gets defined for other reasons, then there would be conflicts between the system headers and our workaround #define's. At least plpython triggers such a conflict in my testing. This is the usual problem that we also have with _GNU_SOURCE in other contexts.
On 19.08.24 11:43, Peter Eisentraut wrote: > On 16.08.24 23:01, Thomas Munro wrote: >> On Sat, Aug 17, 2024 at 3:43 AM Peter >> Eisentraut<peter@eisentraut.org> wrote: >>> I moved the _POSIX_C_SOURCE definition for MinGW from the header file to >>> a command-line option (-D_POSIX_C_SOURCE). This matches the treatment >>> of _GNU_SOURCE and similar. >> I was trying to figure out what else -D_POSIX_C_SOURCE does to MinGW. >> Enables __USE_MINGW_ANSI_STDIO, apparently, but I don't know if we >> were using that already, or if it matters. I suppose if it ever shows >> up as a problem, we can explicitly disable it. >> >> . o O ( MinGW is a strange beast. Do we want to try to keep the code >> it runs as close as possible to what is used by MSVC? I thought so, >> but we can't always do that due to missing interfaces (though I >> suspect that many #ifdef _MSC_VER tests are based on ancient versions >> and now bogus). But it also offers ways to be more POSIX-y if we >> want, and then we have to decide whether to take them, and make it >> more like a separate platform with different quirks... ) > > Yeah, ideally we'd keep it aligned with MSVC. But a problem here is > that if _POSIX_C_SOURCE (or _GNU_SOURCE or something like that) gets > defined for other reasons, then there would be conflicts between the > system headers and our workaround #define's. At least plpython triggers > such a conflict in my testing. This is the usual problem that we also > have with _GNU_SOURCE in other contexts. I have committed this, with this amended explanation.