Thread: Re: 64 bit numbers vs format strings
On 05.12.24 23:18, Thomas Munro wrote: > Having learned some things about gettext based on clues[1] from Peter > E, I decided to see what it would take to expunge all (long long) and > similar casts now that we're using the standard types with system > support. > > The short version is tha given uint64 x: > > Old: errmsg("hello %llu", (unsigned long long) x) > New: errmsg("hello %" PRIu64, x) I have committed the subset of this patch for pg_checksums.c so that the translators and whoever else might be affected can try this out at small scale. (I don't expect any particular problems.) Then we can move on to the rest in a few weeks, I think.
On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote: > On 05.12.24 23:18, Thomas Munro wrote: > > Old: errmsg("hello %llu", (unsigned long long) x) > > New: errmsg("hello %" PRIu64, x) > > I have committed the subset of this patch for pg_checksums.c so that the > translators and whoever else might be affected can try this out at small > scale. (I don't expect any particular problems.) Then we can move on > to the rest in a few weeks, I think. Good plan, thanks. Here's a rebase. I also added one more patch that I expect to be more contentious as it is a UX change. Why do we display LSNs with a slash? I believe there are two reasons: (1) Back in 2000 didn't require the existence of a 64 bit type, so XLogRecPtr was a struct holding two uint32 values. The author could still have used "%08X%08X" for both printf and scanf if that was the only reason. (2) It initially had a real semantic division into two parts log ID and log offset, which the author apparently wanted to convey to readers. I didn't check the full history but I think at some point our log segments (first 16MB, now initdb-time variable) superseded the log ID concept, which I think originally had 4GB segments? (I could also have had something to do with the abandoned undo system's needs, IDK.) That leads to the idea of ditching the slash and displaying them in the more obvious (to my aesthetic, anyway, YMMV): SELECT pg_lsn(23783416::numeric); - pg_lsn ------------ - 0/16AE7F8 + pg_lsn +------------------ + 00000000016AE7F8 And likewise wherever they appear or are parsed in tools, protocols, command lines etc. /me activates flame-proof force field I realised while contemplating that that my treatment of pgoff_t might not be quite right in the first patch. It casts off_t (eg from struct stat) to (pgoff_t) and display as "%" PRId64, which is correct for Windows where pgoff_t is a typedef to __int64 (actually int64_t would be more logical, but presumably int64_t is __int64 on that system, not sure whether that is truly a distinct type according to its native compiler), but on non-Windows we use the system off_t whose printf type is unknown to us, and might in theory be a different signed 64 bit type and provoke a warning from GCC/Clang printf attributes. Perhaps we should define just pgoff_t as int64_t everywhere? There are no warnings on any of our CI OSes so I assume that those OSes coincidentally define off_t the same way they define int64_t. That being the case we could just ignore it for now, but another system using GCC/Clang printf attributes (eg illumos or the other BSDs) might not happen to agree. Not done yet. And one more thing like that: in a couple of places we see warnings on macOS CI that I'd missed: when printing the result of i64abs() as PRId64, because it happens to use labs() and it happens to define int64_t as long long, and when printing a Datum as PRIx64, because Datum is uintptr_t and it happens to define that as unsigned long. I suppose we should cast to int64 in the definition of c.h's i64abs() macro and a couple of similar things, and cast Datum to uint64 in that one place that wants to print it out. Not done yet, so you can still see this on macOS CI's build step.
Attachment
Thomas Munro <thomas.munro@gmail.com> writes: > I also added one more patch that I expect to be more contentious as it > is a UX change. Why do we display LSNs with a slash? While there's surely little reason to do that anymore, I think the blast radius of such a change will be vastly greater than is warranted by aesthetics. It's not only our code that will be affected --- I'm pretty sure there is a great deal of replication tooling out there that this will break. Don't expect me to defend you from the villagers with pitchforks. regards, tom lane
On 02.03.25 22:08, Thomas Munro wrote: > And one more thing like that: in a couple of places we see warnings on > macOS CI that I'd missed: when printing the result of i64abs() as > PRId64, because it happens to use labs() and it happens to define > int64_t as long long, and when printing a Datum as PRIx64, because > Datum is uintptr_t and it happens to define that as unsigned long. I > suppose we should cast to int64 in the definition of c.h's i64abs() > macro and a couple of similar things, agreed > and cast Datum to uint64 in that > one place that wants to print it out. Since Datum is uintptr_t, it should be printed using the format PRIxPTR. Then it should work out.
On 02.03.25 22:08, Thomas Munro wrote: > On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote: >> On 05.12.24 23:18, Thomas Munro wrote: >>> Old: errmsg("hello %llu", (unsigned long long) x) >>> New: errmsg("hello %" PRIu64, x) >> >> I have committed the subset of this patch for pg_checksums.c so that the >> translators and whoever else might be affected can try this out at small >> scale. (I don't expect any particular problems.) Then we can move on >> to the rest in a few weeks, I think. > > Good plan, thanks. Here's a rebase. I think this went ok, and we can proceed here. I looked through the v2-0001 patch in detail. Most of it is mechanical, so no problems. I couple of issues you already mentioned: - correct placeholder for Datum (uintptr_t) - i64abs() definition needs return cast - I don't think it's proper to assume that pgoff_t or off_t matches int64_t. A few additional comments: - In src/backend/access/transam/xlogreader.c, you change a cast that is part of an arithmetic expression: - ((long long) total_len) - gotlen, + total_len - gotlen, Is this no longer required to keep the math correct? Both total_len and gotlen are uint32. Maybe this was meant to convert to signed arithmetic? - In src/backend/backup/basebackup.c, you change -static long long int total_checksum_failures; +static int64 total_checksum_failures; I don't think it is required, and I don't think it should be encouraged, to expunge all uses of long long int, or something like that. I think you should use long long int for "I need a big counter" and int64 when you want to control the storage layout. Similar to how you might choose int vs. int32. So I would leave this file alone. - In src/bin/pg_verifybackup/astreamer_verify.c, you change the signedness of some arguments, e.g., in member_verify_header(): report_backup_error(mystreamer->context, - "\"%s\" has size %llu in \"%s\" but size %llu in the manifest", + "\"%s\" has size %" PRId64 " in \"%s\" but size %" PRId64 " in the manifest", The first signedness change is correct (member->size is pgoff_t), but the second is not (m->size is uint64). I think it might be better to keep this patch as a mechanical change and fix up the signedness issues separately. (There are actually a few more that were previously hidden by casts but will now show up with something like -Wformat-signedness.) - In src/fe_utils/print.c, there is also a format change in the second hunk, but if we're going to do that one, we should also make the same change in the first hunk. Also, in the first hunk, the second format should be %zu not %zd. - In src/test/modules/libpq_pipeline/libpq_pipeline.c, you're changing the shift base from 1LL (signed) to UINT64_C(1) (unsigned). This appears to be a semantic change separate from this patch? But if this change is desired, then the signedness of the format argument should also be adjusted. About the subsequent pgbench patches: v2-0002: ok v2-0003: Again, I'm not sure insisting on int64 use is needed here, and I don't know that the existing code is incorrect. If we don't like using "long", we could just switch to "long long" here. v2-0004: ok About the LSN format patch, I'm generally sympathetic about this, and I think I've sort of asked for a similar change some years ago, but it's probably not worth pursuing for this release (if ever).
On Mon, Mar 10, 2025 at 10:49 PM Peter Eisentraut <peter@eisentraut.org> wrote: > On 02.03.25 22:08, Thomas Munro wrote: > > Good plan, thanks. Here's a rebase. > > I think this went ok, and we can proceed here. Cool, I'll post a new patch soon, but first a question about this bit: > - I don't think it's proper to assume that pgoff_t or off_t matches int64_t. So we should make pgoff_t a typedef for int64_t everywhere. It's a bit annoying that we have to teach everyone to remember to use PRId64 to print it, though. How reasonable would it be to add an extra filter into whatever script is used to run xgettext on our source tree? It could replace a very small number of agreed useful tokens to match some macros that we would also define in our tree, so that we could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris gettext() know how to translate on the fly when loading message catalogues. I'm not sure how many other candidates there would be, not many (and LSN is obviously an attractive but thorny one...). (For those who missed the reason why: I believe xgettext only treats the <inttypes.h> macros with special portability gloves when used directly, so if you wrapped them in your own macros and did nothing else, you'd get the fully expanded macros as defined on the system that runs xgettext. if I understood correctly. Concretely, if that's a 64 bit glibc system where PRId64 is "ld", the resulting catalogues wouldn't work on a Windows or 32 bit system where sizeof(long) < sizeof(int64_t). You might be able to get away with it if you hijacked those macros as seen by xgettext and made them all "lld" everywhere since that's at least the right size on all known systems, but that'd be a bit gross and not in the spirit of this exercise...)
On 15.03.25 03:42, Thomas Munro wrote: >> - I don't think it's proper to assume that pgoff_t or off_t matches int64_t. > > So we should make pgoff_t a typedef for int64_t everywhere. It's a > bit annoying that we have to teach everyone to remember to use PRId64 > to print it, though. The ramifications of such a change are not clear to me. I thought pgoff_t is supposed to be off_t on Unix systems. If we change that, then how will this affect pointer type arguments, function pointers, etc. This seems to be a much larger problem than what this thread is originally about. I think we should leave pgoff_t the way it is (mostly?) done now: Cast to long long int and print using %lld. > How reasonable would it be to add an extra > filter into whatever script is used to run xgettext on our source > tree? It could replace a very small number of agreed useful tokens to > match some macros that we would also define in our tree, so that we > could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 > and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris > gettext() know how to translate on the fly when loading message > catalogues. This is not really possible. The <PRIxxx> behavior is baked deeply into the gettext code. (Also note that you don't only need support in xgettext, which is part of our build system, but also in the runtime library, which we don't control.)
Peter Eisentraut <peter@eisentraut.org> writes: > This is not really possible. The <PRIxxx> behavior is baked deeply into > the gettext code. (Also note that you don't only need support in > xgettext, which is part of our build system, but also in the runtime > library, which we don't control.) Hmm, I find that comment fairly scary. How do we know that the runtime library actually gets this right on every supported platform? It's surely not because we test it, because we do not. regards, tom lane
On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter@eisentraut.org> writes: > > This is not really possible. The <PRIxxx> behavior is baked deeply into > > the gettext code. (Also note that you don't only need support in > > xgettext, which is part of our build system, but also in the runtime > > library, which we don't control.) > > Hmm, I find that comment fairly scary. How do we know that the > runtime library actually gets this right on every supported platform? > It's surely not because we test it, because we do not. I don't know too much about libintl and its history other than what I've looked up for these discussions, but I can't find any other implementations other than Sun's, GNU's and NetBSD's. Sun/Oracle and NetBSD went out of their way to understand these and other GNUisms. I am not sure if they should even be called "extensions"... extensions to what? I guess the historical answer would have been "Sun's version", but see below for a new development which raises philosophical questions. 1. Solaris -- the original implementation has special support for the things GNU's added, and certainly covers this <inttypes.h> stuff: https://docs.oracle.com/cd/E36784_01/html/E39536/gnkbn.html#ILDEVgnosj I just tried it out on a cfarm Solaris box (well I thought I already knew this from an earlier round of discussions about this but wanted to check again before replying and found my old test program still there...). Note the "<PRId64>" in the catalogue: tmunro@s11-sparc:~/gettext-hacking$ uname -a SunOS s11-sparc.cfarm 5.11 11.4.78.189.2 sun4v sparc sun4v logical-domain tmunro@s11-sparc:~/gettext-hacking$ tail -5 locales/fr/LC_MESSAGES/messages.po #: test.c:8 #, c-format msgid "the answer is %<PRId64>\n" msgstr "la réponse est %<PRId64>\n" tmunro@s11-sparc:~/gettext-hacking$ cat test.c #include <inttypes.h> #include <libintl.h> #include <locale.h> #include <stdio.h> #include <stdlib.h> #define GETTEXT_DOMAIN "messages" #define GETTEXT_OUTPUT_DIR "locales" int main() { setenv("LANGUAGE", "fr", 1); setlocale(LC_ALL, "fr_FR.UTF-8"); bindtextdomain(GETTEXT_DOMAIN, GETTEXT_OUTPUT_DIR); textdomain(GETTEXT_DOMAIN); printf(gettext("the answer is %" PRId64 "\n"), (int64_t) 42); } tmunro@s11-sparc:~/gettext-hacking$ gcc test.c tmunro@s11-sparc:~/gettext-hacking$ ./a.out la réponse est 42 You can also see that stuff in the illumos source tree: https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/i18n/gettext_gnu.c 2. NetBSD -- I haven't try it myself (I can send my test program if you are interested) but it clearly knows about GNU's system-dependent macros, and its stated goal was to be "based on the specifications from GNU gettext": https://wiki.netbsd.org/projects/project/libintl/ https://github.com/NetBSD/src/blob/trunk/lib/libintl/sysdep.c What aspect of that might not work portably? Are there any other implementations I'm missing? What standard would an implementation follow, if it were to exist? POSIX 2024 also finally standardised gettext() and associated tools. I don't see these macros mentioned there (after an admittedly cursory scan of the relevant sections), or for that matter any mention of the portability problem they solve (perhaps I'll write in about that), but it doesn't seem to make any sense to deprive ourselves of features supported by all known implementations that solve a real problem, just because a standard suddenly appeared retroactively rendering them "extensions" in some sense. I mean, GNU is clearly functioning as a of de facto standard of very long standing, which I think the POSIX discussion[1] acknowledged succinctly in the description field "POSIX defines catgets() but most software rather uses gettext()". I don't think I've ever seen catgets() in several decades around C and Unix. (Amusingly the GNU maintainer showed up to say (paraphrasing) "don't do it", and (paraphrasing) "if you want to solve a problem that we actually have why don't you add all the missing _l function so we can write portable multithreaded programs". Hear hear!) [1] https://www.austingroupbugs.net/view.php?id=1122
On Mon, Mar 17, 2025 at 8:03 PM Peter Eisentraut <peter@eisentraut.org> wrote: > On 15.03.25 03:42, Thomas Munro wrote: > > So we should make pgoff_t a typedef for int64_t everywhere. It's a > > bit annoying that we have to teach everyone to remember to use PRId64 > > to print it, though. > > The ramifications of such a change are not clear to me. I thought > pgoff_t is supposed to be off_t on Unix systems. If we change that, > then how will this affect pointer type arguments, function pointers, > etc. This seems to be a much larger problem than what this thread is > originally about. > > I think we should leave pgoff_t the way it is (mostly?) done now: Cast > to long long int and print using %lld. WFM. > > How reasonable would it be to add an extra > > filter into whatever script is used to run xgettext on our source > > tree? It could replace a very small number of agreed useful tokens to > > match some macros that we would also define in our tree, so that we > > could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 > > and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris > > gettext() know how to translate on the fly when loading message > > catalogues. > > This is not really possible. The <PRIxxx> behavior is baked deeply into > the gettext code. (Also note that you don't only need support in > xgettext, which is part of our build system, but also in the runtime > library, which we don't control.) Hmm, but that's why I was asking about filtering the source *before* xgettext sees it, but it sounds like I may still be confused about how that works and I'm very happy to abandon that idea and leave those bits unchanged. Will update the patch shortly to incorporate your other feedback. Thanks!
On Mon, Mar 17, 2025 at 11:52 PM Thomas Munro <thomas.munro@gmail.com> wrote: > tmunro@s11-sparc:~/gettext-hacking$ gcc test.c > tmunro@s11-sparc:~/gettext-hacking$ ./a.out > la réponse est 42 And just to be paranoid, I checked a few more things: the .mo definitely contains the literal "PRId64" (rearranged as "^@PRId64^@the answer is %") and it's definitely using gettext() from libc and not somehow automatically finding a GNU library in some search path. (And woop, this cfarm Sun box has received the new preadv()/pwritev() in its libc, that they added for PostgreSQL.) And since I remembered that I had a NetBSD vagrant VM handy from investigating Champion's libpq troubles the other day: [vagrant@netbsd9 gettext-hacking]$ cc test.c -lintl [vagrant@netbsd9 gettext-hacking]$ ldd a.out a.out: -lintl.1 => /usr/lib/libintl.so.1 -lc.12 => /usr/lib/libc.so.12 [vagrant@netbsd9 gettext-hacking]$ ./a.out la réponse est 42 Not that I had much doubt but I checked that the library is indeed the NetBSD code and not somehow GNU code, based on clearly identifiable strings.
Thomas Munro <thomas.munro@gmail.com> writes: > On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Hmm, I find that comment fairly scary. How do we know that the >> runtime library actually gets this right on every supported platform? > I don't know too much about libintl and its history other than what > I've looked up for these discussions, but I can't find any other > implementations other than Sun's, GNU's and NetBSD's. Sun/Oracle and > NetBSD went out of their way to understand these and other GNUisms. Okay, that reduces the size of the problem considerably. > 2. NetBSD -- I haven't try it myself (I can send my test program if > you are interested) I'd be happy to try it, but I see downthread that you already did, so that seems unnecessary. I still wonder if we shouldn't have more than zero testing of our NLS behavior, but that's just a generalized worry not a concern over any specific feature. regards, tom lane
On 02.03.25 22:08, Thomas Munro wrote: > And one more thing like that: in a couple of places we see warnings on > macOS CI that I'd missed: when printing the result of i64abs() as > PRId64, because it happens to use labs() and it happens to define > int64_t as long long, [...]. I > suppose we should cast to int64 in the definition of c.h's i64abs() > macro I have committed a fix for that.
On 10.03.25 10:49, Peter Eisentraut wrote: > On 02.03.25 22:08, Thomas Munro wrote: >> On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> >> wrote: >>> On 05.12.24 23:18, Thomas Munro wrote: >>>> Old: errmsg("hello %llu", (unsigned long long) x) >>>> New: errmsg("hello %" PRIu64, x) >>> >>> I have committed the subset of this patch for pg_checksums.c so that the >>> translators and whoever else might be affected can try this out at small >>> scale. (I don't expect any particular problems.) Then we can move on >>> to the rest in a few weeks, I think. >> >> Good plan, thanks. Here's a rebase. > > I think this went ok, and we can proceed here. > > I looked through the v2-0001 patch in detail. Most of it is mechanical, > so no problems. I couple of issues you already mentioned: I have committed v2-0001, omitting the parts that I had flagged in my review. I have also committed v2-0002. From my perspective, this can conclude this thread.