Thread: Re: 64 bit numbers vs format strings

Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 05.12.24 23:18, Thomas Munro wrote:
> Having learned some things about gettext based on clues[1] from Peter
> E, I decided to see what it would take to expunge all (long long) and
> similar casts now that we're using the standard types with system
> support.
> 
> The short version is tha given uint64 x:
> 
>      Old: errmsg("hello %llu", (unsigned long long) x)
>      New: errmsg("hello %" PRIu64, x)

I have committed the subset of this patch for pg_checksums.c so that the 
translators and whoever else might be affected can try this out at small 
scale.  (I don't expect any particular problems.)  Then we can move on 
to the rest in a few weeks, I think.




Re: 64 bit numbers vs format strings

From
Thomas Munro
Date:
On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 05.12.24 23:18, Thomas Munro wrote:
> >      Old: errmsg("hello %llu", (unsigned long long) x)
> >      New: errmsg("hello %" PRIu64, x)
>
> I have committed the subset of this patch for pg_checksums.c so that the
> translators and whoever else might be affected can try this out at small
> scale.  (I don't expect any particular problems.)  Then we can move on
> to the rest in a few weeks, I think.

Good plan, thanks.  Here's a rebase.

I also added one more patch that I expect to be more contentious as it
is a UX change.  Why do we display LSNs with a slash?  I believe there
are two reasons: (1) Back in 2000 didn't require the existence of a 64
bit type, so XLogRecPtr was a struct holding two uint32 values. The
author could still have used "%08X%08X" for both printf and scanf if
that was the only reason.  (2) It initially had a real semantic
division into two parts log ID and log offset, which the author
apparently wanted to convey to readers.  I didn't check the full
history but I think at some point our log segments (first 16MB, now
initdb-time variable) superseded the log ID concept, which I think
originally had 4GB segments?  (I could also have had something to do
with the abandoned undo system's needs, IDK.)   That leads to the idea
of ditching the slash and displaying them in the more obvious (to my
aesthetic, anyway, YMMV):

 SELECT pg_lsn(23783416::numeric);
-  pg_lsn
------------
- 0/16AE7F8
+      pg_lsn
+------------------
+ 00000000016AE7F8

And likewise wherever they appear or are parsed in tools, protocols,
command lines etc.

/me activates flame-proof force field

I realised while contemplating that that my treatment of pgoff_t might
not be quite right in the first patch.  It casts off_t (eg from struct
stat) to (pgoff_t) and display as "%" PRId64, which is correct for
Windows where pgoff_t is a typedef to __int64 (actually int64_t would
be more logical, but presumably int64_t is __int64 on that system, not
sure whether that is truly a distinct type according to its native
compiler), but on non-Windows we use the system off_t whose printf
type is unknown to us, and might in theory be a different signed 64
bit type and provoke a warning from GCC/Clang printf attributes.
Perhaps we should define just pgoff_t as int64_t everywhere?  There
are no warnings on any of our CI OSes so I assume that those OSes
coincidentally define off_t the same way they define int64_t.  That
being the case we could just ignore it for now, but another system
using GCC/Clang printf attributes (eg illumos or the other BSDs) might
not happen to agree.  Not done yet.

And one more thing like that: in a couple of places we see warnings on
macOS CI that I'd missed: when printing the result of i64abs() as
PRId64, because it happens to use labs() and it happens to define
int64_t as long long, and when printing a Datum as PRIx64, because
Datum is uintptr_t and it happens to define that as unsigned long.  I
suppose we should cast to int64 in the definition of c.h's i64abs()
macro and a couple of similar things, and cast Datum to uint64 in that
one place that wants to print it out.  Not done yet, so you can still
see this on macOS CI's build step.

Attachment

Re: 64 bit numbers vs format strings

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I also added one more patch that I expect to be more contentious as it
> is a UX change.  Why do we display LSNs with a slash?

While there's surely little reason to do that anymore, I think the
blast radius of such a change will be vastly greater than is warranted
by aesthetics.  It's not only our code that will be affected --- I'm
pretty sure there is a great deal of replication tooling out there
that this will break.  Don't expect me to defend you from the
villagers with pitchforks.

            regards, tom lane



Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 02.03.25 22:08, Thomas Munro wrote:
> And one more thing like that: in a couple of places we see warnings on
> macOS CI that I'd missed: when printing the result of i64abs() as
> PRId64, because it happens to use labs() and it happens to define
> int64_t as long long, and when printing a Datum as PRIx64, because
> Datum is uintptr_t and it happens to define that as unsigned long.  I
> suppose we should cast to int64 in the definition of c.h's i64abs()
> macro and a couple of similar things,

agreed

> and cast Datum to uint64 in that
> one place that wants to print it out.

Since Datum is uintptr_t, it should be printed using the format PRIxPTR. 
  Then it should work out.




Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 02.03.25 22:08, Thomas Munro wrote:
> On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> On 05.12.24 23:18, Thomas Munro wrote:
>>>       Old: errmsg("hello %llu", (unsigned long long) x)
>>>       New: errmsg("hello %" PRIu64, x)
>>
>> I have committed the subset of this patch for pg_checksums.c so that the
>> translators and whoever else might be affected can try this out at small
>> scale.  (I don't expect any particular problems.)  Then we can move on
>> to the rest in a few weeks, I think.
> 
> Good plan, thanks.  Here's a rebase.

I think this went ok, and we can proceed here.

I looked through the v2-0001 patch in detail.  Most of it is mechanical, 
so no problems.  I couple of issues you already mentioned:

- correct placeholder for Datum (uintptr_t)

- i64abs() definition needs return cast

- I don't think it's proper to assume that pgoff_t or off_t matches int64_t.

A few additional comments:

- In src/backend/access/transam/xlogreader.c, you change a cast that is 
part of an arithmetic expression:

-                                     ((long long) total_len) - gotlen,
+                                     total_len - gotlen,

Is this no longer required to keep the math correct?  Both total_len and 
gotlen are uint32.  Maybe this was meant to convert to signed arithmetic?

- In src/backend/backup/basebackup.c, you change

-static long long int total_checksum_failures;
+static int64 total_checksum_failures;

I don't think it is required, and I don't think it should be encouraged, 
to expunge all uses of long long int, or something like that.  I think 
you should use long long int for "I need a big counter" and int64 when 
you want to control the storage layout.  Similar to how you might choose 
int vs. int32.  So I would leave this file alone.

- In src/bin/pg_verifybackup/astreamer_verify.c, you change the 
signedness of some arguments, e.g., in member_verify_header():

         report_backup_error(mystreamer->context,
-                           "\"%s\" has size %llu in \"%s\" but size 
%llu in the manifest",
+                           "\"%s\" has size %" PRId64 " in \"%s\" but 
size %" PRId64 " in the manifest",

The first signedness change is correct (member->size is pgoff_t), but 
the second is not (m->size is uint64).

I think it might be better to keep this patch as a mechanical change and 
fix up the signedness issues separately.  (There are actually a few more 
that were previously hidden by casts but will now show up with something 
like -Wformat-signedness.)

- In src/fe_utils/print.c, there is also a format change in the second 
hunk, but if we're going to do that one, we should also make the same 
change in the first hunk.  Also, in the first hunk, the second format 
should be %zu not %zd.

- In src/test/modules/libpq_pipeline/libpq_pipeline.c, you're changing 
the shift base from 1LL (signed) to UINT64_C(1) (unsigned).  This 
appears to be a semantic change separate from this patch?  But if this 
change is desired, then the signedness of the format argument should 
also be adjusted.


About the subsequent pgbench patches:

v2-0002: ok

v2-0003: Again, I'm not sure insisting on int64 use is needed here, and 
I don't know that the existing code is incorrect.  If we don't like 
using "long", we could just switch to "long long" here.

v2-0004: ok


About the LSN format patch, I'm generally sympathetic about this, and I 
think I've sort of asked for a similar change some years ago, but it's 
probably not worth pursuing for this release (if ever).




Re: 64 bit numbers vs format strings

From
Thomas Munro
Date:
On Mon, Mar 10, 2025 at 10:49 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 02.03.25 22:08, Thomas Munro wrote:
> > Good plan, thanks.  Here's a rebase.
>
> I think this went ok, and we can proceed here.

Cool, I'll post a new patch soon, but first a question about this bit:

> - I don't think it's proper to assume that pgoff_t or off_t matches int64_t.

So we should make pgoff_t a typedef for int64_t everywhere.  It's a
bit annoying that we have to teach everyone to remember to use PRId64
to print it, though.  How reasonable would it be to add an extra
filter into whatever script is used to run xgettext on our source
tree?  It could replace a very small number of agreed useful tokens to
match some macros that we would also define in our tree, so that we
could write PRI_PGOFF_T in our messages, but xgettext would see PRId64
and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris
gettext() know how to translate on the fly when loading message
catalogues.  I'm not sure how many other candidates there would be,
not many (and LSN is obviously an attractive but thorny one...).

(For those who missed the reason why: I believe xgettext only treats
the <inttypes.h> macros with special portability gloves when used
directly, so if you wrapped them in your own macros and did nothing
else, you'd get the fully expanded macros as defined on the system
that runs xgettext. if I understood correctly.  Concretely, if that's
a 64 bit glibc system where PRId64 is "ld", the resulting catalogues
wouldn't work on a Windows or 32 bit system where sizeof(long) <
sizeof(int64_t).   You might be able to get away with it if you
hijacked those macros as seen by xgettext and made them all "lld"
everywhere since that's at least the right size on all known systems,
but that'd be a bit gross and not in the spirit of this exercise...)



Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 15.03.25 03:42, Thomas Munro wrote:
>> - I don't think it's proper to assume that pgoff_t or off_t matches int64_t.
> 
> So we should make pgoff_t a typedef for int64_t everywhere.  It's a
> bit annoying that we have to teach everyone to remember to use PRId64
> to print it, though.

The ramifications of such a change are not clear to me.  I thought 
pgoff_t is supposed to be off_t on Unix systems.  If we change that, 
then how will this affect pointer type arguments, function pointers, 
etc.  This seems to be a much larger problem than what this thread is 
originally about.

I think we should leave pgoff_t the way it is (mostly?) done now: Cast 
to long long int and print using %lld.

> How reasonable would it be to add an extra
> filter into whatever script is used to run xgettext on our source
> tree?  It could replace a very small number of agreed useful tokens to
> match some macros that we would also define in our tree, so that we
> could write PRI_PGOFF_T in our messages, but xgettext would see PRId64
> and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris
> gettext() know how to translate on the fly when loading message
> catalogues.

This is not really possible.  The <PRIxxx> behavior is baked deeply into 
the gettext code.  (Also note that you don't only need support in 
xgettext, which is part of our build system, but also in the runtime 
library, which we don't control.)




Re: 64 bit numbers vs format strings

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> This is not really possible.  The <PRIxxx> behavior is baked deeply into 
> the gettext code.  (Also note that you don't only need support in 
> xgettext, which is part of our build system, but also in the runtime 
> library, which we don't control.)

Hmm, I find that comment fairly scary.  How do we know that the
runtime library actually gets this right on every supported platform?
It's surely not because we test it, because we do not.

            regards, tom lane



Re: 64 bit numbers vs format strings

From
Thomas Munro
Date:
On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
> > This is not really possible.  The <PRIxxx> behavior is baked deeply into
> > the gettext code.  (Also note that you don't only need support in
> > xgettext, which is part of our build system, but also in the runtime
> > library, which we don't control.)
>
> Hmm, I find that comment fairly scary.  How do we know that the
> runtime library actually gets this right on every supported platform?
> It's surely not because we test it, because we do not.

I don't know too much about libintl and its history other than what
I've looked up for these discussions, but I can't find any other
implementations other than Sun's, GNU's and NetBSD's.  Sun/Oracle and
NetBSD went out of their way to understand these and other GNUisms.  I
am not sure if they should even be called "extensions"...  extensions
to what?  I guess the historical answer would have been "Sun's
version", but see below for a new development which raises
philosophical questions.

1.  Solaris -- the original implementation has special support for the
things GNU's added, and certainly covers this <inttypes.h> stuff:

https://docs.oracle.com/cd/E36784_01/html/E39536/gnkbn.html#ILDEVgnosj

I just tried it out on a cfarm Solaris box (well I thought I already
knew this from an earlier round of discussions about this but wanted
to check again before replying and found my old test program still
there...).  Note the "<PRId64>" in the catalogue:

tmunro@s11-sparc:~/gettext-hacking$ uname -a
SunOS s11-sparc.cfarm 5.11 11.4.78.189.2 sun4v sparc sun4v logical-domain
tmunro@s11-sparc:~/gettext-hacking$ tail -5 locales/fr/LC_MESSAGES/messages.po

#: test.c:8
#, c-format
msgid "the answer is %<PRId64>\n"
msgstr "la réponse est %<PRId64>\n"
tmunro@s11-sparc:~/gettext-hacking$ cat test.c
#include <inttypes.h>
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

#define GETTEXT_DOMAIN "messages"
#define GETTEXT_OUTPUT_DIR "locales"

int
main()
{
        setenv("LANGUAGE", "fr", 1);
        setlocale(LC_ALL, "fr_FR.UTF-8");
    bindtextdomain(GETTEXT_DOMAIN, GETTEXT_OUTPUT_DIR);
    textdomain(GETTEXT_DOMAIN);
        printf(gettext("the answer is %" PRId64 "\n"), (int64_t) 42);
}
tmunro@s11-sparc:~/gettext-hacking$ gcc test.c
tmunro@s11-sparc:~/gettext-hacking$ ./a.out
la réponse est 42

You can also see that stuff in the illumos source tree:

https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/i18n/gettext_gnu.c

2.  NetBSD -- I haven't try it myself (I can send my test program if
you are interested) but it clearly knows about GNU's system-dependent
macros, and its stated goal was to be "based on the specifications
from GNU gettext":

https://wiki.netbsd.org/projects/project/libintl/
https://github.com/NetBSD/src/blob/trunk/lib/libintl/sysdep.c

What aspect of that might not work portably?  Are there any other
implementations I'm missing?  What standard would an implementation
follow, if it were to exist?

POSIX 2024 also finally standardised gettext() and associated tools.
I don't see these macros mentioned there (after an admittedly cursory
scan of the relevant sections), or for that matter any mention of the
portability problem they solve (perhaps I'll write in about that), but
it doesn't seem to make any sense to deprive ourselves of features
supported by all known implementations that solve a real problem, just
because a standard suddenly appeared retroactively rendering them
"extensions" in some sense.  I mean, GNU is clearly functioning as a
of de facto standard of very long standing, which I think the POSIX
discussion[1] acknowledged succinctly in the description field "POSIX
defines catgets() but most software rather uses gettext()".  I don't
think I've ever seen catgets() in several decades around C and Unix.

(Amusingly the GNU maintainer showed up to say (paraphrasing) "don't
do it", and (paraphrasing) "if you want to solve a problem that we
actually have why don't you add all the missing _l function so we can
write portable multithreaded programs".  Hear hear!)

[1] https://www.austingroupbugs.net/view.php?id=1122



Re: 64 bit numbers vs format strings

From
Thomas Munro
Date:
On Mon, Mar 17, 2025 at 8:03 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 15.03.25 03:42, Thomas Munro wrote:
> > So we should make pgoff_t a typedef for int64_t everywhere.  It's a
> > bit annoying that we have to teach everyone to remember to use PRId64
> > to print it, though.
>
> The ramifications of such a change are not clear to me.  I thought
> pgoff_t is supposed to be off_t on Unix systems.  If we change that,
> then how will this affect pointer type arguments, function pointers,
> etc.  This seems to be a much larger problem than what this thread is
> originally about.
>
> I think we should leave pgoff_t the way it is (mostly?) done now: Cast
> to long long int and print using %lld.

WFM.

> > How reasonable would it be to add an extra
> > filter into whatever script is used to run xgettext on our source
> > tree?  It could replace a very small number of agreed useful tokens to
> > match some macros that we would also define in our tree, so that we
> > could write PRI_PGOFF_T in our messages, but xgettext would see PRId64
> > and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris
> > gettext() know how to translate on the fly when loading message
> > catalogues.
>
> This is not really possible.  The <PRIxxx> behavior is baked deeply into
> the gettext code.  (Also note that you don't only need support in
> xgettext, which is part of our build system, but also in the runtime
> library, which we don't control.)

Hmm, but that's why I was asking about filtering the source *before*
xgettext sees it, but it sounds like I may still be confused about how
that works and I'm very happy to abandon that idea and leave those
bits unchanged.  Will update the patch shortly to incorporate your
other feedback.

Thanks!



Re: 64 bit numbers vs format strings

From
Thomas Munro
Date:
On Mon, Mar 17, 2025 at 11:52 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> tmunro@s11-sparc:~/gettext-hacking$ gcc test.c
> tmunro@s11-sparc:~/gettext-hacking$ ./a.out
> la réponse est 42

And just to be paranoid, I checked a few more things: the .mo
definitely contains the literal "PRId64"  (rearranged as
"^@PRId64^@the answer is %") and it's definitely using gettext() from
libc and not somehow automatically finding a GNU library in some
search path.  (And woop, this cfarm Sun box has received the new
preadv()/pwritev() in its libc, that they added for PostgreSQL.)

And since I remembered that I had a NetBSD vagrant VM handy from
investigating Champion's libpq troubles the other day:

[vagrant@netbsd9 gettext-hacking]$ cc test.c -lintl
[vagrant@netbsd9 gettext-hacking]$ ldd a.out
a.out:
        -lintl.1 => /usr/lib/libintl.so.1
        -lc.12 => /usr/lib/libc.so.12
[vagrant@netbsd9 gettext-hacking]$ ./a.out
la réponse est 42

Not that I had much doubt but I checked that the library is indeed the
NetBSD code and not somehow GNU code, based on clearly identifiable
strings.



Re: 64 bit numbers vs format strings

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, I find that comment fairly scary.  How do we know that the
>> runtime library actually gets this right on every supported platform?

> I don't know too much about libintl and its history other than what
> I've looked up for these discussions, but I can't find any other
> implementations other than Sun's, GNU's and NetBSD's.  Sun/Oracle and
> NetBSD went out of their way to understand these and other GNUisms.

Okay, that reduces the size of the problem considerably.

> 2.  NetBSD -- I haven't try it myself (I can send my test program if
> you are interested)

I'd be happy to try it, but I see downthread that you already did,
so that seems unnecessary.

I still wonder if we shouldn't have more than zero testing of our
NLS behavior, but that's just a generalized worry not a concern
over any specific feature.

            regards, tom lane



Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 02.03.25 22:08, Thomas Munro wrote:
> And one more thing like that: in a couple of places we see warnings on
> macOS CI that I'd missed: when printing the result of i64abs() as
> PRId64, because it happens to use labs() and it happens to define
> int64_t as long long, [...].  I
> suppose we should cast to int64 in the definition of c.h's i64abs()
> macro

I have committed a fix for that.




Re: 64 bit numbers vs format strings

From
Peter Eisentraut
Date:
On 10.03.25 10:49, Peter Eisentraut wrote:
> On 02.03.25 22:08, Thomas Munro wrote:
>> On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> 
>> wrote:
>>> On 05.12.24 23:18, Thomas Munro wrote:
>>>>       Old: errmsg("hello %llu", (unsigned long long) x)
>>>>       New: errmsg("hello %" PRIu64, x)
>>>
>>> I have committed the subset of this patch for pg_checksums.c so that the
>>> translators and whoever else might be affected can try this out at small
>>> scale.  (I don't expect any particular problems.)  Then we can move on
>>> to the rest in a few weeks, I think.
>>
>> Good plan, thanks.  Here's a rebase.
> 
> I think this went ok, and we can proceed here.
> 
> I looked through the v2-0001 patch in detail.  Most of it is mechanical, 
> so no problems.  I couple of issues you already mentioned:

I have committed v2-0001, omitting the parts that I had flagged in my 
review.  I have also committed v2-0002.  From my perspective, this can 
conclude this thread.