Thread: Remove pg_strtouint64(), use strtoull() directly
pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but it seems no longer necessary to have this indirection.

msvc/Solution.pm claims HAVE_STRTOULL, so the "MSVC only" part seems unnecessary. Also, we have code in c.h to substitute alternatives for strtoull() if not found, and that would appear to cover all currently supported platforms, so having a further fallback in pg_strtouint64() seems unnecessary.

(AFAICT, the only buildfarm member that does not have strtoull() directly but relies on the code in c.h is gaur. So we can hang on to that code for a while longer, but its utility is also fading away.)

Therefore, remove pg_strtouint64(), and use strtoull() directly in all call sites.

(This is also useful because we have pg_strtointNN() functions that have a different API than this pg_strtouintNN(). So removing the latter makes this problem go away.)
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> Therefore, remove pg_strtouint64(), and use strtoull() directly in all
> call sites.

Our experience with the variable size of "long" has left a sufficiently bad taste in my mouth that I'm not enthused about adding hard-wired assumptions that "long long" is identical to int64. So this seems like it's going in the wrong direction, and giving up portability that we might want back someday.

I'd be okay with making pg_strtouint64 into a really thin wrapper (ie a macro, at least on most platforms). But please let's not give up the notational distinction.

regards, tom lane
On 10.12.21 16:25, Tom Lane wrote:
> Our experience with the variable size of "long" has left a sufficiently
> bad taste in my mouth that I'm not enthused about adding hard-wired
> assumptions that "long long" is identical to int64. So this seems like
> it's going in the wrong direction, and giving up portability that we
> might want back someday.

What kind of scenario do you have in mind? Someone making their long long int 128 bits?
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 10.12.21 16:25, Tom Lane wrote:
>> Our experience with the variable size of "long" has left a sufficiently
>> bad taste in my mouth that I'm not enthused about adding hard-wired
>> assumptions that "long long" is identical to int64. So this seems like
>> it's going in the wrong direction, and giving up portability that we
>> might want back someday.

> What kind of scenario do you have in mind? Someone making their long
> long int 128 bits?

Yeah, exactly. That seems like a natural evolution:

    short -> 2 bytes
    int -> 4 bytes
    long -> 8 bytes
    long long -> 16 bytes

so I'm surprised that vendors haven't done that already instead of inventing hacks like __int128.

Our current hard-coded uses of long long are all written on the assumption that it's *at least* 64 bits, so we'd survive OK on such a platform so long as we don't start confusing it with *exactly* 64 bits.

regards, tom lane
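The "at least 64 bits" versus "exactly 64 bits" distinction can be illustrated with a small sketch. This is not code from the patch or from PostgreSQL; the helper name ull_fits_uint64() is hypothetical, and uint64_t from <stdint.h> stands in for PostgreSQL's uint64.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * The C standard (C99 and later) only promises that long long has
 * at least 64 bits, so this check holds on every conforming platform.
 */
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "long long must be at least 64 bits wide");

/*
 * Hypothetical helper: narrow an unsigned long long to exactly 64 bits.
 * Returns 1 if the value fit, 0 if it was truncated.  Where long long is
 * exactly 64 bits the comparison is always true; on an imagined platform
 * with a 128-bit long long it would catch values that do not fit.
 */
static int
ull_fits_uint64(unsigned long long v, uint64_t *out)
{
    *out = (uint64_t) v;
    return v <= UINT64_MAX;
}
```

Code that stores the result of strtoull() straight into a 64-bit variable skips this kind of check, which is harmless only while the two widths happen to coincide.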
On Mon, Dec 13, 2021 at 9:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, exactly. That seems like a natural evolution:
> short -> 2 bytes
> int -> 4 bytes
> long -> 8 bytes
> long long -> 16 bytes
> so I'm surprised that vendors haven't done that already instead
> of inventing hacks like __int128.

I really am glad they haven't. I think it's super-annoying that we need hacks like UINT64_FORMAT all over the place. I think it was a mistake not to nail down the size that each type is expected to be in the original C standard, and making more changes to the conventions now would cause a whole bunch of unnecessary code churn, probably for almost everybody using C. It's not like people are writing high-level applications in C these days; it's all low-level stuff that is likely to care about the width of a word. It seems much more sensible to standardize on names for words of all lengths in the standard than to do anything else. I don't really care whether the standard chooses int128, int256, int512, etc. or long long long, long long long long, etc. or reallylong, superlong, incrediblylong, etc. but I hope they define new stuff instead of encouraging implementations to redefine what's there already.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> I really am glad they haven't. I think it's super-annoying that we
> need hacks like UINT64_FORMAT all over the place. I think it was a
> mistake not to nail down the size that each type is expected to be in
> the original C standard,

Well, mumble. One must remember that when C was designed, there was a LOT more variability in hardware designs than we see today. They could not have put a language with fixed ideas about datatype widths onto, say, PDP-10s (36-bit words) or Crays (60-bit, IIRC). But it is a darn shame that people weren't more consistent about mapping the C types onto machines with S/360-like addressing.

> and making more changes to the conventions
> now would cause a whole bunch of unnecessary code churn, probably for
> almost everybody using C.

The error in your thinking is believing that there *is* a convention. There isn't; see "long".

Anyway, my point is that we have created a set of type names that have the semantics we want, and we should avoid confusing those with underlying C types that are *not* guaranteed to be the same thing.

regards, tom lane
On Mon, Dec 13, 2021 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I really am glad they haven't. I think it's super-annoying that we
> > need hacks like UINT64_FORMAT all over the place. I think it was a
> > mistake not to nail down the size that each type is expected to be in
> > the original C standard,
>
> Well, mumble. One must remember that when C was designed, there was
> a LOT more variability in hardware designs than we see today. They
> could not have put a language with fixed ideas about datatype widths
> onto, say, PDP-10s (36-bit words) or Crays (60-bit, IIRC). But it
> is a darn shame that people weren't more consistent about mapping
> the C types onto machines with S/360-like addressing.

Sure.

> > and making more changes to the conventions
> > now would cause a whole bunch of unnecessary code churn, probably for
> > almost everybody using C.
>
> The error in your thinking is believing that there *is* a convention.
> There isn't; see "long".

I mean I pretty much pointed out exactly that thing with my mention of UINT64_FORMAT, so I'm not sure why you're making it seem like I didn't know that.

> Anyway, my point is that we have created a set of type names that
> have the semantics we want, and we should avoid confusing those with
> underlying C types that are *not* guaranteed to be the same thing.

I agree entirely, but it's still an annoyance when dealing with printf format codes and other operating-system defined types whose width we don't know. Standardization here makes it easier to write good code; different conventions make it harder. I'm guessing that other people have noticed that too, and that's why we're getting stuff like __int128 instead of redefining long long.

--
Robert Haas
EDB: http://www.enterprisedb.com
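For readers unfamiliar with the printf annoyance mentioned above, here is a minimal sketch of the same pattern using the standard C99 macro PRIu64 from <inttypes.h>. It is only an analogy to UINT64_FORMAT, not PostgreSQL's definition of it, and format_u64() is a hypothetical helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: print a 64-bit value without knowing whether the
 * platform's 64-bit type is "long" or "long long".  PRIu64 expands to the
 * right length modifier ("lu", "llu", ...) for uint64_t on this platform.
 * Returns what snprintf() returns: the length of the formatted text.
 */
static int
format_u64(char *buf, size_t len, uint64_t v)
{
    assert(buf != NULL);
    return snprintf(buf, len, "value=%" PRIu64, v);
}
```

The format string has to be spliced together from a macro because no single conversion specifier is correct for a 64-bit integer on every platform.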
On 13.12.21 15:44, Tom Lane wrote:
> Our current hard-coded uses of long long are all written on the
> assumption that it's *at least* 64 bits, so we'd survive OK on
> such a platform so long as we don't start confusing it with
> *exactly* 64 bits.

OK, makes sense. Here is an alternative patch. It introduces two light-weight macros strtoi64() and strtou64() (compare e.g., strtoimax() in POSIX) in c.h and removes pg_strtouint64(). This moves the portability layer from numutils.c to c.h, so it's closer to the rest of the int64 portability code. And that way it is available to not just server code. And it resolves the namespace collision with the pg_strtointNN() functions in numutils.c.
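A rough sketch of what such macros could look like follows. This is an illustration under stated assumptions, not the contents of the attached patch: it tests LONG_MAX directly instead of a configure-derived symbol, uses int64_t/uint64_t in place of PostgreSQL's int64/uint64, and parse_u64() is a hypothetical caller.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Pick the libc function whose native result type is 64 bits wide, and
 * cast so the result has the exact-width type regardless of platform.
 * (Assumption: the real patch may select the branch differently.)
 */
#if LONG_MAX == INT64_MAX
#define strtoi64(str, endptr, base) ((int64_t) strtol(str, endptr, base))
#define strtou64(str, endptr, base) ((uint64_t) strtoul(str, endptr, base))
#else
#define strtoi64(str, endptr, base) ((int64_t) strtoll(str, endptr, base))
#define strtou64(str, endptr, base) ((uint64_t) strtoull(str, endptr, base))
#endif

/* Hypothetical caller: parse a decimal string as an unsigned 64-bit value. */
static uint64_t
parse_u64(const char *s)
{
    assert(s != NULL);
    return strtou64(s, NULL, 10);
}
```

Call sites keep a name that says "64 bits", while the choice between strtoul() and strtoull() stays in one place.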
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> OK, makes sense. Here is an alternative patch. It introduces two
> light-weight macros strtoi64() and strtou64() (compare e.g., strtoimax()
> in POSIX) in c.h and removes pg_strtouint64(). This moves the
> portability layer from numutils.c to c.h, so it's closer to the rest of
> the int64 portability code. And that way it is available to not just
> server code. And it resolves the namespace collision with the
> pg_strtointNN() functions in numutils.c.

Works for me. I'm not in a position to verify that this'll work on Windows, but the buildfarm will tell us that quickly enough.

regards, tom lane