Thread: Remove pg_strtouint64(), use strtoull() directly

Remove pg_strtouint64(), use strtoull() directly

From
Peter Eisentraut
Date:
pg_strtouint64() is a wrapper around strtoull/strtoul/_strtoui64, but it 
seems no longer necessary to have this indirection.

msvc/Solution.pm claims HAVE_STRTOULL, so the "MSVC only" part seems 
unnecessary.  Also, we have code in c.h to substitute alternatives for 
strtoull() if not found, and that would appear to cover all currently 
supported platforms, so having a further fallback in pg_strtouint64() 
seems unnecessary.

(AFAICT, the only buildfarm member that does not have strtoull() 
directly but relies on the code in c.h is gaur.  So we can hang on to 
that code for a while longer, but its utility is also fading away.)

Therefore, remove pg_strtouint64(), and use strtoull() directly in all 
call sites.

(This is also useful because we have pg_strtointNN() functions that have 
a different API than this pg_strtouintNN().  So removing the latter 
makes this problem go away.)
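
For readers who want to see what "use strtoull() directly" looks like at a call site, here is a minimal sketch. The helper name parse_u64() is hypothetical and is not taken from the patch or from the PostgreSQL tree; it only illustrates the usual errno and end-pointer checks around strtoull().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not PostgreSQL code): parse a base-10 unsigned
 * value into a uint64_t by calling strtoull() directly.
 *
 * Note: strtoull() itself skips leading whitespace and accepts a sign;
 * this sketch does not add stricter input validation on top of that.
 */
static bool
parse_u64(const char *str, uint64_t *result)
{
    char       *end;
    unsigned long long val;

    errno = 0;
    val = strtoull(str, &end, 10);

    /* No digits consumed, or trailing garbage after the number. */
    if (end == str || *end != '\0')
        return false;

    /* Value did not fit in unsigned long long. */
    if (errno == ERANGE)
        return false;

    /* Only matters if unsigned long long is wider than 64 bits. */
    if (val > UINT64_MAX)
        return false;

    *result = (uint64_t) val;
    return true;
}
```

The last range check is redundant wherever unsigned long long is exactly 64 bits wide, which is the assumption the proposal relies on.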

Re: Remove pg_strtouint64(), use strtoull() directly

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> Therefore, remove pg_strtouint64(), and use strtoull() directly in all 
> call sites.

Our experience with the variable size of "long" has left a sufficiently
bad taste in my mouth that I'm not enthused about adding hard-wired
assumptions that "long long" is identical to int64.  So this seems like
it's going in the wrong direction, and giving up portability that we
might want back someday.

I'd be okay with making pg_strtouint64 into a really thin wrapper
(ie a macro, at least on most platforms).  But please let's not
give up the notational distinction.

            regards, tom lane



Re: Remove pg_strtouint64(), use strtoull() directly

From
Peter Eisentraut
Date:
On 10.12.21 16:25, Tom Lane wrote:
> Our experience with the variable size of "long" has left a sufficiently
> bad taste in my mouth that I'm not enthused about adding hard-wired
> assumptions that "long long" is identical to int64.  So this seems like
> it's going in the wrong direction, and giving up portability that we
> might want back someday.

What kind of scenario do you have in mind?  Someone making their long 
long int 128 bits?




Re: Remove pg_strtouint64(), use strtoull() directly

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 10.12.21 16:25, Tom Lane wrote:
>> Our experience with the variable size of "long" has left a sufficiently
>> bad taste in my mouth that I'm not enthused about adding hard-wired
>> assumptions that "long long" is identical to int64.  So this seems like
>> it's going in the wrong direction, and giving up portability that we
>> might want back someday.

> What kind of scenario do you have in mind?  Someone making their long 
> long int 128 bits?

Yeah, exactly.  That seems like a natural evolution:
    short -> 2 bytes
    int -> 4 bytes
    long -> 8 bytes
    long long -> 16 bytes
so I'm surprised that vendors haven't done that already instead
of inventing hacks like __int128.

Our current hard-coded uses of long long are all written on the
assumption that it's *at least* 64 bits, so we'd survive OK on
such a platform so long as we don't start confusing it with
*exactly* 64 bits.

            regards, tom lane
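
The "at least 64 bits" versus "exactly 64 bits" distinction can be illustrated with a small sketch. It assumes a C11 compiler for _Static_assert, and the helper name format_u64() is hypothetical. Widening a uint64_t to unsigned long long before printing stays correct even if long long were 128 bits, whereas code that treated the two types as interchangeable (for example by casting pointers between them) would not.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* The C standard guarantees this much, and no more. */
_Static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
               "unsigned long long must be at least 64 bits wide");

/*
 * Hypothetical helper: format a uint64_t using "%llu".
 *
 * The cast is a value-preserving widening conversion, so it relies only
 * on unsigned long long being AT LEAST 64 bits wide.
 */
static int
format_u64(char *buf, size_t buflen, uint64_t value)
{
    return snprintf(buf, buflen, "%llu", (unsigned long long) value);
}
```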



Re: Remove pg_strtouint64(), use strtoull() directly

From
Robert Haas
Date:
On Mon, Dec 13, 2021 at 9:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, exactly.  That seems like a natural evolution:
>         short -> 2 bytes
>         int -> 4 bytes
>         long -> 8 bytes
>         long long -> 16 bytes
> so I'm surprised that vendors haven't done that already instead
> of inventing hacks like __int128.

I really am glad they haven't. I think it's super-annoying that we
need hacks like UINT64_FORMAT all over the place. I think it was a
mistake not to nail down the size that each type is expected to be in
the original C standard, and making more changes to the conventions
now would cause a whole bunch of unnecessary code churn, probably for
almost everybody using C. It's not like people are writing high-level
applications in C these days; it's all low-level stuff that is likely
to care about the width of a word. It seems much more sensible to
standardize on names for words of all lengths in the standard than to
do anything else. I don't really care whether the standard chooses
int128, int256, int512, etc. or long long long, long long long long,
etc. or reallylong, superlong, incrediblylong, etc. but I hope they
define new stuff instead of encouraging implementations to redefine
what's there already.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Remove pg_strtouint64(), use strtoull() directly

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I really am glad they haven't. I think it's super-annoying that we
> need hacks like UINT64_FORMAT all over the place. I think it was a
> mistake not to nail down the size that each type is expected to be in
> the original C standard,

Well, mumble.  One must remember that when C was designed, there was
a LOT more variability in hardware designs than we see today.  They
could not have put a language with fixed ideas about datatype widths
onto, say, PDP-10s (36-bit words) or Crays (60-bit, IIRC).  But it
is a darn shame that people weren't more consistent about mapping
the C types onto machines with S/360-like addressing.

> and making more changes to the conventions
> now would cause a whole bunch of unnecessary code churn, probably for
> almost everybody using C.

The error in your thinking is believing that there *is* a convention.
There isn't; see "long".

Anyway, my point is that we have created a set of type names that
have the semantics we want, and we should avoid confusing those with
underlying C types that are *not* guaranteed to be the same thing.

            regards, tom lane



Re: Remove pg_strtouint64(), use strtoull() directly

From
Robert Haas
Date:
On Mon, Dec 13, 2021 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I really am glad they haven't. I think it's super-annoying that we
> > need hacks like UINT64_FORMAT all over the place. I think it was a
> > mistake not to nail down the size that each type is expected to be in
> > the original C standard,
>
> Well, mumble.  One must remember that when C was designed, there was
> a LOT more variability in hardware designs than we see today.  They
> could not have put a language with fixed ideas about datatype widths
> onto, say, PDP-10s (36-bit words) or Crays (60-bit, IIRC).  But it
> is a darn shame that people weren't more consistent about mapping
> the C types onto machines with S/360-like addressing.

Sure.

> > and making more changes to the conventions
> > now would cause a whole bunch of unnecessary code churn, probably for
> > almost everybody using C.
>
> The error in your thinking is believing that there *is* a convention.
> There isn't; see "long".

I mean I pretty much pointed out exactly that thing with my mention of
UINT64_FORMAT, so I'm not sure why you're making it seem like I didn't
know that.

> Anyway, my point is that we have created a set of type names that
> have the semantics we want, and we should avoid confusing those with
> underlying C types that are *not* guaranteed to be the same thing.

I agree entirely, but it's still an annoyance when dealing with printf
format codes and other operating-system defined types whose width we
don't know. Standardization here makes it easier to write good code;
different conventions make it harder. I'm guessing that other people
have noticed that too, and that's why we're getting stuff like
__int128 instead of redefining long long.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
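
As an illustration of the printf format-code annoyance mentioned above: standard C (C99 and later) offers the PRIu64 macro in <inttypes.h>, which plays a role similar to PostgreSQL's UINT64_FORMAT. The sketch below uses only the standard macro; the helper name format_u64_std() is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a uint64_t without knowing which
 * underlying C type it maps to.  PRIu64 expands to the conversion
 * specifier matching uint64_t on the current platform, and adjacent
 * string literals are concatenated by the compiler.
 */
static int
format_u64_std(char *buf, size_t buflen, uint64_t value)
{
    return snprintf(buf, buflen, "value=%" PRIu64, value);
}
```

The format string has to be assembled from pieces in this way because the correct length modifier differs between platforms, which is exactly the inconvenience being described.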



Re: Remove pg_strtouint64(), use strtoull() directly

From
Peter Eisentraut
Date:
On 13.12.21 15:44, Tom Lane wrote:
> Our current hard-coded uses of long long are all written on the
> assumption that it's *at least* 64 bits, so we'd survive OK on
> such a platform so long as we don't start confusing it with
> *exactly* 64 bits.

OK, makes sense.  Here is an alternative patch.  It introduces two 
light-weight macros strtoi64() and strtou64() (compare e.g., strtoimax() 
in POSIX) in c.h and removes pg_strtouint64().  This moves the 
portability layer from numutils.c to c.h, so it's closer to the rest of 
the int64 portability code.  And that way it is available to not just 
server code.  And it resolves the namespace collision with the 
pg_strtointNN() functions in numutils.c.
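
The attached patch is not reproduced here. As a rough sketch of what such light-weight macros could look like, the following picks the underlying C function based on the width of long. The LONG_MAX test stands in for whatever configure-time symbol the real code uses, and int64_t/uint64_t stand in for PostgreSQL's int64/uint64; both substitutions are assumptions made so that the example is self-contained.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Sketch only: map the 64-bit parsing names onto whichever standard
 * function returns a 64-bit result on this platform.
 */
#if LONG_MAX == INT64_MAX
/* long is 64 bits wide: strtol()/strtoul() are sufficient. */
#define strtoi64(str, endptr, base) ((int64_t) strtol(str, endptr, base))
#define strtou64(str, endptr, base) ((uint64_t) strtoul(str, endptr, base))
#else
/* Otherwise fall back to long long, which is at least 64 bits wide. */
#define strtoi64(str, endptr, base) ((int64_t) strtoll(str, endptr, base))
#define strtou64(str, endptr, base) ((uint64_t) strtoull(str, endptr, base))
#endif
```

Call sites then keep the int64-flavored name, so the notational distinction survives while the wrapper costs nothing at run time.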

Re: Remove pg_strtouint64(), use strtoull() directly

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> OK, makes sense.  Here is an alternative patch.  It introduces two 
> light-weight macros strtoi64() and strtou64() (compare e.g., strtoimax() 
> in POSIX) in c.h and removes pg_strtouint64().  This moves the 
> portability layer from numutils.c to c.h, so it's closer to the rest of 
> the int64 portability code.  And that way it is available to not just 
> server code.  And it resolves the namespace collision with the 
> pg_strtointNN() functions in numutils.c.

Works for me.  I'm not in a position to verify that this'll work
on Windows, but the buildfarm will tell us that quickly enough.

            regards, tom lane