Re: refactoring - share str2*int64 functions - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: refactoring - share str2*int64 functions
Date
Msg-id alpine.DEB.2.21.1908290705310.28828@lancre
Whole thread Raw
In response to Re: refactoring - share str2*int64 functions  (Michael Paquier <michael@paquier.xyz>)
Responses Re: refactoring - share str2*int64 functions
List pgsql-hackers
Bonjour Michaël,

>>  - *ptr && WHATEVER(*ptr)
>>   *ptr is redundant, WHATEVER yields false on '\0', and it costs on each
>>   char but at the end. It might be debatable in some places, e.g. it is
>>   likely that there are no spaces in the string, but likely that there are
>>   more than one digit.
>
> Still this makes the checks less robust?

I do not see any downside, but maybe I lack imagination.

On my laptop isWHATEVER is implementd through an array mapping characters 
to a bitfield saying whether each char is WHATEVER, depending on the bit. 
This array is well defined for index 0 ('\0').

If an implementation is based on comparisons, for isdigit it would be:

    c >= '0' && c <= '9'

Then checking c != '\0' is redundant with c >= '0'.

Given the way the character checks are used in the function, we do not go 
beyond the end of the string because we only proceed further when a 
character is actually recognized, else we return.

So I cannot see any robustness issue, just a few cycles saved.

>> Hmmm. Have you looked at the fallback cases when the corresponding builtins
>> are not available?
>>
>> I'm unsure of a reliable way to detect a generic unsigned int overflow
>> without expensive dividing back and having to care about zero…
>
> Mr Freund has mentioned that here:
> https://www.postgresql.org/message-id/20190717184820.iqz7schxdbucmdmu@alap3.anarazel.de

Yep, that is what I mean by expensive: there is an integer division, which 
is avoided if b is known to be 10, hence is not zero and the limit value 
can be checked directly on the input without having to perform a division 
each time.

>> So I was pretty happy with my two discreet, small and efficient tests.
>
> That's also a matter of code and interface consistency IMHO.

Possibly.

ISTM that part of the motivation is to reduce costs for heavily used 
conversions, eg on COPY. Function "scanf" is overly expensive because it 
has to interpret its format, and we have to check for overflows…

Anyway, if we assume that the builtins exist and rely on efficient 
hardware check, maybe we do not care about the fallback cases which would 
just be slow but never executed.

Note that all this is moot, as all instances of string to uint64 
conversion do not check for errors.

Attached v7 does create uint64 overflow inline functions. The stuff yet is 
not moved to "common/int.c", a file which does not exists, even if 
"common/int.h" does.

-- 
Fabien.
Attachment

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: doc: update PL/pgSQL sample loop function
Next
From: Masahiko Sawada
Date:
Subject: Re: Resume vacuum and autovacuum from interruption and cancellation