Re: refactoring - share str2*int64 functions - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: refactoring - share str2*int64 functions
Date
Msg-id 20190914060236.GG15406@paquier.xyz
Whole thread Raw
In response to Re: refactoring - share str2*int64 functions  (Andres Freund <andres@anarazel.de>)
Responses Re: refactoring - share str2*int64 functions
Re: refactoring - share str2*int64 functions
List pgsql-hackers
On Fri, Sep 13, 2019 at 06:38:31PM -0700, Andres Freund wrote:
> On 2019-09-10 12:05:25 +0900, Michael Paquier wrote:
>> On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:
>> Attached is an updated patch?  How does it look?  I have left the
>> parts of readfuncs.c for now as there are more issues behind that than
>> doing a single switch, short reads are one, long reads a second.
>
> Hm? I don't know what you mean by those issues.

I think that we have more issues than it looks.  For example:
- Switching UINT to use pg_strtouint32() causes an incompatibility
issue compared to atoui().
- Switching INT to use pg_strtoint32() causes a set of warnings as for
example with AttrNumber:
72 |  (void) pg_strtoint32(token, &local_node->fldname)
   |                              ^~~~~~~~~~~~~~~~~~~~~
   |                              |
   |                              AttrNumber * {aka short int *}
And it is not like we should use a cast either, as we could hide real
issues.     Hence it seems to me that we need to have a new routine
definition for shorter integers and switch more flags to that.
- Switching LONG to use pg_strtoint64() leads to another set of
issues, particularly one could see an assertion failure related to Agg
nodes.  I am not sure either that we should use int64 here as the size
can be at least 32b.
- Switching OID to use pg_strtoint32 causes a failure with initdb.

So while I agree with you that a switch should be doable, there is a
large set of issues to ponder about here, and the patch already does a
lot, so I really think that we had better do a closer lookup at those
issues separately, once the basics are in place, and consider them if
they actually make sense.  There is much more than just doing a direct
switch in this area with the family of ato*() system calls.

>> And the patch already does a lot.  There could be also an argument for
>> having extra _check wrappers for the unsigned portions but these would
>> be mostly unused in the backend code, so I have left that out on
>> purpose.
>
> I'd value consistency higher here.

Okay, no objections to that.

>> @@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
>>                      CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
>>
>>                      if (strncmp(completionTag, "SELECT ", 7) == 0)
>> -                        _SPI_current->processed =
>> -                            pg_strtouint64(completionTag + 7, NULL, 10);
>> +                        (void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
>
> I'd just use the checked version here, seems like a good thing to check
> for, and I can't imagine it matters performance wise.

Yeah, makes sense.  I don't think it matters either for
pg_stat_statements in the same context.  So changed that part as
well.

>> @@ -63,8 +63,16 @@ Datum
>>  int2in(PG_FUNCTION_ARGS)
>>  {
>>      char       *num = PG_GETARG_CSTRING(0);
>> +    int16        res;
>> +    pg_strtoint_status status;
>>
>> -    PG_RETURN_INT16(pg_strtoint16(num));
>> +    /* Use a custom set of error messages here adapted to the data type */
>> +    status = pg_strtoint16(num, &res);
>
> I don't know what that comment is supposed to mean?

I mean here that the _check equivalent cannot be used as any error
messages generated need to be consistent with the SQL data type.  I
have updated the comment, does it look better now?

>> +/*
>> + * pg_strtoint64_check
>> + *
>> + * Convert input string to a signed 64-bit integer.
>> + *
>> + * This throws ereport() upon bad input format or overflow.
>> + */
>> +int64
>> +pg_strtoint64_check(const char *s)
>> +{
>> +    int64        result;
>> +    pg_strtoint_status status = pg_strtoint64(s, &result);
>> +
>> +    if (unlikely(status != PG_STRTOINT_OK))
>> +        pg_strtoint_error(status, s, "int64");
>> +    return result;
>> +}
>
> I think I'd just put these as inlines in the header.

I have not considered that.  This bloats a bit more builtins.h.  We
could live with that, or just move that into a separate header in
include/utils/, say int.h?  Even if common/int.h exists?

Attached is an updated patch.  Perhaps you have something else in
mind?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] [PATCH] pageinspect function to decode infomasks
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] [PATCH] pageinspect function to decode infomasks