Re: refactoring - share str2*int64 functions - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: refactoring - share str2*int64 functions |
Date | |
Msg-id | alpine.DEB.2.21.1909041027080.8057@lancre Whole thread Raw |
In response to | Re: refactoring - share str2*int64 functions (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: refactoring - share str2*int64 functions
|
List | pgsql-hackers |
Bonjour Michaël, >> Attached a rebased version which implements the int64/uint64 stuff. It is >> basically the previous patch without the overflow inlined functions. > > - if (!strtoint64(yytext, true, &yylval->ival)) > + if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)) > It seems to me that we should have unlikely() only within > pg_strtoint64(), pg_strtouint64(), etc. From a compiler perspective, the (un)likely tip is potentially useful on any test. We know when parsing a that it is very unlikely that the string conversion would fail, so we can tell that, so that the compiler knows which branch it should optimize first. You can argue against that if the functions are inlined, because maybe the compiler would propagate the information, but for distinct functions compiled separately the information is useful at each level. > - /* skip leading spaces; cast is consistent with strtoint64 */ > - while (*ptr && isspace((unsigned char) *ptr)) > + /* skip leading spaces; cast is consistent with pg_strtoint64 */ > + while (isspace((unsigned char) *ptr)) > ptr++; > What do you think about splitting this part in two? I would suggest > to do the refactoring in a first patch, and the consider all the > optimizations for the routines you have in mind afterwards. I would not bother. > I think that we don't actually need is_an_int() and str2int64() at all > in pgbench.c as we could just check for the return code of > pg_strtoint64() and switch to the double parsing only if we don't have > PG_STRTOINT_OK. Yep, you are right, now that the conversion functions does not error out a message, its failure can be used as a test. The version attached changes slightly the semantics, because on int overflows a double conversion will be attempted instead of erroring. I do not think that it is worth the effort of preserving the previous semantic of erroring. > scanint8() only has one caller in the backend with your patch, and we > don't check after its return result in int8.c, so I would suggest to > move the error handling directly in int8in() and to remove scanint8(). Ok. > I think that we should also introduce the [u]int[16|32] flavors and > expand them in the code in a single patch, in a way consistent with > what you have done for int64/uint64 as the state that we reach with > the patch is kind of weird as there are existing versions numutils.c. Before dealing with the 16/32 versions, which involve quite a significant amount of changes, I would want a clear message that "the 64 bit approach" is the model to follow. Moreover, I'm unsure how to rename the existing pg_strtoint32 and others which call ereport, if the name is used for the common error returning version. > Have you done some performance testing of the patch? The COPY > bulkload is a good example of that: > https://www.postgresql.org/message-id/20190717181428.krqpmduejbqr4m6g%40alap3.anarazel.de I have done no such thing for now. I would not expect any significant performance difference when loading int8 things because basically scanint8 has just been renamed pg_strtoint64 and made global, and that is more or less all. It might be a little bit slower because possible the compiler cannot inline the conversion, but on the other hand, the *likely hints and removed tests may marginaly help performance. I think that the only way to test performance significantly would be to write a specific program that loops over a conversion. -- Fabien.
Attachment
pgsql-hackers by date: