Thread: [PATCH] Add function to_oct
Hello! This patch is a new function based on the implementation of to_hex(int). Since support for octal integer literals was added, to_oct(int) allows octal values to be easily stored and returned in query results. to_oct(0o755) = '755' This is probably most useful for storing file system permissions. -- Eric Radman
Attachment
2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:> > Hello! > > This patch is a new function based on the implementation of to_hex(int). > > Since support for octal integer literals was added, to_oct(int) allows > octal values to be easily stored and returned in query results. > > to_oct(0o755) = '755' > > This is probably most useful for storing file system permissions. Seems like it would be convenient to have. Any reason why there's no matching "to_oct(bigint)" version? Patch has been added to the next commitfest [1], thanks. [1] https://commitfest.postgresql.org/41/4071/ Regards Ian Barwick
On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote: > 2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:> > > Hello! > > > > This patch is a new function based on the implementation of to_hex(int). > > > > Since support for octal integer literals was added, to_oct(int) allows > > octal values to be easily stored and returned in query results. > > > > to_oct(0o755) = '755' > > > > This is probably most useful for storing file system permissions. > > Seems like it would be convenient to have. Any reason why there's > no matching "to_oct(bigint)" version? I couldn't think of a reason someone might want an octal representation of a bigint. Certainly it would be easy to add if there is value in supporting all of the same argument types.
2022年12月21日(水) 10:42 Eric Radman <ericshane@eradman.com>: > > On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote: > > 2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:> > > > Hello! > > > > > > This patch is a new function based on the implementation of to_hex(int). > > > > > > Since support for octal integer literals was added, to_oct(int) allows > > > octal values to be easily stored and returned in query results. > > > > > > to_oct(0o755) = '755' > > > > > > This is probably most useful for storing file system permissions. > > > > Seems like it would be convenient to have. Any reason why there's > > no matching "to_oct(bigint)" version? > > I couldn't think of a reason someone might want an octal > representation of a bigint. Certainly it would be easy to add > if there is value in supporting all of the same argument types. Yeah, I am struggling to think of a practical application other than symmetry with to_hex(). Regards Ian Barwick
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested This is a mini-review of to_oct :-) The function seems useful to me, and is obviously correct. I don't know whether optimization at such a low level is considered in PostgreSQL, but here goes. The calculation of quotient and remainder can be replaced by less costly masking and shifting. Defining #define OCT_DIGIT_BITS 3 #define OCT_DIGIT_BITMASK 0x7 the content of the loop can be replaced by *--ptr = digits[value & OCT_DIGIT_BITMASK]; value >>= OCT_DIGIT_BITS; Also, the check for ptr > buf in the while loop can be removed. The check is superfluous, since buf cannot possibly be exhaustedby a 32 bit integer (the maximum octal number being 37777777777). Best regards Dag Lem The new status of this patch is: Waiting on Author
On Thu, Dec 22, 2022 at 10:08:17AM +0000, Dag Lem wrote: > > The calculation of quotient and remainder can be replaced by less costly masking and shifting. > > Defining > > #define OCT_DIGIT_BITS 3 > #define OCT_DIGIT_BITMASK 0x7 > > the content of the loop can be replaced by > > *--ptr = digits[value & OCT_DIGIT_BITMASK]; > value >>= OCT_DIGIT_BITS; > > Also, the check for ptr > buf in the while loop can be removed. The > check is superfluous, since buf cannot possibly be exhausted by a 32 > bit integer (the maximum octal number being 37777777777). I integrated these suggestions in the attached -v2 patch and tested range of values manually. Also picked an OID > 8000 as suggested by unused_oids. ..Eric
Attachment
On Thu, 22 Dec 2022 at 23:11, Eric Radman <ericshane@eradman.com> wrote: > > On Thu, Dec 22, 2022 at 10:08:17AM +0000, Dag Lem wrote: > > > > The calculation of quotient and remainder can be replaced by less costly masking and shifting. > > > > Defining > > > > #define OCT_DIGIT_BITS 3 > > #define OCT_DIGIT_BITMASK 0x7 > > > > the content of the loop can be replaced by > > > > *--ptr = digits[value & OCT_DIGIT_BITMASK]; > > value >>= OCT_DIGIT_BITS; > > > > Also, the check for ptr > buf in the while loop can be removed. The > > check is superfluous, since buf cannot possibly be exhausted by a 32 > > bit integer (the maximum octal number being 37777777777). > > I integrated these suggestions in the attached -v2 patch and tested > range of values manually. > > Also picked an OID > 8000 as suggested by unused_oids. Few suggestions 1) We could use to_oct instead of to_oct32 as we don't have multiple implementations for to_oct diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 98d90d9338..fde0b24563 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3687,6 +3687,9 @@ { oid => '2090', descr => 'convert int8 number to hex', proname => 'to_hex', prorettype => 'text', proargtypes => 'int8', prosrc => 'to_hex64' }, +{ oid => '8335', descr => 'convert int4 number to oct', + proname => 'to_oct', prorettype => 'text', proargtypes => 'int4', + prosrc => 'to_oct32' }, 2) Similarly we could change this to "to_oct" +/* + * Convert an int32 to a string containing a base 8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint32 value = (uint32) PG_GETARG_INT32(0); 3) I'm not sure if AS "77777777" is required, but I also noticed it is done similarly in to_hex too: +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS "77777777"; + 77777777 +---------- + 77777777 +(1 row) 4) You could add a commit message for the patch Regards, Vignesh
vignesh C <vignesh21@gmail.com> writes: > Few suggestions > 1) We could use to_oct instead of to_oct32 as we don't have multiple > implementations for to_oct That seems (a) shortsighted and (b) inconsistent with the naming pattern used for to_hex, so I doubt it'd be an improvement. regards, tom lane
On 20.12.22 23:08, Eric Radman wrote: > This patch is a new function based on the implementation of to_hex(int). > > Since support for octal integer literals was added, to_oct(int) allows > octal values to be easily stored and returned in query results. > > to_oct(0o755) = '755' > > This is probably most useful for storing file system permissions. Note this subsequent discussion about the to_hex function: https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com Also, I think there is no "to binary" function, so perhaps if we're going down this road one way or the other, we should probably complete the set.
On Thu, Feb 23, 2023 at 6:32 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 20.12.22 23:08, Eric Radman wrote:
> This patch is a new function based on the implementation of to_hex(int).
>
> Since support for octal integer literals was added, to_oct(int) allows
> octal values to be easily stored and returned in query results.
>
> to_oct(0o755) = '755'
>
> This is probably most useful for storing file system permissions.
Note this subsequent discussion about the to_hex function:
https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com
Also, I think there is no "to binary" function, so perhaps if we're
going down this road one way or the other, we should probably complete
the set.
The code reads clearly. It works as expected (being an old PDP-11 guy!)... And the docs make sense and build as well.
Nothing larger than an int gets in. I was "missing" the bigint version, but read through ALL of the messages to see (and agree)
That that's okay.
That that's okay.
Marked Ready for Committer.
Thanks, Kirk
On Tue, Apr 04, 2023 at 08:45:36PM -0400, Kirk Wolak wrote: > Marked Ready for Committer. I started taking a look at this and ended up adding to_binary() and a bigint version of to_oct() for completeness. While I was at it, I moved the base-conversion logic out to a separate static function that to_binary/oct/hex all use. From the other discussion referenced upthread, it sounds like we might want to replace to_binary/oct/hex with a more generic base-conversion function. Maybe we should try to do that instead. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote: > I started taking a look at this and ended up adding to_binary() and a > bigint version of to_oct() for completeness. While I was at it, I moved > the base-conversion logic out to a separate static function that > to_binary/oct/hex all use. Bleh, this patch seems to fail horribly on 32-bit builds. I'll look into it soon. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote: > On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote: >> I started taking a look at this and ended up adding to_binary() and a >> bigint version of to_oct() for completeness. While I was at it, I moved >> the base-conversion logic out to a separate static function that >> to_binary/oct/hex all use. > > Bleh, this patch seems to fail horribly on 32-bit builds. I'll look into > it soon. Here's a new version of the patch with the silly mistakes fixed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote: > Here's a new version of the patch with the silly mistakes fixed. If there are no objections, I'd like to commit this patch soon. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 8/15/23 06:11, Nathan Bossart wrote: > On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote: >> Here's a new version of the patch with the silly mistakes fixed. > > If there are no objections, I'd like to commit this patch soon. I just took a look at this (and the rest of the thread). I am a little bit disappointed that we don't have a generic base conversion function, but even if we did I think these specialized functions would still be useful. No objection from me. -- Vik Fearing
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:
> [v4]
If we're not going to have a general SQL conversion function, here are some comments on the current patch.
Not needed if the definition is above the callers.
+ * Workhorse for to_binary, to_oct, and to_hex. Note that base must be either
+ * 2, 8, or 16.
Why wouldn't it work with any base <= 16?
- *ptr = '\0';
+ Assert(base == 2 || base == 8 || base == 16);
+ *ptr = '\0';
Spurious whitespace change?
- char buf[32]; /* bigger than needed, but reasonable */
+ char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
Why is this no longer allocated on the stack? Maybe needs a comment about the size calculation.
+static char *
+convert_to_base(uint64 value, int base)
On my machine this now requires a function call and a DIV instruction, even though the patch claims not to support anything but a power of two. It's tiny enough to declare it inline so the compiler can specialize for each call site.
+{ oid => '5101', descr => 'convert int4 number to binary',
Needs to be over 8000.
--
John Naylor
EDB: http://www.enterprisedb.com
+ char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
Why is this no longer allocated on the stack? Maybe needs a comment about the size calculation.
+static char *
+convert_to_base(uint64 value, int base)
On my machine this now requires a function call and a DIV instruction, even though the patch claims not to support anything but a power of two. It's tiny enough to declare it inline so the compiler can specialize for each call site.
+{ oid => '5101', descr => 'convert int4 number to binary',
Needs to be over 8000.
--
John Naylor
EDB: http://www.enterprisedb.com
On Tue, Aug 15, 2023 at 07:58:17AM +0200, Vik Fearing wrote: > On 8/15/23 06:11, Nathan Bossart wrote: >> If there are no objections, I'd like to commit this patch soon. > > I just took a look at this (and the rest of the thread). I am a little bit > disappointed that we don't have a generic base conversion function, but even > if we did I think these specialized functions would still be useful. > > No objection from me. Thanks for taking a look. I don't mean for this to preclude a generic base conversion function that would supersede the functions added by this patch. However, I didn't want to hold up $SUBJECT because of something that may or may not happen after lots of discussion. If it does happen within the v17 development cycle, we could probably just remove to_oct/binary. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote: > If we're not going to have a general SQL conversion function, here are some > comments on the current patch. I appreciate the review. > +static char *convert_to_base(uint64 value, int base); > > Not needed if the definition is above the callers. Done. > + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > either > + * 2, 8, or 16. > > Why wouldn't it work with any base <= 16? You're right. I changed this in v5. > - *ptr = '\0'; > + Assert(base == 2 || base == 8 || base == 16); > > + *ptr = '\0'; > > Spurious whitespace change? I think this might just be a weird artifact of the diff algorithm. > - char buf[32]; /* bigger than needed, but reasonable */ > + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1); > > Why is this no longer allocated on the stack? Maybe needs a comment about > the size calculation. It really should be. IIRC I wanted to avoid passing a pre-allocated buffer to convert_to_base(), but I don't remember why. I fixed this in v5. > +static char * > +convert_to_base(uint64 value, int base) > > On my machine this now requires a function call and a DIV instruction, even > though the patch claims not to support anything but a power of two. It's > tiny enough to declare it inline so the compiler can specialize for each > call site. This was on my list of things to check before committing. I assumed that it would be automatically inlined, but given your analysis, I went ahead and added the inline keyword. My compiler took the hint and inlined the function, and it used SHR instead of DIV instructions. The machine code for to_hex32/64 is still a couple of instructions longer than before (probably because of the uint64 casts), but I don't know if we need to worry about micro-optimizing these functions any further. > +{ oid => '5101', descr => 'convert int4 number to binary', > > Needs to be over 8000. Done. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Aug 16, 2023 at 12:17 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:
> > - *ptr = '\0';
> > + Assert(base == 2 || base == 8 || base == 16);
> >
> > + *ptr = '\0';
> >
> > Spurious whitespace change?
>
> I think this might just be a weird artifact of the diff algorithm.
Don't believe everything you think. :-)
```
*ptr = '\0';
do
```
to
```
*ptr = '\0';
do
```
> > - char buf[32]; /* bigger than needed, but reasonable */
> > + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> >
> > Why is this no longer allocated on the stack? Maybe needs a comment about
> > the size calculation.
>
> It really should be. IIRC I wanted to avoid passing a pre-allocated buffer
> to convert_to_base(), but I don't remember why. I fixed this in v5.
Now I'm struggling to understand why each and every instance has its own nominal buffer, passed down to the implementation. All we care about is the result -- is there some reason not to confine the buffer declaration to the general implementation?
--
John Naylor
EDB: http://www.enterprisedb.com
On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote: > ``` > *ptr = '\0'; > > do > ``` > > to > > ``` > *ptr = '\0'; > do > ``` Oh, I misunderstood. I thought you meant that there might be a whitespace change on that line, not the surrounding ones. This is fixed in v6. > Now I'm struggling to understand why each and every instance has its own > nominal buffer, passed down to the implementation. All we care about is the > result -- is there some reason not to confine the buffer declaration to the > general implementation? We can do that if we use a static variable, which is what I've done in v6. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Aug 16, 2023 at 9:24 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote:
> > Now I'm struggling to understand why each and every instance has its own
> > nominal buffer, passed down to the implementation. All we care about is the
> > result -- is there some reason not to confine the buffer declaration to the
> > general implementation?
>
> We can do that if we use a static variable, which is what I've done in v6.
That makes it a lexically-scoped global variable, which we don't need either. Can we have the internal function allocate on the stack, then call cstring_to_text() on that, returning the text result? That does its own palloc.
> > nominal buffer, passed down to the implementation. All we care about is the
> > result -- is there some reason not to confine the buffer declaration to the
> > general implementation?
>
> We can do that if we use a static variable, which is what I've done in v6.
That makes it a lexically-scoped global variable, which we don't need either. Can we have the internal function allocate on the stack, then call cstring_to_text() on that, returning the text result? That does its own palloc.
Or maybe better, save the starting pointer, compute the length at the end, and call cstring_to_text_with_len()? (It seems we wouldn't need the nul-terminator then, either.)
On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote: > That makes it a lexically-scoped global variable, which we don't need > either. Can we have the internal function allocate on the stack, then > call cstring_to_text() on that, returning the text result? That does its > own palloc. > > Or maybe better, save the starting pointer, compute the length at the end, > and call cstring_to_text_with_len()? (It seems we wouldn't need > the nul-terminator then, either.) Works for me. I did it that way in v7. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Aug 17, 2023 at 10:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote:
> > That makes it a lexically-scoped global variable, which we don't need
> > either. Can we have the internal function allocate on the stack, then
> > call cstring_to_text() on that, returning the text result? That does its
> > own palloc.
> >
> > Or maybe better, save the starting pointer, compute the length at the end,
> > and call cstring_to_text_with_len()? (It seems we wouldn't need
> > the nul-terminator then, either.)
>
> Works for me. I did it that way in v7.
This looks nicer, but still doesn't save the starting pointer, and so needs to lug around that big honking macro. This is what I mean:
static inline text *
convert_to_base(uint64 value, int base)
{
const char *digits = "0123456789abcdef";
/* We size the buffer for to_binary's longest possible return value. */
char buf[sizeof(uint64) * BITS_PER_BYTE];
char * const end = buf + sizeof(buf);
char *ptr = end;
Assert(base > 1);
Assert(base <= 16);
do
{
*--ptr = digits[value % base];
value /= base;
} while (ptr > buf && value);
return cstring_to_text_with_len(ptr, end - ptr);
}
convert_to_base(uint64 value, int base)
{
const char *digits = "0123456789abcdef";
/* We size the buffer for to_binary's longest possible return value. */
char buf[sizeof(uint64) * BITS_PER_BYTE];
char * const end = buf + sizeof(buf);
char *ptr = end;
Assert(base > 1);
Assert(base <= 16);
do
{
*--ptr = digits[value % base];
value /= base;
} while (ptr > buf && value);
return cstring_to_text_with_len(ptr, end - ptr);
}
On Thu, 17 Aug 2023 at 16:26, Nathan Bossart <nathandbossart@gmail.com> wrote: > > Works for me. I did it that way in v7. > I note that there are no tests for negative inputs. Doing a quick test, shows that this changes the current behaviour, because all inputs are now treated as 64-bit: HEAD: select to_hex((-1234)::int); to_hex ---------- fffffb2e With patch: select to_hex((-1234)::int); to_hex ------------------ fffffffffffffb2e The way that negative inputs are handled really should be documented, or at least it should include a couple of examples. Regards, Dean
On Sat, Aug 19, 2023 at 11:41:56AM +0700, John Naylor wrote: > This looks nicer, but still doesn't save the starting pointer, and so needs > to lug around that big honking macro. This is what I mean: > > static inline text * > convert_to_base(uint64 value, int base) > { > const char *digits = "0123456789abcdef"; > /* We size the buffer for to_binary's longest possible return value. */ > char buf[sizeof(uint64) * BITS_PER_BYTE]; > char * const end = buf + sizeof(buf); > char *ptr = end; > > Assert(base > 1); > Assert(base <= 16); > > do > { > *--ptr = digits[value % base]; > value /= base; > } while (ptr > buf && value); > > return cstring_to_text_with_len(ptr, end - ptr); > } I will use this in v8. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote: > I note that there are no tests for negative inputs. I added some in v8. > Doing a quick test, shows that this changes the current behaviour, > because all inputs are now treated as 64-bit: > > HEAD: > > select to_hex((-1234)::int); > to_hex > ---------- > fffffb2e > > With patch: > > select to_hex((-1234)::int); > to_hex > ------------------ > fffffffffffffb2e Good catch. In v8, I fixed this by first casting the input to uint32 for the 32-bit versions of the functions. This prevents the conversion to uint64 from setting the rest of the bits. AFAICT this behavior is pretty well defined in the standard. > The way that negative inputs are handled really should be documented, > or at least it should include a couple of examples. I used your suggestion and noted that the output is the two's complement representation [0]. [0] https://postgr.es/m/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Sun, 20 Aug 2023 at 16:25, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote: > > > The way that negative inputs are handled really should be documented, > > or at least it should include a couple of examples. > > I used your suggestion and noted that the output is the two's complement > representation [0]. > Hmm, I think just including the doc text update, without the examples of positive and negative inputs, might not be sufficient to make the meaning clear to everyone. Something else that bothers me slightly is the function naming -- "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to "oct", but "binary" is left as-is. I think it ought to be "to_bin()" on consistency grounds, even though I understand the words "to bin" could be interpreted differently. (Looking elsewhere for precedents, Python has bin(), oct() and hex() functions.) Also, I think the convention is to always list functions alphabetically, so to_oct() should really come after to_hex(). Regards, Dean
On Mon, Aug 21, 2023 at 09:31:37AM +0100, Dean Rasheed wrote: > Hmm, I think just including the doc text update, without the examples > of positive and negative inputs, might not be sufficient to make the > meaning clear to everyone. I added some examples for negative inputs. > Something else that bothers me slightly is the function naming -- > "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to > "oct", but "binary" is left as-is. I think it ought to be "to_bin()" > on consistency grounds, even though I understand the words "to bin" > could be interpreted differently. (Looking elsewhere for precedents, > Python has bin(), oct() and hex() functions.) I renamed it to to_bin(). > Also, I think the convention is to always list functions > alphabetically, so to_oct() should really come after to_hex(). I reordered the functions in the docs. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Mon, 21 Aug 2023 at 20:15, Nathan Bossart <nathandbossart@gmail.com> wrote: > > I added some examples for negative inputs. > > I renamed it to to_bin(). > > I reordered the functions in the docs. > OK, cool. This looks good to me. Regards, Dean
On Tue, Aug 22, 2023 at 3:10 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> OK, cool. This looks good to me.
LGTM too.
--
John Naylor
EDB: http://www.enterprisedb.com
On 20.08.23 17:25, Nathan Bossart wrote: >> Doing a quick test, shows that this changes the current behaviour, >> because all inputs are now treated as 64-bit: >> >> HEAD: >> >> select to_hex((-1234)::int); >> to_hex >> ---------- >> fffffb2e >> >> With patch: >> >> select to_hex((-1234)::int); >> to_hex >> ------------------ >> fffffffffffffb2e > Good catch. In v8, I fixed this by first casting the input to uint32 for > the 32-bit versions of the functions. This prevents the conversion to > uint64 from setting the rest of the bits. AFAICT this behavior is pretty > well defined in the standard. What standard? I don't understand the reason for this handling of negative values. I would expect that, say, to_hex(-1234) would return '-' || to_hex(1234).
On Tue, Aug 22, 2023 at 04:20:02PM +0200, Peter Eisentraut wrote: > On 20.08.23 17:25, Nathan Bossart wrote: >> > Doing a quick test, shows that this changes the current behaviour, >> > because all inputs are now treated as 64-bit: >> > >> > HEAD: >> > >> > select to_hex((-1234)::int); >> > to_hex >> > ---------- >> > fffffb2e >> > >> > With patch: >> > >> > select to_hex((-1234)::int); >> > to_hex >> > ------------------ >> > fffffffffffffb2e >> Good catch. In v8, I fixed this by first casting the input to uint32 for >> the 32-bit versions of the functions. This prevents the conversion to >> uint64 from setting the rest of the bits. AFAICT this behavior is pretty >> well defined in the standard. > > What standard? C99 > I don't understand the reason for this handling of negative values. I would > expect that, say, to_hex(-1234) would return '-' || to_hex(1234). For this patch set, I was trying to retain the current behavior, which is to return the two's complement representation. I'm open to changing this if there's agreement on what the output should be. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 22.08.23 16:26, Nathan Bossart wrote: >> I don't understand the reason for this handling of negative values. I would >> expect that, say, to_hex(-1234) would return '-' || to_hex(1234). > For this patch set, I was trying to retain the current behavior, which is > to return the two's complement representation. I'm open to changing this > if there's agreement on what the output should be. Ah, if there is existing behavior then we should probably keep it.
On Tue, Aug 22, 2023 at 04:57:02PM +0200, Peter Eisentraut wrote: > On 22.08.23 16:26, Nathan Bossart wrote: >> > I don't understand the reason for this handling of negative values. I would >> > expect that, say, to_hex(-1234) would return '-' || to_hex(1234). >> For this patch set, I was trying to retain the current behavior, which is >> to return the two's complement representation. I'm open to changing this >> if there's agreement on what the output should be. > > Ah, if there is existing behavior then we should probably keep it. Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com