Thread: [PATCH] Add function to_oct

[PATCH] Add function to_oct

From

Eric Radman

Date:

20 December 2022, 22:08:13

Hello!

This patch is a new function based on the implementation of to_hex(int).

Since support for octal integer literals was added, to_oct(int) allows
octal values to be easily stored and returned in query results.

  to_oct(0o755) = '755'

This is probably most useful for storing file system permissions.

--
Eric Radman

Attachment

add-function-to_oct.patch

Re: [PATCH] Add function to_oct

From

Ian Lawrence Barwick

Date:

20 December 2022, 23:36:40

2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:>
> Hello!
>
> This patch is a new function based on the implementation of to_hex(int).
>
> Since support for octal integer literals was added, to_oct(int) allows
> octal values to be easily stored and returned in query results.
>
>   to_oct(0o755) = '755'
>
> This is probably most useful for storing file system permissions.

Seems like it would be convenient to have. Any reason why there's
no matching "to_oct(bigint)" version?

Patch has been added to the next commitfest [1], thanks.

[1] https://commitfest.postgresql.org/41/4071/

Regards

Ian Barwick

Re: [PATCH] Add function to_oct

From

Eric Radman

Date:

21 December 2022, 01:42:10

On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote:
> 2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:>
> > Hello!
> >
> > This patch is a new function based on the implementation of to_hex(int).
> >
> > Since support for octal integer literals was added, to_oct(int) allows
> > octal values to be easily stored and returned in query results.
> >
> >   to_oct(0o755) = '755'
> >
> > This is probably most useful for storing file system permissions.
> 
> Seems like it would be convenient to have. Any reason why there's
> no matching "to_oct(bigint)" version?

I couldn't think of a reason someone might want an octal
representation of a bigint.  Certainly it would be easy to add
if there is value in supporting all of the same argument types.

Re: [PATCH] Add function to_oct

From

Ian Lawrence Barwick

Date:

21 December 2022, 01:56:29

2022年12月21日(水) 10:42 Eric Radman <ericshane@eradman.com>:
>
> On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote:
> > 2022年12月21日(水) 7:08 Eric Radman <ericshane@eradman.com>:>
> > > Hello!
> > >
> > > This patch is a new function based on the implementation of to_hex(int).
> > >
> > > Since support for octal integer literals was added, to_oct(int) allows
> > > octal values to be easily stored and returned in query results.
> > >
> > >   to_oct(0o755) = '755'
> > >
> > > This is probably most useful for storing file system permissions.
> >
> > Seems like it would be convenient to have. Any reason why there's
> > no matching "to_oct(bigint)" version?
>
> I couldn't think of a reason someone might want an octal
> representation of a bigint.  Certainly it would be easy to add
> if there is value in supporting all of the same argument types.

Yeah, I am struggling to think of a practical application other than
symmetry with to_hex().


Regards

Ian Barwick

Re: [PATCH] Add function to_oct

From

Dag Lem

Date:

22 December 2022, 10:08:17

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

This is a mini-review of to_oct :-)

The function seems useful to me, and is obviously correct.

I don't know whether optimization at such a low level is considered in PostgreSQL, but here goes.

The calculation of quotient and remainder can be replaced by less costly masking and shifting.

Defining

#define OCT_DIGIT_BITS 3
#define OCT_DIGIT_BITMASK 0x7

the content of the loop can be replaced by

        *--ptr = digits[value & OCT_DIGIT_BITMASK];
        value >>= OCT_DIGIT_BITS;

Also, the check for ptr > buf in the while loop can be removed. The check is superfluous, since buf cannot possibly be
exhaustedby a 32 bit integer (the maximum octal number being 37777777777).
 


Best regards

Dag Lem

The new status of this patch is: Waiting on Author

Re: [PATCH] Add function to_oct

From

Eric Radman

Date:

22 December 2022, 17:41:24

On Thu, Dec 22, 2022 at 10:08:17AM +0000, Dag Lem wrote:
> 
> The calculation of quotient and remainder can be replaced by less costly masking and shifting.
> 
> Defining
> 
> #define OCT_DIGIT_BITS 3
> #define OCT_DIGIT_BITMASK 0x7
> 
> the content of the loop can be replaced by
> 
>         *--ptr = digits[value & OCT_DIGIT_BITMASK];
>         value >>= OCT_DIGIT_BITS;
> 
> Also, the check for ptr > buf in the while loop can be removed. The
> check is superfluous, since buf cannot possibly be exhausted by a 32
> bit integer (the maximum octal number being 37777777777).

I integrated these suggestions in the attached -v2 patch and tested
range of values manually.

Also picked an OID > 8000 as suggested by unused_oids.

..Eric

Attachment

add-function-to_oct-v2.patch

Re: [PATCH] Add function to_oct

From

vignesh C

Date:

07 January 2023, 11:02:26

On Thu, 22 Dec 2022 at 23:11, Eric Radman <ericshane@eradman.com> wrote:
>
> On Thu, Dec 22, 2022 at 10:08:17AM +0000, Dag Lem wrote:
> >
> > The calculation of quotient and remainder can be replaced by less costly masking and shifting.
> >
> > Defining
> >
> > #define OCT_DIGIT_BITS 3
> > #define OCT_DIGIT_BITMASK 0x7
> >
> > the content of the loop can be replaced by
> >
> >               *--ptr = digits[value & OCT_DIGIT_BITMASK];
> >               value >>= OCT_DIGIT_BITS;
> >
> > Also, the check for ptr > buf in the while loop can be removed. The
> > check is superfluous, since buf cannot possibly be exhausted by a 32
> > bit integer (the maximum octal number being 37777777777).
>
> I integrated these suggestions in the attached -v2 patch and tested
> range of values manually.
>
> Also picked an OID > 8000 as suggested by unused_oids.

Few suggestions
1) We could use to_oct instead of to_oct32 as we don't have multiple
implementations for to_oct
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 98d90d9338..fde0b24563 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3687,6 +3687,9 @@
 { oid => '2090', descr => 'convert int8 number to hex',
   proname => 'to_hex', prorettype => 'text', proargtypes => 'int8',
   prosrc => 'to_hex64' },
+{ oid => '8335', descr => 'convert int4 number to oct',
+  proname => 'to_oct', prorettype => 'text', proargtypes => 'int4',
+  prosrc => 'to_oct32' },

2) Similarly we could change this to "to_oct"
+/*
+ * Convert an int32 to a string containing a base 8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+       uint32          value = (uint32) PG_GETARG_INT32(0);

3) I'm not sure if  AS "77777777" is required, but I also noticed it
is done similarly in to_hex too:
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "77777777";
+ 77777777
+----------
+ 77777777
+(1 row)

4) You could add a commit message for the patch

Regards,
Vignesh

Re: [PATCH] Add function to_oct

From

Tom Lane

Date:

07 January 2023, 15:29:49

vignesh C <vignesh21@gmail.com> writes:
> Few suggestions
> 1) We could use to_oct instead of to_oct32 as we don't have multiple
> implementations for to_oct

That seems (a) shortsighted and (b) inconsistent with the naming
pattern used for to_hex, so I doubt it'd be an improvement.

            regards, tom lane

Re: [PATCH] Add function to_oct

From

Peter Eisentraut

Date:

23 February 2023, 11:32:42

On 20.12.22 23:08, Eric Radman wrote:
> This patch is a new function based on the implementation of to_hex(int).
> 
> Since support for octal integer literals was added, to_oct(int) allows
> octal values to be easily stored and returned in query results.
> 
>    to_oct(0o755) = '755'
> 
> This is probably most useful for storing file system permissions.

Note this subsequent discussion about the to_hex function: 
https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com

Also, I think there is no "to binary" function, so perhaps if we're 
going down this road one way or the other, we should probably complete 
the set.

Re: [PATCH] Add function to_oct

From

Kirk Wolak

Date:

05 April 2023, 00:45:36

On Thu, Feb 23, 2023 at 6:32 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

On 20.12.22 23:08, Eric Radman wrote:
> This patch is a new function based on the implementation of to_hex(int).
>
> Since support for octal integer literals was added, to_oct(int) allows
> octal values to be easily stored and returned in query results.
>
> to_oct(0o755) = '755'
>
> This is probably most useful for storing file system permissions.

Note this subsequent discussion about the to_hex function:
https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com

Also, I think there is no "to binary" function, so perhaps if we're
going down this road one way or the other, we should probably complete
the set.

The code reads clearly. It works as expected (being an old PDP-11 guy!)... And the docs make sense and build as well.

Nothing larger than an int gets in. I was "missing" the bigint version, but read through ALL of the messages to see (and agree)
That that's okay.

Marked Ready for Committer.

Thanks, Kirk

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

25 July 2023, 23:24:26

On Tue, Apr 04, 2023 at 08:45:36PM -0400, Kirk Wolak wrote:
> Marked Ready for Committer.

I started taking a look at this and ended up adding to_binary() and a
bigint version of to_oct() for completeness.  While I was at it, I moved
the base-conversion logic out to a separate static function that
to_binary/oct/hex all use.

From the other discussion referenced upthread, it sounds like we might want
to replace to_binary/oct/hex with a more generic base-conversion function.
Maybe we should try to do that instead.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v3-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

26 July 2023, 00:16:56

On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote:
> I started taking a look at this and ended up adding to_binary() and a
> bigint version of to_oct() for completeness.  While I was at it, I moved
> the base-conversion logic out to a separate static function that
> to_binary/oct/hex all use.

Bleh, this patch seems to fail horribly on 32-bit builds.  I'll look into
it soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

26 July 2023, 03:29:17

On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:
> On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote:
>> I started taking a look at this and ended up adding to_binary() and a
>> bigint version of to_oct() for completeness.  While I was at it, I moved
>> the base-conversion logic out to a separate static function that
>> to_binary/oct/hex all use.
> 
> Bleh, this patch seems to fail horribly on 32-bit builds.  I'll look into
> it soon.

Here's a new version of the patch with the silly mistakes fixed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v4-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

15 August 2023, 04:11:29

On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:
> Here's a new version of the patch with the silly mistakes fixed.

If there are no objections, I'd like to commit this patch soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Add function to_oct

From

Vik Fearing

Date:

15 August 2023, 05:58:17

On 8/15/23 06:11, Nathan Bossart wrote:
> On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:
>> Here's a new version of the patch with the silly mistakes fixed.
> 
> If there are no objections, I'd like to commit this patch soon.

I just took a look at this (and the rest of the thread).  I am a little 
bit disappointed that we don't have a generic base conversion function, 
but even if we did I think these specialized functions would still be 
useful.

No objection from me.
-- 
Vik Fearing

Re: [PATCH] Add function to_oct

From

John Naylor

Date:

15 August 2023, 06:53:25

On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:

> [v4]

If we're not going to have a general SQL conversion function, here are some comments on the current patch.

+static char *convert_to_base(uint64 value, int base);

Not needed if the definition is above the callers.

+ * Workhorse for to_binary, to_oct, and to_hex. Note that base must be either
+ * 2, 8, or 16.

Why wouldn't it work with any base <= 16?

- *ptr = '\0';
+ Assert(base == 2 || base == 8 || base == 16);

+ *ptr = '\0';

Spurious whitespace change?

- char buf[32]; /* bigger than needed, but reasonable */
+ char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);

Why is this no longer allocated on the stack? Maybe needs a comment about the size calculation.

+static char *
+convert_to_base(uint64 value, int base)

On my machine this now requires a function call and a DIV instruction, even though the patch claims not to support anything but a power of two. It's tiny enough to declare it inline so the compiler can specialize for each call site.

+{ oid => '5101', descr => 'convert int4 number to binary',

Needs to be over 8000.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

15 August 2023, 15:42:08

On Tue, Aug 15, 2023 at 07:58:17AM +0200, Vik Fearing wrote:
> On 8/15/23 06:11, Nathan Bossart wrote:
>> If there are no objections, I'd like to commit this patch soon.
> 
> I just took a look at this (and the rest of the thread).  I am a little bit
> disappointed that we don't have a generic base conversion function, but even
> if we did I think these specialized functions would still be useful.
> 
> No objection from me.

Thanks for taking a look.  I don't mean for this to preclude a generic base
conversion function that would supersede the functions added by this patch.
However, I didn't want to hold up $SUBJECT because of something that may or
may not happen after lots of discussion.  If it does happen within the v17
development cycle, we could probably just remove to_oct/binary.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

15 August 2023, 17:17:15

On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:
> If we're not going to have a general SQL conversion function, here are some
> comments on the current patch.

I appreciate the review.

> +static char *convert_to_base(uint64 value, int base);
> 
> Not needed if the definition is above the callers.

Done.

> + * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be
> either
> + * 2, 8, or 16.
> 
> Why wouldn't it work with any base <= 16?

You're right.  I changed this in v5.

> - *ptr = '\0';
> + Assert(base == 2 || base == 8 || base == 16);
> 
> + *ptr = '\0';
> 
> Spurious whitespace change?

I think this might just be a weird artifact of the diff algorithm.

> - char buf[32]; /* bigger than needed, but reasonable */
> + char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> 
> Why is this no longer allocated on the stack? Maybe needs a comment about
> the size calculation.

It really should be.  IIRC I wanted to avoid passing a pre-allocated buffer
to convert_to_base(), but I don't remember why.  I fixed this in v5.

> +static char *
> +convert_to_base(uint64 value, int base)
> 
> On my machine this now requires a function call and a DIV instruction, even
> though the patch claims not to support anything but a power of two. It's
> tiny enough to declare it inline so the compiler can specialize for each
> call site.

This was on my list of things to check before committing.  I assumed that
it would be automatically inlined, but given your analysis, I went ahead
and added the inline keyword.  My compiler took the hint and inlined the
function, and it used SHR instead of DIV instructions.  The machine code
for to_hex32/64 is still a couple of instructions longer than before
(probably because of the uint64 casts), but I don't know if we need to
worry about micro-optimizing these functions any further.

> +{ oid => '5101', descr => 'convert int4 number to binary',
> 
> Needs to be over 8000.

Done.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v5-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

John Naylor

Date:

16 August 2023, 03:35:27

On Wed, Aug 16, 2023 at 12:17 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:

> > - *ptr = '\0';
> > + Assert(base == 2 || base == 8 || base == 16);
> >
> > + *ptr = '\0';
> >
> > Spurious whitespace change?
>
> I think this might just be a weird artifact of the diff algorithm.

Don't believe everything you think. :-)

```
*ptr = '\0';

do
```

to

```
*ptr = '\0';
do
```

> > - char buf[32]; /* bigger than needed, but reasonable */
> > + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> >
> > Why is this no longer allocated on the stack? Maybe needs a comment about
> > the size calculation.
>
> It really should be. IIRC I wanted to avoid passing a pre-allocated buffer
> to convert_to_base(), but I don't remember why. I fixed this in v5.

Now I'm struggling to understand why each and every instance has its own nominal buffer, passed down to the implementation. All we care about is the result -- is there some reason not to confine the buffer declaration to the general implementation?

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

16 August 2023, 14:24:10

On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote:
> ```
> *ptr = '\0';
> 
> do
> ```
> 
> to
> 
> ```
> *ptr = '\0';
> do
> ```

Oh, I misunderstood.  I thought you meant that there might be a whitespace
change on that line, not the surrounding ones.  This is fixed in v6.

> Now I'm struggling to understand why each and every instance has its own
> nominal buffer, passed down to the implementation. All we care about is the
> result -- is there some reason not to confine the buffer declaration to the
> general implementation?

We can do that if we use a static variable, which is what I've done in v6.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v6-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

John Naylor

Date:

17 August 2023, 05:35:54

On Wed, Aug 16, 2023 at 9:24 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote:

> > Now I'm struggling to understand why each and every instance has its own
> > nominal buffer, passed down to the implementation. All we care about is the
> > result -- is there some reason not to confine the buffer declaration to the
> > general implementation?
>
> We can do that if we use a static variable, which is what I've done in v6.

That makes it a lexically-scoped global variable, which we don't need either. Can we have the internal function allocate on the stack, then call cstring_to_text() on that, returning the text result? That does its own palloc.

Or maybe better, save the starting pointer, compute the length at the end, and call cstring_to_text_with_len()? (It seems we wouldn't need the nul-terminator then, either.)

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

17 August 2023, 15:26:28

On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote:
> That makes it a lexically-scoped global variable, which we don't need
> either. Can we have the internal function allocate on the stack, then
> call cstring_to_text() on that, returning the text result? That does its
> own palloc.
> 
> Or maybe better, save the starting pointer, compute the length at the end,
> and call cstring_to_text_with_len()?  (It seems we wouldn't need
> the nul-terminator then, either.)

Works for me.  I did it that way in v7.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v7-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

John Naylor

Date:

19 August 2023, 04:41:56

On Thu, Aug 17, 2023 at 10:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote:
> > That makes it a lexically-scoped global variable, which we don't need
> > either. Can we have the internal function allocate on the stack, then
> > call cstring_to_text() on that, returning the text result? That does its
> > own palloc.
> >
> > Or maybe better, save the starting pointer, compute the length at the end,
> > and call cstring_to_text_with_len()? (It seems we wouldn't need
> > the nul-terminator then, either.)
>
> Works for me. I did it that way in v7.

This looks nicer, but still doesn't save the starting pointer, and so needs to lug around that big honking macro. This is what I mean:

static inline text *
convert_to_base(uint64 value, int base)
{
const char *digits = "0123456789abcdef";
/* We size the buffer for to_binary's longest possible return value. */
char buf[sizeof(uint64) * BITS_PER_BYTE];
char * const end = buf + sizeof(buf);
char *ptr = end;

Assert(base > 1);
Assert(base <= 16);

do
{
*--ptr = digits[value % base];
value /= base;
} while (ptr > buf && value);

return cstring_to_text_with_len(ptr, end - ptr);
}

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Add function to_oct

From

Dean Rasheed

Date:

19 August 2023, 07:35:46

On Thu, 17 Aug 2023 at 16:26, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Works for me.  I did it that way in v7.
>

I note that there are no tests for negative inputs.

Doing a quick test, shows that this changes the current behaviour,
because all inputs are now treated as 64-bit:

HEAD:

select to_hex((-1234)::int);
  to_hex
----------
 fffffb2e

With patch:

select to_hex((-1234)::int);
      to_hex
------------------
 fffffffffffffb2e

The way that negative inputs are handled really should be documented,
or at least it should include a couple of examples.

Regards,
Dean

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

20 August 2023, 15:19:37

On Sat, Aug 19, 2023 at 11:41:56AM +0700, John Naylor wrote:
> This looks nicer, but still doesn't save the starting pointer, and so needs
> to lug around that big honking macro. This is what I mean:
> 
> static inline text *
> convert_to_base(uint64 value, int base)
> {
>   const char *digits = "0123456789abcdef";
>   /* We size the buffer for to_binary's longest possible return value. */
>   char    buf[sizeof(uint64) * BITS_PER_BYTE];
>   char     * const end =  buf + sizeof(buf);
>   char     *ptr = end;
> 
>   Assert(base > 1);
>   Assert(base <= 16);
> 
>   do
>   {
>     *--ptr = digits[value % base];
>     value /= base;
>   } while (ptr > buf && value);
> 
>   return cstring_to_text_with_len(ptr,  end - ptr);
> }

I will use this in v8.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

20 August 2023, 15:25:51

On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote:
> I note that there are no tests for negative inputs.

I added some in v8.

> Doing a quick test, shows that this changes the current behaviour,
> because all inputs are now treated as 64-bit:
> 
> HEAD:
> 
> select to_hex((-1234)::int);
>   to_hex
> ----------
>  fffffb2e
> 
> With patch:
> 
> select to_hex((-1234)::int);
>       to_hex
> ------------------
>  fffffffffffffb2e

Good catch.  In v8, I fixed this by first casting the input to uint32 for
the 32-bit versions of the functions.  This prevents the conversion to
uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
well defined in the standard.

> The way that negative inputs are handled really should be documented,
> or at least it should include a couple of examples.

I used your suggestion and noted that the output is the two's complement
representation [0].

[0] https://postgr.es/m/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v8-0001-add-to_binary-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

Dean Rasheed

Date:

21 August 2023, 08:31:37

On Sun, 20 Aug 2023 at 16:25, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote:
>
> > The way that negative inputs are handled really should be documented,
> > or at least it should include a couple of examples.
>
> I used your suggestion and noted that the output is the two's complement
> representation [0].
>

Hmm, I think just including the doc text update, without the examples
of positive and negative inputs, might not be sufficient to make the
meaning clear to everyone.

Something else that bothers me slightly is the function naming --
"hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to
"oct", but "binary" is left as-is. I think it ought to be "to_bin()"
on consistency grounds, even though I understand the words "to bin"
could be interpreted differently. (Looking elsewhere for precedents,
Python has bin(), oct() and hex() functions.)

Also, I think the convention is to always list functions
alphabetically, so to_oct() should really come after to_hex().

Regards,
Dean

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

21 August 2023, 19:15:28

On Mon, Aug 21, 2023 at 09:31:37AM +0100, Dean Rasheed wrote:
> Hmm, I think just including the doc text update, without the examples
> of positive and negative inputs, might not be sufficient to make the
> meaning clear to everyone.

I added some examples for negative inputs.

> Something else that bothers me slightly is the function naming --
> "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to
> "oct", but "binary" is left as-is. I think it ought to be "to_bin()"
> on consistency grounds, even though I understand the words "to bin"
> could be interpreted differently. (Looking elsewhere for precedents,
> Python has bin(), oct() and hex() functions.)

I renamed it to to_bin().

> Also, I think the convention is to always list functions
> alphabetically, so to_oct() should really come after to_hex().

I reordered the functions in the docs.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v9-0001-add-to_bin-and-to_oct.patch

Re: [PATCH] Add function to_oct

From

Dean Rasheed

Date:

21 August 2023, 20:10:43

On Mon, 21 Aug 2023 at 20:15, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I added some examples for negative inputs.
>
> I renamed it to to_bin().
>
> I reordered the functions in the docs.
>

OK, cool. This looks good to me.

Regards,
Dean

Re: [PATCH] Add function to_oct

From

John Naylor

Date:

22 August 2023, 03:17:45

On Tue, Aug 22, 2023 at 3:10 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> OK, cool. This looks good to me.

LGTM too.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Add function to_oct

From

Peter Eisentraut

Date:

22 August 2023, 14:20:02

On 20.08.23 17:25, Nathan Bossart wrote:
>> Doing a quick test, shows that this changes the current behaviour,
>> because all inputs are now treated as 64-bit:
>>
>> HEAD:
>>
>> select to_hex((-1234)::int);
>>    to_hex
>> ----------
>>   fffffb2e
>>
>> With patch:
>>
>> select to_hex((-1234)::int);
>>        to_hex
>> ------------------
>>   fffffffffffffb2e
> Good catch.  In v8, I fixed this by first casting the input to uint32 for
> the 32-bit versions of the functions.  This prevents the conversion to
> uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
> well defined in the standard.

What standard?

I don't understand the reason for this handling of negative values.  I 
would expect that, say, to_hex(-1234) would return '-' || to_hex(1234).

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

22 August 2023, 14:26:08

On Tue, Aug 22, 2023 at 04:20:02PM +0200, Peter Eisentraut wrote:
> On 20.08.23 17:25, Nathan Bossart wrote:
>> > Doing a quick test, shows that this changes the current behaviour,
>> > because all inputs are now treated as 64-bit:
>> > 
>> > HEAD:
>> > 
>> > select to_hex((-1234)::int);
>> >    to_hex
>> > ----------
>> >   fffffb2e
>> > 
>> > With patch:
>> > 
>> > select to_hex((-1234)::int);
>> >        to_hex
>> > ------------------
>> >   fffffffffffffb2e
>> Good catch.  In v8, I fixed this by first casting the input to uint32 for
>> the 32-bit versions of the functions.  This prevents the conversion to
>> uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
>> well defined in the standard.
> 
> What standard?

C99

> I don't understand the reason for this handling of negative values.  I would
> expect that, say, to_hex(-1234) would return '-' || to_hex(1234).

For this patch set, I was trying to retain the current behavior, which is
to return the two's complement representation.  I'm open to changing this
if there's agreement on what the output should be.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Add function to_oct

From

Peter Eisentraut

Date:

22 August 2023, 14:57:02

On 22.08.23 16:26, Nathan Bossart wrote:
>> I don't understand the reason for this handling of negative values.  I would
>> expect that, say, to_hex(-1234) would return '-' || to_hex(1234).
> For this patch set, I was trying to retain the current behavior, which is
> to return the two's complement representation.  I'm open to changing this
> if there's agreement on what the output should be.

Ah, if there is existing behavior then we should probably keep it.

Re: [PATCH] Add function to_oct

From

Nathan Bossart

Date:

23 August 2023, 14:54:30

On Tue, Aug 22, 2023 at 04:57:02PM +0200, Peter Eisentraut wrote:
> On 22.08.23 16:26, Nathan Bossart wrote:
>> > I don't understand the reason for this handling of negative values.  I would
>> > expect that, say, to_hex(-1234) would return '-' || to_hex(1234).
>> For this patch set, I was trying to retain the current behavior, which is
>> to return the two's complement representation.  I'm open to changing this
>> if there's agreement on what the output should be.
> 
> Ah, if there is existing behavior then we should probably keep it.

Committed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com