Thread: [PATCH] Add get_bytes() and set_bytes() functions

[PATCH] Add get_bytes() and set_bytes() functions

From
Aleksander Alekseev
Date:
Hi,

While discussing another patch [1] it was discovered that we don't
have a convenient way of casting a bytea to an integer / bigint and
vice versa, extracting integers larger than one byte from byteas, etc.

For instance, casting '\x11223344' to 0x11223344 may look like this:

```
WITH vals AS (
      SELECT '\x11223344'::bytea AS x
)
SELECT
    (get_byte(x, 0) :: bigint << 24) |
    (get_byte(x, 1) << 16) |
    (get_byte(x, 2) << 8) |
    get_byte(x, 3)
FROM vals;
```

There seems to be a demand for this functionality [2][3] and it costs
us nothing to maintain it, so I propose adding it.

The proposed patch adds get_bytes() and set_bytes() functions. The
semantics is similar to get_byte() and set_byte() we already have but
the functions operate with bigints rather than bytes and the user can
specify the size of the integer. This allows working with int2s,
int4s, int8s or even int5s if needed.

Examples:

```
SELECT get_bytes('\x1122334455667788'::bytea, 1, 2) = 0x2233;
 ?column?
----------
 t

SELECT set_bytes('\x1122334455667788'::bytea, 1, 2, 0xAABB);
     set_bytes
--------------------
 \x11aabb4455667788
```

Thoughts?

[1]: https://postgr.es/m/CAJ7c6TNMTGnqnG%3DyXXUQh9E88JDckmR45H2Q%2B%3DucaCLMOW1QQw%40mail.gmail.com
[2]: https://stackoverflow.com/questions/32944267/postgresql-converting-bytea-to-bigint
[3]: https://postgr.es/m/AANLkTikip9xs8iXc8e%2BMgz1T1701i8Xk6QtbVB3KJQzX%40mail.gmail.com
-- 
Best regards,
Aleksander Alekseev

Attachment

Re: [PATCH] Add get_bytes() and set_bytes() functions

From
"Joel Jacobson"
Date:
On Wed, Aug 14, 2024, at 13:01, Aleksander Alekseev wrote:
> The proposed patch adds get_bytes() and set_bytes() functions. The
> semantics is similar to get_byte() and set_byte() we already have but
> the functions operate with bigints rather than bytes and the user can
> specify the size of the integer. This allows working with int2s,
> int4s, int8s or even int5s if needed.

+1

I wanted this myself many times.

I wonder if get_bytes() and set_bytes() will behave differently
on little-endian vs big-endian systems?

If so, then I think it would be nice to enforce a consistent byte order
(e.g., big-endian), to ensure consistent behavior across platforms.

Regards,
Joel



Re: [PATCH] Add get_bytes() and set_bytes() functions

From
Aleksander Alekseev
Date:
Hi,

> +1
>
> I wanted this myself many times.
>
> I wonder if get_bytes() and set_bytes() will behave differently
> on little-endian vs big-endian systems?
>
> If so, then I think it would be nice to enforce a consistent byte order
> (e.g., big-endian), to ensure consistent behavior across platforms.

No, the returned value will not depend on the CPU endiness. Current
implementation uses big-endian / network order which in my humble
opinion is what most users would expect.

I believe we also need reverse(bytea) and repeat(bytea, integer)
functions e.g. for those who want little-endian. However I want to
propose them separately when we are done with this patch.

-- 
Best regards,
Aleksander Alekseev