Thread: [PATCH] Add get_bytes() and set_bytes() functions
Hi, While discussing another patch [1] it was discovered that we don't have a convenient way of casting a bytea to an integer / bigint and vice versa, extracting integers larger than one byte from byteas, etc. For instance, casting '\x11223344' to 0x11223344 may look like this: ``` WITH vals AS ( SELECT '\x11223344'::bytea AS x ) SELECT (get_byte(x, 0) :: bigint << 24) | (get_byte(x, 1) << 16) | (get_byte(x, 2) << 8) | get_byte(x, 3) FROM vals; ``` There seems to be a demand for this functionality [2][3] and it costs us nothing to maintain it, so I propose adding it. The proposed patch adds get_bytes() and set_bytes() functions. The semantics is similar to get_byte() and set_byte() we already have but the functions operate with bigints rather than bytes and the user can specify the size of the integer. This allows working with int2s, int4s, int8s or even int5s if needed. Examples: ``` SELECT get_bytes('\x1122334455667788'::bytea, 1, 2) = 0x2233; ?column? ---------- t SELECT set_bytes('\x1122334455667788'::bytea, 1, 2, 0xAABB); set_bytes -------------------- \x11aabb4455667788 ``` Thoughts? [1]: https://postgr.es/m/CAJ7c6TNMTGnqnG%3DyXXUQh9E88JDckmR45H2Q%2B%3DucaCLMOW1QQw%40mail.gmail.com [2]: https://stackoverflow.com/questions/32944267/postgresql-converting-bytea-to-bigint [3]: https://postgr.es/m/AANLkTikip9xs8iXc8e%2BMgz1T1701i8Xk6QtbVB3KJQzX%40mail.gmail.com -- Best regards, Aleksander Alekseev
Attachment
On Wed, Aug 14, 2024, at 13:01, Aleksander Alekseev wrote: > The proposed patch adds get_bytes() and set_bytes() functions. The > semantics is similar to get_byte() and set_byte() we already have but > the functions operate with bigints rather than bytes and the user can > specify the size of the integer. This allows working with int2s, > int4s, int8s or even int5s if needed. +1 I wanted this myself many times. I wonder if get_bytes() and set_bytes() will behave differently on little-endian vs big-endian systems? If so, then I think it would be nice to enforce a consistent byte order (e.g., big-endian), to ensure consistent behavior across platforms. Regards, Joel
Hi, > +1 > > I wanted this myself many times. > > I wonder if get_bytes() and set_bytes() will behave differently > on little-endian vs big-endian systems? > > If so, then I think it would be nice to enforce a consistent byte order > (e.g., big-endian), to ensure consistent behavior across platforms. No, the returned value will not depend on the CPU endiness. Current implementation uses big-endian / network order which in my humble opinion is what most users would expect. I believe we also need reverse(bytea) and repeat(bytea, integer) functions e.g. for those who want little-endian. However I want to propose them separately when we are done with this patch. -- Best regards, Aleksander Alekseev