Thread: encode/decode support for base64url
Hello,
Sometimes support for base64url from RFC 4648 would be useful.
Does anyone else need a patch like this?
Przemysław Sztoch | Mobile +48 509 99 00 66
> On 4 Mar 2025, at 09:54, Przemysław Sztoch <przemyslaw@sztoch.pl> wrote:
> Sometimes support for base64url from RFC 4648 would be useful.
> Does anyone else need a patch like this?

While not a frequent ask, it has been mentioned in the past. I think it would
make sense to add so please do submit a patch for it for consideration.

--
Daniel Gustafsson
Hi,

> > Sometimes support for base64url from RFC 4648 would be useful.
> > Does anyone else need a patch like this?
>
> While not a frequent ask, it has been mentioned in the past. I think it would
> make sense to add so please do submit a patch for it for consideration.

IMO it would be nice to have.

Would you like to submit such a patch or are you merely suggesting an
idea for others to implement?

--
Best regards,
Aleksander Alekseev
On 7 Mar 2025, at 4:40 PM, Aleksander Alekseev <aleksander@timescale.com> wrote:
> Hi,
> > > Sometimes support for base64url from RFC 4648 would be useful.
> > > Does anyone else need a patch like this?
> > While not a frequent ask, it has been mentioned in the past. I think it would
> > make sense to add so please do submit a patch for it for consideration.
> IMO it would be nice to have.
> Would you like to submit such a patch or are you merely suggesting an
> idea for others to implement?
> --
> Best regards,
> Aleksander Alekseev

Just to confirm:
In a plain SQL flavor, we’re talking about something like this, correct?

CREATE FUNCTION base64url_encode(input bytea) RETURNS text AS $$
SELECT regexp_replace(replace(replace(encode(input, 'base64'), '+', '-'), '/', '_'), '=+$', '', 'g');
$$ LANGUAGE sql IMMUTABLE;

CREATE FUNCTION base64url_decode(input text) RETURNS bytea AS $$
SELECT decode(rpad(replace(replace(input, '-', '+'), '_', '/'), (length(input) + 3) & ~3, '='), 'base64');
$$ LANGUAGE sql IMMUTABLE;

With minimal testing, this yields the same results as https://base64.guru/standards/base64url/encode

select base64url_encode('post+gres');
 base64url_encode
------------------
 cG9zdCtncmVz
(1 row)
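For readers who prefer to see the same two transformations outside SQL, here is a minimal C sketch of what the functions above do to the text: swap the two alphabet characters, strip trailing '=' on the way out, and round the length up to a multiple of 4 on the way back in. The names b64_to_b64url and b64url_padded_len are made up for this illustration and are not part of any proposed patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite a padded base64 string as base64url, in place.
 * Swaps '+' for '-' and '/' for '_', then drops any trailing '='.
 * Returns the new string length.
 */
size_t
b64_to_b64url(char *s)
{
    size_t len = strlen(s);

    for (size_t i = 0; i < len; i++)
    {
        if (s[i] == '+')
            s[i] = '-';
        else if (s[i] == '/')
            s[i] = '_';
    }

    /* Strip the '=' padding, which base64url output omits here. */
    while (len > 0 && s[len - 1] == '=')
        len--;
    s[len] = '\0';

    return len;
}

/*
 * Hypothetical helper: the length an unpadded string has once '=' padding is
 * restored, i.e. len rounded up to a multiple of 4. This is the same
 * arithmetic as (length(input) + 3) & ~3 in the SQL version.
 */
size_t
b64url_padded_len(size_t len)
{
    return (len + 3) & ~(size_t) 3;
}
```

With these helpers, "abc+/w==" becomes "abc-_w" (length 6), and a 6-character input pads back out to 8. One caveat that I believe applies to the SQL version: encode(..., 'base64') breaks long output into 76-character lines, and neither the SQL functions nor this sketch remove those line breaks.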
> On 10 Mar 2025, at 12:28, Florents Tselai <florents.tselai@gmail.com> wrote:
> Here's a C implementation for this, along with some tests and documentation.
> Tests are copied from cpython's implementation of urlsafe_b64encode and urlsafe_b64decode.
+ <function>base64url_encode</function> ( <parameter>input</parameter> <type>bytea</type> )
Shouldn't this be modelled around how base64 works with the encode() and
decode() functions, i.e. encode('123\001', 'base64')?
https://www.postgresql.org/docs/devel/functions-binarystring.html
--
Daniel Gustafsson
> Oh well - you're probably right.
> I guess I was blinded by my convenience.
> Adding a 'base64url' option there is more appropriate.

I agree with it too. It is neater to add "base64url" as a new option for
encode() and decode() SQL functions in encode.c.

In addition, you may also want to add the C versions of base64url encode
and decode functions to "src/common/base64.c" as new API calls so that
the frontend, backend applications and extensions can also have access
to these base64url conversions.

Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca
> Hi,
> > > Sometimes support for base64url from RFC 4648 would be useful.
> > > Does anyone else need a patch like this?
> > While not a frequent ask, it has been mentioned in the past. I think it would
> > make sense to add so please do submit a patch for it for consideration.
> IMO it would be nice to have.
> Would you like to submit such a patch or are you merely suggesting an
> idea for others to implement?
1. This is my current workaround:
SELECT convert_from(decode(rpad(translate(jwt_data, E'-_\n', '+/'), (ceil(length(translate(jwt_data, E'-_\n', '+/')) / 4::float) * 4)::integer, '='::text), 'base64'), 'UTF-8')::jsonb AS jwt_json
But it's not very elegant. I won't propose my own patch, but if someone does it, I'll be very grateful for it. :-)
2. My colleagues also have a proposal to add hex_space, dec and dec_space.
hex_space and dec_space would improve readability in some situations.
dec and dec_space are also sometimes much more convenient for humans debugging and interpreting binary data.
3. In addition to base64, base32 (also from RFC 4648) would sometimes be useful, since it doesn't have such problems:
The resulting character set is all one case, which can often be beneficial when using a case-insensitive filesystem, DNS names, spoken language, or human memory. The result can be used as a file name because it cannot possibly contain the '/' symbol, which is the Unix path separator.
Przemysław Sztoch | Mobile +48 509 99 00 66
On Tue, Mar 11, 2025 at 12:51 AM Cary Huang <cary.huang@highgo.ca> wrote:
> > Oh well - you're probably right.
> > I guess I was blinded by my convenience.
> > Adding a 'base64url' option there is more appropriate.
> I agree with it too. It is neater to add "base64url" as a new option for
> encode() and decode() SQL functions in encode.c.

Attaching a v2 with that.

> In addition, you may also want to add the C versions of base64url encode
> and decode functions to "src/common/base64.c" as new API calls so that
> the frontend, backend applications and extensions can also have access
> to these base64url conversions.

We could expose this in base64.c - it'll need some more checking.
A few more test cases, especially around padding, are necessary.
I'll come back to this.
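To make the padding question concrete, here is a rough standalone C sketch of a decoder for unpadded base64url. It is only an illustration of the cases that need testing, not the code from the attached patch; the names b64url_value and b64url_decode are hypothetical. In this sketch an input length of 1 modulo 4 is rejected because no byte sequence encodes to it, any character outside the URL-safe alphabet (including '+', '/' and '=') is an error, and leftover low-order bits in the final character are simply ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one base64url character to its 6-bit value, or -1 if it is invalid. */
int
b64url_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return c - 'A';
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 26;
    if (c >= '0' && c <= '9')
        return c - '0' + 52;
    if (c == '-')
        return 62;
    if (c == '_')
        return 63;
    return -1;
}

/*
 * Hypothetical decoder for unpadded base64url.
 * dst must have room for at least (len * 3) / 4 bytes.
 * Returns the number of bytes written, or -1 on invalid input.
 */
long
b64url_decode(const char *src, size_t len, uint8_t *dst)
{
    uint32_t buf = 0;   /* bit accumulator */
    int      bits = 0;  /* number of pending bits in buf */
    size_t   out = 0;

    /* A single trailing character carries only 6 bits: never valid. */
    if (len % 4 == 1)
        return -1;

    for (size_t i = 0; i < len; i++)
    {
        int v = b64url_value((unsigned char) src[i]);

        if (v < 0)
            return -1;

        buf = (buf << 6) | (uint32_t) v;
        bits += 6;

        if (bits >= 8)
        {
            bits -= 8;
            dst[out++] = (uint8_t) (buf >> bits);
            buf &= (1u << bits) - 1;    /* keep only the leftover bits */
        }
    }

    return (long) out;
}
```

Under these assumptions "YQ", "YWI", "YWJj" and "YWJjZA" decode to "a", "ab", "abc" and "abcd", which covers every residue of the input length that can occur, while "a" and "ab+/" are rejected.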
Hi Florents,

> Here's a v3 with some (hopefully) better test cases.

Thanks for the new version of the patch.

```
+    encoded_len = pg_base64_encode(src, len, dst);
+
+    /* Convert Base64 to Base64URL */
+    for (uint64 i = 0; i < encoded_len; i++) {
+        if (dst[i] == '+')
+            dst[i] = '-';
+        else if (dst[i] == '/')
+            dst[i] = '_';
+    }
```

Although it is a possible implementation, wouldn't it be better to
parametrize pg_base64_encode instead of traversing the string twice?
Same for pg_base64_decode. You can refactor pg_base64_encode and make
it a wrapper for pg_base64_encode_impl if needed.

```
+-- Flaghsip Test case against base64.
+-- Notice the = padding removed at the end and special chars.
+SELECT encode('\x69b73eff', 'base64');  -- Expected: abc+/w==
+  encode
+----------
+ abc+/w==
+(1 row)
+
+SELECT encode('\x69b73eff', 'base64url');  -- Expected: abc-_w
+ encode
+--------
+ abc-_w
+(1 row)
```

I get the idea, but calling base64 is redundant IMO. It only takes
several CPU cycles during every test run without much value. I suggest
removing it and testing corner cases for base64url instead, which is
missing at the moment. Particularly there should be tests for
encoding/decoding strings of 0/1/2/3/4 characters and making sure that
decode(encode(x)) = x, always. On top of that you should cover with
tests the cases of invalid output for decode().

--
Best regards,
Aleksander Alekseev
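As a rough illustration of the parametrized, single-pass shape suggested above, here is a standalone C sketch. It is not the PostgreSQL code: b64_encode_impl, b64_encode and b64url_encode are hypothetical names, and the sketch makes its own assumptions (it NUL-terminates the output, never inserts line breaks, and omits '=' padding in the URL variant). The point is only that one flag selects the alphabet and the padding behaviour, so the output is written once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static const char b64_std[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const char b64_url[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/*
 * Hypothetical shared implementation: encode src in a single pass.
 * "url" selects the URL-safe alphabet and suppresses '=' padding.
 * dst must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written (excluding the NUL).
 */
size_t
b64_encode_impl(const uint8_t *src, size_t len, char *dst, bool url)
{
    const char *alphabet = url ? b64_url : b64_std;
    size_t      out = 0;
    size_t      i = 0;

    /* Full 3-byte groups become 4 characters each. */
    for (; i + 3 <= len; i += 3)
    {
        uint32_t v = ((uint32_t) src[i] << 16) |
                     ((uint32_t) src[i + 1] << 8) |
                     (uint32_t) src[i + 2];

        dst[out++] = alphabet[(v >> 18) & 0x3f];
        dst[out++] = alphabet[(v >> 12) & 0x3f];
        dst[out++] = alphabet[(v >> 6) & 0x3f];
        dst[out++] = alphabet[v & 0x3f];
    }

    /* A trailing group of 1 or 2 bytes becomes 2 or 3 characters. */
    if (i < len)
    {
        size_t   rem = len - i;
        uint32_t v = (uint32_t) src[i] << 16;

        if (rem == 2)
            v |= (uint32_t) src[i + 1] << 8;

        dst[out++] = alphabet[(v >> 18) & 0x3f];
        dst[out++] = alphabet[(v >> 12) & 0x3f];
        if (rem == 2)
            dst[out++] = alphabet[(v >> 6) & 0x3f];
        else if (!url)
            dst[out++] = '=';
        if (!url)
            dst[out++] = '=';
    }

    dst[out] = '\0';
    return out;
}

/* Thin wrappers, so existing callers would not need a new argument. */
size_t
b64_encode(const uint8_t *src, size_t len, char *dst)
{
    return b64_encode_impl(src, len, dst, false);
}

size_t
b64url_encode(const uint8_t *src, size_t len, char *dst)
{
    return b64_encode_impl(src, len, dst, true);
}
```

For the bytes 69 b7 3e ff this gives "abc+/w==" through b64_encode and "abc-_w" through b64url_encode, matching the expected values in the quoted test. Inputs of 0, 1, 2, 3 and 4 bytes exercise both branches, which is the range of corner cases asked for above.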