Re: Aggregate versions of hashing functions (md5, sha1, etc...) - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Aggregate versions of hashing functions (md5, sha1, etc...)
Date
Msg-id CAHyXU0znf84Cnv4cTdG7SQqEaXs_eiX+W4oOERq89K5g_tBk9A@mail.gmail.com
In response to Re: Aggregate versions of hashing functions (md5, sha1, etc...)  (Dominique Devienne <ddevienne@gmail.com>)
Responses Re: Aggregate versions of hashing functions (md5, sha1, etc...)
List pgsql-general
On Fri, Jul 11, 2025 at 10:17 AM Dominique Devienne <ddevienne@gmail.com> wrote:
> On Fri, Jul 11, 2025 at 6:05 PM Florents Tselai
> <florents.tselai@gmail.com> wrote:
>> On Fri, Jul 11, 2025, 18:27 Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>> [...] create an extension that incorporates the code.
>>
>> That's an ideal use case for an extension indeed.
>
> Extensions are of no use to me, unfortunately, unless built-in and
> official. So if I have to wait for v19, so be it. But the ball has to
> get rolling at least.

Right -- exactly. The problem is that cloud providers do not allow 3rd party extensions. You can work around this if your extension can be made available at the SQL level, or can be built with standard extensions.

Candidly, it's going to be tough sledding to get your needs incorporated into contrib. I'm not saying it won't happen -- let's just say holding your breath until there is a solution is not advisable. I know exactly where you're at.

This is why I built an SQL-available extension that does lz4 compression; it's the only way to compress data locally before sending it out to AWS via the S3 API.

Aside: This may be an unpopular position, but I think the postgres extension system is useless for 3rd party contributions until there is some way to distribute and install them in the vein of npm, pip, etc.

The only short-term path I know of:
1. write or find a C library that does something similar to what you need
2. compile that with emscripten so it can run under plv8
3. wrap it with plv8 function handlers

Getting it to reasonable performance is possible if you compile to WASM and work out how to amortize start-up costs. However, due to how plv8 and emscripten interact from a memory standpoint, there's a lot of copying to move data in and out, and you will have to work within a manageable buffer size, say 1 MB. If you're curious I can take you through it.
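
To make the buffer-size point concrete, here is a minimal sketch of the copy-in pattern, not the actual plv8/emscripten glue. Everything in it is an assumption for illustration: `FakeWasmHasher` is a hypothetical stand-in for a compiled module (its `heap` plays the role of the module's linear memory), 32-bit FNV-1a stands in for whatever hash the C library would provide, and `hashInChunks` and `BUFFER_SIZE` are made-up names. The only point is that a large input goes through a fixed working buffer one chunk at a time, with one copy per chunk.

```typescript
// Hypothetical working-buffer size, matching the "say 1 MB" figure above.
const BUFFER_SIZE = 1024 * 1024;

// Stand-in for a module compiled with emscripten: a fixed block of memory
// plus a streaming hash that only ever reads from that block.
class FakeWasmHasher {
  // Plays the role of the module's heap; a real build exposes its own memory.
  readonly heap = new Uint8Array(BUFFER_SIZE);
  private state = 0x811c9dc5; // FNV-1a 32-bit offset basis

  // Consume `len` bytes that the caller has already copied into `heap`.
  update(len: number): void {
    let h = this.state;
    for (let i = 0; i < len; i++) {
      h ^= this.heap[i];
      h = Math.imul(h, 0x01000193) >>> 0; // FNV prime, kept as unsigned 32-bit
    }
    this.state = h;
  }

  // Return the current hash state as 8 lowercase hex digits.
  digestHex(): string {
    return ("00000000" + this.state.toString(16)).slice(-8);
  }
}

// Feed an arbitrarily large input through the bounded buffer, chunk by chunk.
function hashInChunks(input: Uint8Array, chunkSize: number = BUFFER_SIZE): string {
  if (!Number.isInteger(chunkSize) || chunkSize < 1 || chunkSize > BUFFER_SIZE) {
    throw new RangeError("chunkSize must be an integer between 1 and BUFFER_SIZE");
  }
  const hasher = new FakeWasmHasher();
  for (let offset = 0; offset < input.length; offset += chunkSize) {
    const end = Math.min(offset + chunkSize, input.length);
    const chunk = input.subarray(offset, end); // a view, no copy yet
    hasher.heap.set(chunk); // the copy into "module" memory
    hasher.update(chunk.length);
  }
  return hasher.digestHex();
}
```

Because the hash state lives on the module side between calls, the result does not depend on the chunk size; only the number of copies does. The same init/update/finish shape is what an aggregate version of a hashing function would need.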

merlin
