Re: Add RISC-V Zbb popcount optimization - Mailing list pgsql-hackers

From Greg Burd
Subject Re: Add RISC-V Zbb popcount optimization
Msg-id 8c1287b3-4714-4b29-823e-c35ffcc5726c@app.fastmail.com
In response to Re: Add RISC-V Zbb popcount optimization  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sun, Mar 22, 2026, at 2:01 PM, Andres Freund wrote:
> Hi,
>
> On 2026-03-22 13:43:43 -0400, Greg Burd wrote:
>> On Sat, Mar 21, 2026, at 10:14 PM, John Naylor wrote:
>> > On Sat, Mar 21, 2026 at 11:56 PM Greg Burd <greg@burd.me> wrote:
>> >> Attached is a small patch that enables hardware popcount on RISC-V when available and also sets the arch flag to 'rv64gc_zbb' when appropriate.
>> >
>> > I have to ask what the point is -- isn't that like putting a 4-inch
>> > exhaust tip on a go-kart?
>> The point is to go fast, right? And to look cool (with awesome 4-inch exhaust tips) if possible! ;-P
>>
>> gburd@rv:~/ws/postgres$ gcc -O2 -o popcnt-wo-zbb riscv-popcnt.c
>> gburd@rv:~/ws/postgres$ gcc -O2 -march=rv64gc_zbb -o popcnt-zbb riscv-popcnt.c
>> gburd@rv:~/ws/postgres$ ./popcnt-wo-zbb && ./popcnt-zbb
>> sw popcount:    0.196 sec  (    510.08 MB/s)
>> hw popcount:    0.293 sec  (    341.48 MB/s)
>>
>> diff: 0.67x
>> match: 406261900 bits counted
>> sw popcount:    0.182 sec  (    548.86 MB/s)
>> hw popcount:    0.044 sec  (   2279.89 MB/s)
>>
>> diff: 4.15x
>> match: 406261900 bits counted
>>
>> But my first email/patch was incomplete/rushed; I should have followed the pattern used for similar ARM-specific logic. v2 attached along with a test program.
>
> Sure, but what PG workloads are actually affected to a meaningful degree by
> this? And are those, on riscv, actually most bottlenecked by popcount
> performance?
>
> I'm also pretty doubtful all the effort to e.g. add AVX 512 popcount was spent
> all that effectively - hard to believe there's any real world workloads where
> that gain is worth the squeeze. At least for aarch64 and x86-64 there's real
> world use of those platforms, making niche-y perf improvements somewhat
> worthwhile. Whereas there's afaict not yet a whole lot of riscv production
> adoption.
>
> Once you add CPU dispatch to the cost it gets a heck of a lot less clearly
> worthwhile. You need heuristics to decide when the dispatch cost is worth it
> and even then it's going to slow down your non-worthwhile case somewhat.
>
> That's one of the things that makes riscv's decision to put so many crucial
> features into optional extensions so annoying for people that write
> non-embedded software.

Hey Andres,

All fair points.  RISC-V is annoying, and optional CPU extensions are just one reason.  To be honest, I'm not sure it is
worth it either!  That said, this patch isn't a huge "squeeze" (or unprecedented) and it does provide some "juice" (about
4x faster in the test above).  It has the same shape as the ARM equivalent, so to me it fell into the category of things
we'd commit.

But I get it, as I said to start - all fair points.
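
For anyone reading without the attachment, here is a minimal sketch of the kind of comparison behind the numbers quoted above. It is not the attached riscv-popcnt.c; the names popcount64_sw, popcount64_hw, popcount_buf and popcount_selfcheck are made up for illustration. The assumption is that the "hw" path is simply the compiler builtin, which GCC and Clang can lower to the Zbb cpop instruction when built with -march=rv64gc_zbb and otherwise implement in software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable SWAR popcount: needs no special instructions. */
static inline int
popcount64_sw(uint64_t x)
{
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333)) +
        ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

/*
 * Compiler builtin. Assumption: with -march=rv64gc_zbb the compiler emits
 * cpop here; without it, the builtin falls back to a software routine.
 */
static inline int
popcount64_hw(uint64_t x)
{
    return __builtin_popcountll(x);
}

typedef int (*popcount64_fn) (uint64_t);

/* Count set bits in a byte buffer, 8 bytes at a time, then the tail. */
static uint64_t
popcount_buf(const unsigned char *buf, size_t len, popcount64_fn fn)
{
    uint64_t total = 0;

    while (len >= 8)
    {
        uint64_t word;

        memcpy(&word, buf, sizeof(word));   /* avoids unaligned access */
        total += (uint64_t) fn(word);
        buf += 8;
        len -= 8;
    }
    while (len > 0)
    {
        total += (uint64_t) fn(*buf++);
        len--;
    }
    return total;
}

/* Both paths must agree, like the "match:" line in the output above. */
static void
popcount_selfcheck(const unsigned char *buf, size_t len)
{
    assert(popcount_buf(buf, len, popcount64_sw) ==
           popcount_buf(buf, len, popcount64_hw));
}
```

Timing popcount_buf() over a large buffer with each function, once per -march setting, would give the same shape of comparison as the sw/hw lines above. Choosing between the two paths at run time is where the dispatch cost Andres mentions would come in; this sketch picks the path at compile time only.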

> - Andres

best.

-greg


