> I see that there was some discussion about a Neon implementation upthread,
> but I'm not sure we concluded anything. For popcount, we first added a
> Neon version before adding the SVE version, which required more complicated
> configure/runtime checks. Presumably Neon is available on more hardware
> than SVE, so that could be a good place to start here, too.
We have added Neon versions of hex_encode and hex_decode.
Here are the microbenchmark numbers.
hex_encode - m7g.4xlarge
 Input |  Head   |  Neon
-------+---------+--------
    32 |  18.056 |  5.957
    40 |  22.127 | 10.205
    48 |  26.214 | 14.151
    64 |  33.613 |  6.164
   128 |  66.060 | 11.372
   256 | 130.225 | 18.543
   512 | 267.105 | 33.977
  1024 | 515.603 | 64.462
hex_decode - m7g.4xlarge
 Input |  Head   |  Neon
-------+---------+---------
    32 |  26.669 |   9.462
    40 |  36.320 |  19.347
    48 |  45.971 |  19.099
    64 |  58.468 |  17.648
   128 | 113.250 |  30.437
   256 | 218.743 |  56.824
   512 | 414.133 | 107.212
  1024 | 828.493 | 210.740
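For illustration only, here is a minimal sketch of the table-lookup idea
behind a Neon hex encode. It is not the code from the patch: the name
hex_encode_sketch, the 16-byte block size, and the scalar tail loop are
assumptions made for this example. On non-AArch64 builds only the scalar
loop is compiled.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Lookup table mapping a nibble (0..15) to its lowercase hex digit. */
static const char hextbl[] = "0123456789abcdef";

/*
 * Hypothetical helper, not part of the patch.  Writes 2 * len hex digits
 * to dst (no NUL terminator) and returns the number of bytes written.
 */
static size_t
hex_encode_sketch(const uint8_t *src, size_t len, char *dst)
{
	size_t		i = 0;

#if defined(__aarch64__) && defined(__ARM_NEON)
	/* The first 16 bytes of hextbl fit exactly in one vector register. */
	const uint8x16_t tbl = vld1q_u8((const uint8_t *) hextbl);
	const uint8x16_t mask = vdupq_n_u8(0x0f);

	for (; i + 16 <= len; i += 16)
	{
		uint8x16_t	in = vld1q_u8(src + i);
		uint8x16x2_t out;

		/* Translate the high and low nibbles of each byte via the table. */
		out.val[0] = vqtbl1q_u8(tbl, vshrq_n_u8(in, 4));
		out.val[1] = vqtbl1q_u8(tbl, vandq_u8(in, mask));

		/* vst2q_u8 interleaves the two vectors: hi0 lo0 hi1 lo1 ... */
		vst2q_u8((uint8_t *) dst + 2 * i, out);
	}
#endif

	/* Scalar path for the remaining bytes (or for the whole input). */
	for (; i < len; i++)
	{
		dst[2 * i] = hextbl[src[i] >> 4];
		dst[2 * i + 1] = hextbl[src[i] & 0x0f];
	}

	return len * 2;
}
```

The caller is expected to provide a dst buffer of at least 2 * len bytes.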
> Also, I'd strongly encourage you to get involved with others' patches on
> the mailing lists (e.g., reviewing, testing). Patch submissions are great,
> but this community depends on other types of participation, too. IME
> helping others with their patches also tends to incentivize others to help
> with yours.
Sure, we will try to test and review patches in areas where we have
experience.
> On that note, I was hoping you could give us feedback on whether the
> improvement in PG18 made any difference at all in your real-world
> use-case, i.e. not just in a microbenchmark, but also including
> transmission of the hex-encoded values across the network to the
> client (that I assume must decode them again).
Yes, the improvement in v18 did help; see the attached perf graphs.
We used a Python script to send and receive binary data from Postgres.
For simple SELECT queries on a bytea column, hex_encode took 42% of the
query execution time in v17. This dropped to 33% in v18, resulting in
around an 18% improvement in overall query time.
The proposed patch further reduces hex_encode's share to 5.6%, for
another 25% improvement in total query time.
We observed similar improvements for INSERT queries on the bytea column.
hex_decode's share of execution time decreased from 15.5% to 5.5%, a 5-8%
query-level improvement depending on which storage type is used.
------
Chiranmoy