Re: [PATCH] Hex-coding optimizations using SVE on ARM. - Mailing list pgsql-hackers

From Chiranmoy.Bhattacharya@fujitsu.com
Subject Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date
Msg-id OS9PR01MB15185B278E343A9BA5F0F6AB19700A@OS9PR01MB15185.jpnprd01.prod.outlook.com
In response to Re: [PATCH] Hex-coding optimizations using SVE on ARM.  (John Naylor <johncnaylorls@gmail.com>)
List pgsql-hackers
> I see that there was some discussion about a Neon implementation upthread,
> but I'm not sure we concluded anything.  For popcount, we first added a
> Neon version before adding the SVE version, which required more complicated
> configure/runtime checks.  Presumably Neon is available on more hardware
> than SVE, so that could be a good place to start here, too.

We have added the Neon versions of hex encode/decode.
Here are the microbenchmark numbers.

hex_encode - m7g.4xlarge
 Input |  Head  |  Neon
-------+--------+--------
    32 | 18.056 |  5.957
    40 | 22.127 | 10.205
    48 | 26.214 | 14.151
    64 | 33.613 |  6.164
   128 | 66.060 | 11.372
   256 |130.225 | 18.543
   512 |267.105 | 33.977
  1024 |515.603 | 64.462

hex_decode - m7g.4xlarge
 Input |  Head  |  Neon
-------+--------+--------
    32 | 26.669 |  9.462
    40 | 36.320 | 19.347
    48 | 45.971 | 19.099
    64 | 58.468 | 17.648
   128 |113.250 | 30.437
   256 |218.743 | 56.824
   512 |414.133 |107.212
  1024 |828.493 |210.740
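
For readers unfamiliar with the technique, here is a minimal sketch of how a
Neon hex encoder can process 16 input bytes per iteration. It is an
illustration only and not the code from the attached patch: the names
hex_encode_sketch and hex_encode_scalar_sketch are hypothetical, and handing
the tail to the scalar loop is an assumption made for brevity. On non-AArch64
builds only the scalar path is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

static const char hextbl[] = "0123456789abcdef";

/* Scalar reference: two table lookups per input byte. */
static size_t
hex_encode_scalar_sketch(const uint8_t *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = hextbl[src[i] >> 4];        /* high nibble */
        dst[2 * i + 1] = hextbl[src[i] & 0x0f];  /* low nibble */
    }
    return len * 2;
}

/*
 * Hypothetical Neon variant: 16 bytes in, 32 hex digits out per iteration.
 * dst must have room for 2 * len bytes; no NUL terminator is written.
 */
static size_t
hex_encode_sketch(const uint8_t *src, size_t len, char *dst)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL);

#if defined(__aarch64__)
    /* The 16 hex digits fit exactly in one vector used as a lookup table. */
    const uint8x16_t tbl = vld1q_u8((const uint8_t *) hextbl);
    const uint8x16_t mask = vdupq_n_u8(0x0f);

    for (; i + 16 <= len; i += 16)
    {
        uint8x16_t in = vld1q_u8(src + i);
        uint8x16x2_t out;

        out.val[0] = vqtbl1q_u8(tbl, vshrq_n_u8(in, 4));   /* high digits */
        out.val[1] = vqtbl1q_u8(tbl, vandq_u8(in, mask));  /* low digits */

        /* vst2q_u8 interleaves val[0] and val[1]: hi0 lo0 hi1 lo1 ... */
        vst2q_u8((uint8_t *) dst + 2 * i, out);
    }
#endif

    /* Remaining bytes (or the whole input without Neon) go through scalar. */
    hex_encode_scalar_sketch(src + i, len - i, dst + 2 * i);
    return len * 2;
}
```

With this layout, a 40-byte input would take two vector iterations followed
by an 8-byte scalar tail.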


> Also, I'd strongly encourage you to get involved with others' patches on
> the mailing lists (e.g., reviewing, testing).  Patch submissions are great,
> but this community depends on other types of participation, too.  IME
> helping others with their patches also tends to incentivize others to help
> with yours.

Sure, we will try to test and review patches in areas where we have experience.


> On that note, I was hoping you could give us feedback on whether the
> improvement in PG18 made any difference at all in your real-world
> use-case, i.e. not just in a microbenchmark, but also including
> transmission of the hex-encoded values across the network to the
> client (that I assume must decode them again).

Yes, the improvement in v18 did help; see the attached perf graphs.
We used a Python script to send binary data to PostgreSQL and receive it back.
For simple SELECT queries on a bytea column, hex_encode accounted for
42% of the query execution time in v17. This dropped to 33% in v18,
resulting in around an 18% improvement in overall query time.
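
To make the client-side half of that round trip concrete: with the default
bytea_output = hex, a bytea value arrives in text form as \x followed by two
hex digits per byte, and the client has to turn it back into bytes. A minimal
sketch of that step follows. It is not taken from our test script; the names
hex_digit_value and bytea_hex_text_decode_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int
hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Decode the text form "\x<hex digits>" into dst.
 * Returns the number of bytes written, or -1 on malformed input.
 * dst must have room for (textlen - 2) / 2 bytes.
 */
static long
bytea_hex_text_decode_sketch(const char *text, size_t textlen, uint8_t *dst)
{
    assert(text != NULL && dst != NULL);

    /* Require the "\x" prefix and an even number of digits after it. */
    if (textlen < 2 || text[0] != '\\' || text[1] != 'x')
        return -1;
    if ((textlen - 2) % 2 != 0)
        return -1;

    size_t nbytes = (textlen - 2) / 2;

    for (size_t i = 0; i < nbytes; i++)
    {
        int hi = hex_digit_value(text[2 + 2 * i]);
        int lo = hex_digit_value(text[3 + 2 * i]);

        if (hi < 0 || lo < 0)
            return -1;
        dst[i] = (uint8_t) ((hi << 4) | lo);
    }
    return (long) nbytes;
}
```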

The proposed patch further reduces hex_encode's share of query execution
time to 5.6%, which is another 25% improvement in total query time.

We observed similar improvements for INSERT queries on the bytea column.
hex_decode's share of query execution time decreased from 15.5% to 5.5%,
a 5-8% query-level improvement depending on which storage type is used.

------
Chiranmoy
