Re: Sort functions with specialized comparators - Mailing list pgsql-hackers

From Stepan Neretin
Subject Re: Sort functions with specialized comparators
Date
Msg-id CAN-sa+DoLsp18V7nZZ7q+L0_RvWsQFHOU2Hep+Hhnp+4zDk0OA@mail.gmail.com
Whole thread Raw
In response to Re: Sort functions with specialized comparators  (Антуан Виолин <violin.antuan@gmail.com>)
Responses Re: Sort functions with specialized comparators
Re: Sort functions with specialized comparators
List pgsql-hackers


On Mon, Jul 15, 2024 at 12:23 PM Антуан Виолин <violin.antuan@gmail.com> wrote:

Hello all.

I have decided to explore more areas in which I can optimize and have added
two new benchmarks. Do you have any thoughts on this?

postgres=# select bench_int16_sort(1000000);
bench_int16_sort

-----------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
52151523 ns, Percentage difference: 21.41%
(1 row)

postgres=# select bench_float8_sort(1000000);
bench_float8_sort

------------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
74458545 ns, Percentage difference: 38.70%
(1 row)


 Hello all
We would like to see the relationship between the length of the sorted array and the performance gain, perhaps some graphs. We also want to see to set a "worst case" test, sorting the array in ascending order when it is initially descending

Best, regards, Antoine Violin

postgres=#


On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin <sncfmgg@gmail.com> wrote:


On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin <sncfmgg@gmail.com> wrote:
Hello all.

I am interested in the proposed patch and would like to propose some additional changes that would complement it. My changes would introduce similar optimizations when working with a list of integers or object identifiers. Additionally, my patch includes an extension for benchmarking, which shows an average speedup of 30-40%.

postgres=# SELECT bench_oid_sort(1000000);
                                                 bench_oid_sort                                                
----------------------------------------------------------------------------------------------------------------
 Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort: 80446640 ns, Percentage difference: 31.24%
(1 row)

postgres=# SELECT bench_int_sort(1000000);
                                                 bench_int_sort                                                
----------------------------------------------------------------------------------------------------------------
 Time taken by list_sort: 118168506 ns, Time taken by list_int_sort: 80523373 ns, Percentage difference: 31.86%
(1 row)

What do you think about these changes?

Best regards, Stepan Neretin.

On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
Hi!

In a thread about sorting comparators[0] Andres noted that we have infrastructure to help compiler optimize sorting. PFA attached PoC implementation. I've checked that it indeed works on the benchmark from that thread.

postgres=# CREATE TABLE arrays_to_sort AS
   SELECT array_shuffle(a) arr
   FROM
       (SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
       generate_series(1, 10);

postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
Time: 990.199 ms
postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
Time: 696.156 ms

The benefit seems to be on the order of magnitude with 30% speedup.

There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber, Oid etc. But this sorting routines never show up in perf top or something like that.

Seems like in most cases we do not spend much time in sorting. But specialization does not cost us much too, only some CPU cycles of a compiler. I think we can further improve speedup by converting inline comparator to value extractor: more compilers will see what is actually going on. But I have no proofs for this reasoning.

What do you think?


Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59

Hello all.

I have decided to explore more areas in which I can optimize and have added two new benchmarks. Do you have any thoughts on this?

postgres=# select bench_int16_sort(1000000);
                                                bench_int16_sort                                                
-----------------------------------------------------------------------------------------------------------------
 Time taken by usual sort: 66354981 ns, Time taken by optimized sort: 52151523 ns, Percentage difference: 21.41%
(1 row)

postgres=# select bench_float8_sort(1000000);
                                                bench_float8_sort                                                
------------------------------------------------------------------------------------------------------------------
 Time taken by usual sort: 121475231 ns, Time taken by optimized sort: 74458545 ns, Percentage difference: 38.70%
(1 row)

postgres=#
 
Best regards, Stepan Neretin.


I run benchmark with my patches:
./pgbench -c 10 -j2 -t1000 -d postgres

pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 1.609 ms
initial connection time = 24.080 ms
tps = 6214.244789 (without initial connection time)

and without:
./pgbench -c 10 -j2 -t1000 -d postgres

pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 1.731 ms
initial connection time = 15.177 ms
tps = 5776.173285 (without initial connection time)

tps with my patches increase. What do you think?

Best regards, Stepan Neretin.

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: broken JIT support on Fedora 40
Next
From: Amit Kapila
Date:
Subject: Re: long-standing data loss bug in initial sync of logical replication