Home > mailing lists

Re: Sort functions with specialized comparators - Mailing list pgsql-hackers

From	Stepan Neretin
Subject	Re: Sort functions with specialized comparators
Date	July 15 12:52:31
Msg-id	CAN-sa+DoLsp18V7nZZ7q+L0_RvWsQFHOU2Hep+Hhnp+4zDk0OA@mail.gmail.com Whole thread Raw
In response to	Re: Sort functions with specialized comparators (Антуан Виолин <violin.antuan@gmail.com>)
Responses	Re: Sort functions with specialized comparators Re: Sort functions with specialized comparators
List	pgsql-hackers

Tree view

On Mon, Jul 15, 2024 at 12:23 PM Антуан Виолин <violin.antuan@gmail.com> wrote:

Hello all.
I have decided to explore more areas in which I can optimize and have added
two new benchmarks. Do you have any thoughts on this?
postgres=# select bench_int16_sort(1000000);
bench_int16_sort
-----------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
52151523 ns, Percentage difference: 21.41%
(1 row)
postgres=# select bench_float8_sort(1000000);
bench_float8_sort
------------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
74458545 ns, Percentage difference: 38.70%
(1 row)

Hello all
We would like to see the relationship between the length of the sorted array and the performance gain, perhaps some graphs. We also want to see to set a "worst case" test, sorting the array in ascending order when it is initially descending

Best, regards, Antoine Violin

postgres=#

On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin <sncfmgg@gmail.com> wrote:

On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin <sncfmgg@gmail.com> wrote:
Hello all.

I am interested in the proposed patch and would like to propose some additional changes that would complement it. My changes would introduce similar optimizations when working with a list of integers or object identifiers. Additionally, my patch includes an extension for benchmarking, which shows an average speedup of 30-40%.

postgres=# SELECT bench_oid_sort(1000000);
bench_oid_sort
----------------------------------------------------------------------------------------------------------------
Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort: 80446640 ns, Percentage difference: 31.24%
(1 row)

postgres=# SELECT bench_int_sort(1000000);
bench_int_sort
----------------------------------------------------------------------------------------------------------------
Time taken by list_sort: 118168506 ns, Time taken by list_int_sort: 80523373 ns, Percentage difference: 31.86%
(1 row)

What do you think about these changes?

Best regards, Stepan Neretin.

On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
Hi!

In a thread about sorting comparators[0] Andres noted that we have infrastructure to help compiler optimize sorting. PFA attached PoC implementation. I've checked that it indeed works on the benchmark from that thread.

postgres=# CREATE TABLE arrays_to_sort AS
SELECT array_shuffle(a) arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
generate_series(1, 10);

postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
Time: 990.199 ms
postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
Time: 696.156 ms

The benefit seems to be on the order of magnitude with 30% speedup.

There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber, Oid etc. But this sorting routines never show up in perf top or something like that.

Seems like in most cases we do not spend much time in sorting. But specialization does not cost us much too, only some CPU cycles of a compiler. I think we can further improve speedup by converting inline comparator to value extractor: more compilers will see what is actually going on. But I have no proofs for this reasoning.

What do you think?

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59

Hello all.

I have decided to explore more areas in which I can optimize and have added two new benchmarks. Do you have any thoughts on this?

postgres=# select bench_int16_sort(1000000);
bench_int16_sort
-----------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 66354981 ns, Time taken by optimized sort: 52151523 ns, Percentage difference: 21.41%
(1 row)

postgres=# select bench_float8_sort(1000000);
bench_float8_sort
------------------------------------------------------------------------------------------------------------------
Time taken by usual sort: 121475231 ns, Time taken by optimized sort: 74458545 ns, Percentage difference: 38.70%
(1 row)

postgres=#

Best regards, Stepan Neretin.

I run benchmark with my patches:
./pgbench -c 10 -j2 -t1000 -d postgres

pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 1.609 ms
initial connection time = 24.080 ms
tps = 6214.244789 (without initial connection time)

and without:
./pgbench -c 10 -j2 -t1000 -d postgres

pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 2
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
number of failed transactions: 0 (0.000%)
latency average = 1.731 ms
initial connection time = 15.177 ms
tps = 5776.173285 (without initial connection time)

tps with my patches increase. What do you think?

Best regards, Stepan Neretin.

pgsql-hackers by date:

From: Thomas Munro
Date: 15 July, 12:52:28
Subject: Re: broken JIT support on Fedora 40

From: Amit Kapila
Date: 15 July, 13:00:51
Subject: Re: long-standing data loss bug in initial sync of logical replication

Re: Sort functions with specialized comparators - Mailing list pgsql-hackers

Previous

Next