Re: Asynchronous MergeAppend - Mailing list pgsql-hackers

From Matheus Alcantara
Subject Re: Asynchronous MergeAppend
Date
Msg-id DDZ2ULUYDQJ4.MXMP02V4GIG@gmail.com
Whole thread Raw
In response to Asynchronous MergeAppend  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
List pgsql-hackers
Hi, thanks for working on this!

On Tue Aug 20, 2024 at 6:14 AM -03, Alexander Pyhalov wrote:
>> In addition, I have a question about testing your feature on a
>> benchmark. Are you going to do this?
>>
>
> The main reason for this work is a dramatic performance degradation when
> Append plans with async foreign scan nodes are switched to MergeAppend
> plans with synchronous foreign scans.
>
> I've performed some synthetic tests to prove the benefits of async Merge
> Append. So far tests are performed on one physical host.
>
> For tests I've deployed 3 PostgreSQL instances on ports 5432-5434.
>
> The first instance:
> create server s2 foreign data wrapper postgres_fdw OPTIONS ( port
> '5433', dbname 'postgres', async_capable 'on');
> create server s3 foreign data wrapper postgres_fdw OPTIONS ( port
> '5434', dbname 'postgres', async_capable 'on');
>
> create foreign table players_p1 partition of players for values with
> (modulus 4, remainder 0) server s2;
> create foreign table players_p2 partition of players for values with
> (modulus 4, remainder 1) server s2;
> create foreign table players_p3 partition of players for values with
> (modulus 4, remainder 2) server s3;
> create foreign table players_p4 partition of players for values with
> (modulus 4, remainder 3) server s3;
>
> s2 instance:
> create table players_p1  (id int, name text, score int);
> create table players_p2  (id int, name text, score int);
> create index on players_p1(score);
> create index on players_p2(score);
>
> s3 instance:
> create table players_p3  (id int, name text, score int);
> create table players_p4  (id int, name text, score int);
> create index on players_p3(score);
> create index on players_p4(score);
>
> s1 instance:
> insert into players select i, 'player_' ||i, random()* 100 from
> generate_series(1,100000) i;
>
> pgbench script:
> \set rnd_offset random(0,200)
> \set rnd_limit  random(10,20)
>
> select * from players order by score desc offset :rnd_offset limit
> :rnd_limit;
>
> pgbench was run as:
> pgbench -n -f 1.sql  postgres -T 100 -c 16 -j 16
>
> CPU idle was about 5-10%.
>
> pgbench results:
>
> [...]
> However, if we set number of threads to 1, so that CPU has idle cores,
> we'll see more evident improvements:
>
> Patched, async_capable on:
> pgbench (14.13, server 18devel)
> transaction type: 1.sql
> scaling factor: 1
> query mode: simple
> number of clients: 1
> number of threads: 1
> duration: 100 s
> number of transactions actually processed: 20221
> latency average = 4.945 ms
> initial connection time = 7.035 ms
> tps = 202.221816 (without initial connection time)
>
>
> Patched, async_capable off
> transaction type: 1.sql
> scaling factor: 1
> query mode: simple
> number of clients: 1
> number of threads: 1
> duration: 100 s
> number of transactions actually processed: 14941
> latency average = 6.693 ms
> initial connection time = 7.037 ms
> tps = 149.415688 (without initial connection time)
>
I ran some benchmarks based on v4 attached by Alvaro in [1] using a
smaller number of threads so that some CPU cores would be idle and I
also obtained better results:

Patched, async_capable on:
tps = 4301.567405

Master, async_capable on:
tps = 3847.084545

So I'm +1 for the idea. I know it's been while since the last patch, and
unfortunully it hasn't received reviews since then. Do you still plan to
work on it? I still need to take a look on the code to see if I can help
with some comments.

During the tests I got compiler errors due to fce7c73fba4, so I'm
attaching a v5 with guc_parameters.dat correctly sorted.

The postgres_fdw/regress tests was also failling due to some whitespace
problems, v5 also fix this.

[1] https://www.postgresql.org/message-id/202510251154.isknefznk566%40alvherre.pgsql

--
Matheus Alcantara



Attachment

pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: Fix typo in Vietnamese translation file
Next
From: John Naylor
Date:
Subject: Re: tuple radix sort