Subject changed. Earlier it was: spin_delay() for ARM
On Fri, 17 Apr 2020 at 22:54, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Apr 16, 2020 at 3:18 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> > Not relevant to the PAUSE stuff .... Note that when the parallel
> > clients reach from 24 to 32 (which equals the machine CPUs), the TPS
> > shoots from 454189 to 1097592 which is more than double speed gain
> > with just a 30% increase in parallel sessions.
For reference, the TPS figures can be seen here:
https://www.postgresql.org/message-id/CAJ3gD9e86GY%3DQfyfZQkb11Z%2BCVWowDiGgGThzKKwHDGU9uA2yA%40mail.gmail.com
>
> I've seen stuff like this too. For instance, check out the graph from
> this 2012 blog post:
>
> http://rhaas.blogspot.com/2012/04/did-i-say-32-cores-how-about-64.html
>
> You can see that the performance growth is basically on a straight
> line up to about 16 cores, but then it kinks downward until about 28,
> after which it kinks sharply upward until about 36 cores.
>
> I think this has something to do with the process scheduling behavior
> of Linux, because I vaguely recall some discussion where somebody did
> benchmarking on the same hardware on both Linux and one of the BSD
> systems, and the effect didn't appear on BSD. They had other problems,
> like a huge drop-off at higher core counts, but they didn't have that
> effect.
Ah I see.
By the way, I have observed this behaviour on both x86 and ARM,
regardless of whether the CPU count is as low as 8 or as high as 32.
In my case, though, I suspect it's a combination of the Linux scheduler
and interactions between the backends and the pgbench clients.
So I used a custom script. I used the same point query that is
used with the -S option, but that query is run again and again on the
server side without the client having to send it again and again, so
the pgbench clients are idle most of the time.
Query used:
select foo(300000);
where foo(int) is defined as:
create or replace function foo(iterations int) returns int as $$
declare
    id int; ret int; counter int = 0;
begin
    -- Run the point query 'iterations' times on the server side.
    WHILE counter < iterations
    LOOP
        counter = counter + 1;
        -- Random aid; 3000000 rows corresponds to scale factor 30.
        id = random() * 3000000;
        select into ret aid from pgbench_accounts where aid = id;
    END LOOP;
    return ret;
end $$ language plpgsql;
Below are the results for scale factor 30, with 8 CPUs (each
transaction is one foo(300000) call):
Clients    TPS
      2    1.255327
      4    2.414139
      6    3.532937
      8    4.586583
     10    4.557575
     12    4.517226
     14    4.551455
     18    4.593271
You can see that the TPS rises almost linearly with the number of
clients, with no deviation in between, up to 8 clients. Beyond that
it stops rising because the CPUs are fully utilized.
In this custom case as well, the behaviour is the same on both x86 and
ARM, regardless of whether there are 8 CPUs or 32 CPUs.
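
For anyone who wants to check the scaling in the 8-CPU numbers above, here
is a minimal Python sketch. The figures are copied from the table; the
helper names (per_client_tps, plateau_spread) are only illustrative and
are not part of pgbench or of the test setup.

```python
# TPS figures copied from the 8-CPU, scale-factor-30 results above.
# Keys are pgbench client counts, values are the reported TPS.
results = {
    2: 1.255327, 4: 2.414139, 6: 3.532937, 8: 4.586583,
    10: 4.557575, 12: 4.517226, 14: 4.551455, 18: 4.593271,
}
CPUS = 8  # number of CPUs on the test machine


def per_client_tps(results):
    """TPS divided by client count; roughly constant if scaling is linear."""
    return {clients: tps / clients for clients, tps in results.items()}


def plateau_spread(results, cpus):
    """Relative spread (max - min) / min of TPS once clients >= cpus."""
    saturated = [tps for clients, tps in results.items() if clients >= cpus]
    return (max(saturated) - min(saturated)) / min(saturated)


if __name__ == "__main__":
    ratios = per_client_tps(results)
    for clients, tps in results.items():
        print(f"{clients:>7} {tps:>9.6f} {ratios[clients]:>9.4f}")
    print(f"plateau spread: {plateau_spread(results, CPUS):.2%}")
```

The per-client figure drops only slightly between 2 and 8 clients, and
the TPS values from 8 clients onward stay within about 2% of each other,
which matches the "linear, then flat" reading of the table.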
--
Thanks,
-Amit Khandekar
Huawei Technologies