RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare() - Mailing list pgsql-hackers

From Floris Van Nee
Subject RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
Date
Msg-id 2ca215d085a74e2eabd31b76e97cc9f3@opammb0561.comp.optiver.com
In response to Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
> 
> I could do some tests with the patch on some larger machines. What exact
> tests do you propose? Are there some specific postgresql.conf settings and
> pgbench initialization you recommend for this? And was the test above just
> running 'pgbench -S' select-only with specific -T, -j and -c parameters?
> 

With Andres' instructions I ran a couple of tests. With your patches I can reliably reproduce a speedup of ~3% on
single-core tests on a dual-socket 36-core machine for the pgbench select-only test case. When using the full scale
test, my results are unfortunately way too noisy, even for large runs. I also tried some other queries (for example
SELECTs that return 10 or 100 rows instead of just 1), but can't see much of a speed-up there either, although it also
doesn't hurt.
 

So I guess the most noticeable one is the select-only benchmark for 1 core:

<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 30255419
latency average = 0.020 ms
latency stddev = 0.001 ms
tps = 50425.693234 (including connections establishing)
tps = 50425.841532 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 31363398
latency average = 0.019 ms
latency stddev = 0.001 ms
tps = 52272.326597 (including connections establishing)
tps = 52272.476380 (excluding connections establishing)
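
For anyone wanting to repeat this, the settings shown in the output above (select-only, prepared mode, 1 client, 1 thread, 600 s) map onto standard pgbench flags roughly as sketched below. This is only a reconstruction from the reported output, not the exact command line used; the database name "postgres" and the helper function name are assumptions made for illustration.

```python
import shlex


def pgbench_select_only_cmd(clients, threads, duration_s, dbname="postgres"):
    """Build a pgbench argv for the built-in select-only script.

    dbname is a placeholder (assumption); the flags mirror the settings
    reported in the pgbench output: select only, prepared query mode.
    """
    return [
        "pgbench",
        "-S",                    # built-in select-only script
        "-M", "prepared",        # query mode: prepared
        "-c", str(clients),      # number of clients
        "-j", str(threads),      # number of threads
        "-T", str(duration_s),   # duration in seconds
        dbname,
    ]


# Single-client run matching the parameters reported above.
print(shlex.join(pgbench_select_only_cmd(1, 1, 600)))
```

The scale factor of 300 would be set at initialization time (pgbench -i -s 300), which is a separate step from the run itself.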

This is the one with 40 clients, 40 threads. Not really an improvement, and still quite noisy.
<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 876846915
latency average = 0.027 ms
latency stddev = 0.015 ms
tps = 1461407.539610 (including connections establishing)
tps = 1461422.084486 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 872633979
latency average = 0.027 ms
latency stddev = 0.038 ms
tps = 1454387.326179 (including connections establishing)
tps = 1454396.879195 (excluding connections establishing)
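
To put numbers on the two comparisons, here is a small sketch that computes the relative change in tps (excluding connection establishment) between master and patched, using the figures copied from the runs above. The function name is just for illustration.

```python
def relative_change_pct(master_tps, patched_tps):
    """Relative change of patched vs. master, in percent."""
    return (patched_tps - master_tps) / master_tps * 100.0


# tps values (excluding connections establishing) from the runs above,
# given as (master, patched).
runs = {
    "1 client / 1 thread": (50425.841532, 52272.476380),
    "40 clients / 40 threads": (1461422.084486, 1454396.879195),
}

for name, (master, patched) in runs.items():
    print(f"{name}: {relative_change_pct(master, patched):+.2f}%")
```

This works out to roughly +3.7% for the single-client run and about -0.5% for the 40-client run. Given the latency stddev reported for the 40-client runs, the latter looks like noise rather than a regression.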

For tests that don't use the full machine (e.g. 10 clients, 10 threads) I see speed-ups as well, but not as high as in
the single-core run. It seems there are other bottlenecks (on the machine) coming into play.
 

-Floris

