Home > mailing lists

Re: Reduce ProcArrayLock contention - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Reduce ProcArrayLock contention
Date	July 31, 2015 04:41:53
Msg-id	CAA4eK1KfbrSwLgx9oyMU7AzOOafyNS0eb1QVSa2PnU5+0XXLhg@mail.gmail.com Whole thread Raw
In response to	Re: Reduce ProcArrayLock contention (Andres Freund <andres@anarazel.de>)
Responses	Re: Reduce ProcArrayLock contention
List	pgsql-hackers

Tree view

On Wed, Jul 29, 2015 at 11:48 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2015-07-29 12:54:59 -0400, Robert Haas wrote:
> > I would try to avoid changing lwlock.c. It's pretty easy when so
> > doing to create mechanisms that work now but make further upgrades to
> > the general lwlock mechanism difficult. I'd like to avoid that.
>
> I'm massively doubtful that re-implementing parts of lwlock.c is the
> better outcome. Then you have two different infrastructures you need to
> improve over time.

I agree and modified the patch to use 32-bit atomics based on idea

suggested by Robert and didn't modify lwlock.c.

> I understand. IMHO it will be a good idea though to ensure that the patch does not cause regression for other setups such as a less powerful

> machine or while running with lower number of clients.
>

Okay, I have tried it on CentOS VM, but the data is totally screwed (at

same client count across 2 runs the variation is quite high ranging from 300

to 3000 tps) to make any meaning out of it. I think if you want to collect data

on less powerful m/c, then also atleast we should ensure that it is taken in

a way that we are sure that it is not due to noise and there is actual regression,

then only we can decide whether we should investigate that or not.

Can you please try taking data with attached script

(perf_pgbench_tpcb_write.sh), few things you need to change in script

based on your setup (environment variables defined in beginning of script

like PGDATA), other thing is that you need to ensure that name of binaries

for HEAD and Patch should be changed in script if you are naming them

differently. It will generate the performance data in test*.txt files. Also try

to ensure that checkpoint should be configured such that it should not

occur in-between tests, else it will be difficult to conclude the results.

Some parameters you might want to consider for the same are

checkpoint_timeout, checkpoint_completion_target, min_wal_size,

max_wal_size.

>>
>>
>> I was telling that fact even without my patch. Basically I have
>> tried by commenting ProcArrayLock in ProcArrayEndTransaction.
>>

> I did not get that. You mean the TPS is same if you run with commenting out ProcArrayLock in ProcArrayEndTransaction?

Yes, TPS is almost same as with Patch.

> Is that safe to do?

No, that is not safe. I have done that just to see what is the maximum

gain we can get by reducing the contention around ProcArrayLock.

>>
>>
>>
>> No, autovacuum generates I/O due to which sometimes there
>> is more variation in Write tests.

> Sure, but on an average does it still show similar improvements?

Yes, number with and without autovacuum show similar trend, it's just

that you can see somewhat better results (more performance improvement)

with autovacuum=off and do manual vacuum at end of each test.

> Or does the test becomes IO bound and hence the bottleneck shifts somewhere

> else? Can you please post those numbers as well when you get chance?

The numbers posted in initial mail or in this mail are with autovacuum =on.

>>
>>
>>
>> So among these only step 2 can be common among different
>> algorithms, other's need some work specific to each optimization.
>>

> Right, but if we could encapsulate that work in a function that just needs to work on some shared memory, I think we can build such infrastructure.

For now, I have encapsulated the code into 2 separate functions, rather

than extending LWLock.c as that could easily lead to code which might

not be used in future even though currently it sounds like that and in-future

if we need to use same technique elsewhere then we can look into it.

>
>
>> >
>> > Regarding the patch, the compare-and-exchange function calls that you've used would work only for 64-bit machines, right? You would need to use equivalent 32-bit calls on a 32-bit machine.
> >
>>
>> I thought that internal API will automatically take care of it,
>> example for msvc it uses _InterlockedCompareExchange64
>> which if doesn't work on 32-bit systems or is not defined, then
>> we have to use 32-bit version, but I am not certain about
>> that fact.
>>

> Hmm. The pointer will be a 32-bit field on a 32-bit machine. I don't know how exchanging that with 64-bit integer be safe.

True, but that has to be in-general taken care by 64bit atomic API's,

like in this case it should fallback to either 32-bit version of API or

something that can work on 32-bit m/c. I think fallback support

is missing as of now in 64-bit API's which we should have if we want

to use those API's, but anyway for now I have modified the patch to

use 32-bit atomics.

Performance Data with modified patch.

pgbench setup

------------------------

scale factor - 300

Data is on magnetic disk and WAL on ssd.

pgbench -M prepared tpc-b

Data is median of 3 - 30 min runs, the detailed data for

all the 3 runs is in attached document

(perf_write_procarraylock_data.ods)

Head : commit c53f7387

Patch : group_xid_clearing_at_trans_end_rel_v2

Client_Count/Patch_ver (TPS)	1	8	16	32	64	128
HEAD	972	6004	11060	20074	23839	17798
Patch	1005	6260	11368	20318	30775	30215

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

From: Rahila Syed
Date: 31 July 2015, 03:30:31
Subject: Re: [PROPOSAL] VACUUM Progress Checker.

From: Tatsuo Ishii
Date: 31 July 2015, 04:44:56
Subject: Re: Updatable view?

Re: Reduce ProcArrayLock contention - Mailing list pgsql-hackers

Attachment

Previous

Next