Re: Reduce ProcArrayLock contention - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Reduce ProcArrayLock contention
Date
Msg-id CA+TgmoYf1Sta70UBnm=b4Eu8bJSBY69pLJZBrQZembjmgK10_g@mail.gmail.com
Whole thread Raw
In response to Re: Reduce ProcArrayLock contention  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Reduce ProcArrayLock contention  (Andres Freund <andres@anarazel.de>)
Re: Reduce ProcArrayLock contention  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Reduce ProcArrayLock contention  (Robert Haas <robertmhaas@gmail.com>)
Re: Reduce ProcArrayLock contention  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Re: Reduce ProcArrayLock contention  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Aug 3, 2015 at 8:39 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I agree and modified the patch to use 32-bit atomics based on idea
>> suggested by Robert and didn't modify lwlock.c.
>
> While looking at patch, I found that the way it was initialising the list
> to be empty was wrong, it was using pgprocno as 0 to initialise the
> list, as 0 is a valid pgprocno.  I think we should use a number greater
> that PROCARRAY_MAXPROC (maximum number of procs in proc
> array).
>
> Apart from above fix, I have modified src/backend/access/transam/README
> to include the information about the improvement this patch brings to
> reduce ProcArrayLock contention.

I spent some time looking at this patch today and found that a good
deal of cleanup work seemed to be needed.  Attached is a cleaned-up
version which makes a number of changes:

1. I got rid of all of the typecasts.  You're supposed to treat
pg_atomic_u32 as a magic data type that is only manipulated via the
primitives provided, not just cast back and forth between that and
u32.

2. I got rid of the memory barriers.  System calls are full barriers,
and so are compare-and-exchange operations.  Between those two facts,
we should be fine without these.

3. I removed the clearXid field you added to PGPROC.  Since that's
just a sentinel, I used nextClearXidElem for that purpose. There
doesn't seem to be a need for two fields.

4. I factored out the actual XID-clearing logic into a new function
ProcArrayEndTransactionInternal instead of repeating it twice. On the
flip side, I merged PushProcAndWaitForXidClear with
PopProcsAndClearXids and renamed the result to ProcArrayGroupClearXid,
since there seemed to be no need to separate them.

5. I renamed PROC_NOT_IN_PGPROCARRAY to INVALID_PGPROCNO, which I
think is more consistent with what we do elsewhere, and made it a
compile-time constant instead of a value that must be computed every
time it's used.

6. I overhauled the comments and README updates.

I'm not entirely happy with the name "nextClearXidElem" but apart from
that I'm fairly happy with this version.  We should probably test it
to make sure I haven't broken anything; on a quick 3-minute pgbench
test on cthulhu (128-way EDB server) with 128 clients I got 2778 tps
with master and 4330 tps with this version of the patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: FSM versus GIN pending list bloat
Next
From: Andres Freund
Date:
Subject: Re: Reduce ProcArrayLock contention