Re: LWLock optimization for multicore Power machines - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: LWLock optimization for multicore Power machines
Date
Msg-id CAPpHfdt5KUU24dzfFXdL-b-zFXuf24uh+FVO599RcaO-35i7uA@mail.gmail.com
Whole thread Raw
In response to [HACKERS] LWLock optimization for multicore Power machines  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: LWLock optimization for multicore Power machines  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Mar 25, 2017 at 8:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> [ lwlock-power-3.patch ]

I experimented with this patch a bit.  I can't offer help on testing it
on large PPC machines, but I can report that it breaks the build on
Apple PPC machines, apparently because of nonstandard assembly syntax.
I get "Parameter syntax error" on these four lines of assembly:

        and             3,r0,r4
        cmpwi   3,0
        add             3,r0,r5
        stwcx.  3,0,r2

I am not sure what the "3"s are meant to be --- if that's a hard-coded
register number, then it's unacceptable code to begin with, IMO.
You should be able to get gcc to give you a scratch register of its
choosing via appropriate use of the register assignment part of the
asm syntax.  I think there are examples in s_lock.h.
 
Right. Now I use variable bind to register instead of hard-coded register for that purpose.

I'm also unhappy that this patch changes the generic implementation of
LWLockAttemptLock.  That means that we'd need to do benchmarking on *every*
machine type to see what we're paying elsewhere for the benefit of PPC.
It seems unlikely that putting an extra level of function call into that
hot-spot is zero cost.

Lastly, putting machine-specific code into atomics.c is a really bad idea.
We have a convention for where to put such code, and that is not it.

You could address both the second and third concerns, I think, by putting
the PPC asm function into port/atomics/arch-ppc.h (which is where it
belongs anyway) and making the generic implementation be a "static inline"
function in port/atomics/fallback.h.  In that way, at least with compilers
that do inlines sanely, we could expect that this patch doesn't change the
generated code for LWLockAttemptLock at all on machines other than PPC.

I moved PPC implementation of pg_atomic_fetch_mask_add_u32() into port/atomics/arch-ppc.h.  I also had to declare pg_atomic_uint32 there to satisfy usage of this type as argument of pg_atomic_fetch_mask_add_u32_impl().

I moved generic implementation of pg_atomic_fetch_mask_add_u32() into port/atomics/generic.h.  That seems to be more appropriate place for that than fallback.h.  Because fallback.h has definitions for cases when we don't have atomic for this platform at all.  generic.h has definitions of lacking atomics using defined atomics.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: standardized backwards incompatibility tag for commits
Next
From: Tom Lane
Date:
Subject: Re: standardized backwards incompatibility tag for commits