[HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray
Date
Msg-id 20170303.171342.134582021.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
Responses Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: [HACKERS] [BUG FIX] Removing NamedLWLockTrancheArray  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hello, some of my collegues found that orafce crashes with
postgresql compliled with dtrace.

=== The cause

The immediate cause was I think that it just did
LWLockNewTrancheId and forget to do LWLockRegisterTranche. But
this is hardly detectable by a module developer.


Typical tracepoint looks like the following.

> TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode);

Where T_NAME is (LWLockTrancheArray[(lock)->tranche]), as you see
T_NAME assumes that all tranches pointed from lwlocks are
available on the array but the tranches without
LWLockRegisterTranche are not. What is worse this doesn't harm so
much without dtrace (or LWLOCK_STATS or error in LWLockRelease)
. Even if dtrace is activated one or two unregistred tranches
(faultly) have their seat (filled with trash) at the end of the
array with the current code.

As my understanding there are two ways to use lwlocks in
extension.  One is using LWLockNewTrancheId and
LWLockRegisterTanche on shared memory provided by the module. The
other is using RequestNamedLWLockTranche in _PG_init and
GetNamedLWLockTranche to acquire locks provided by postgresql.

I couldn't find a documentation about lwlock and trance in
extentions, is there any?


=== How to fix this

The most straightforward way to fix this is chekcing the validity
of tranche id on initilization of a lwlock. Even though I think
that degradation won't be a matter here, NamedLWLockTrancheArray
makes the things very ineffective. After some consideration I
decided to remove NamedLWLockTrancheArray.


=== The patch

The attached patch (is for current master) is aming to fix all
the problem by doing the following two things.

- Remove NameLWLockTrancheArray and all tranches are registered in LWLockTrancheArray. This seems to work at least for
!EXEC_BAKCENDenvironment but I haven't tested with EXEC_BACKEND.
 

- Check tranche id in LWLockInitialize.

The first one required refactoring of CreateLWLocks. It is
changed to register tranches first then initialize lwlokcs.

The problem is found with PG9.6 and this should be backpatched at
least to the version. I haven't tested PG9.5 and 9.4 but it seems
to need different amendment. 9.3 doesn't has tranche.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Documentation improvements for partitioning
Next
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands:\quit_if, \quit_unless)