Re: [HACKERS] Creating a DSA area to provide work space for parallel execution - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Creating a DSA area to provide work space for parallel execution
Date
Msg-id CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
Whole thread Raw
In response to Re: Creating a DSA area to provide work space for parallel execution  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Creating a DSA area to provide work space for parallel execution
List pgsql-hackers
On Mon, Dec 5, 2016 at 3:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 1, 2016 at 6:35 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Sat, Nov 26, 2016 at 1:55 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> Here's a new version to apply on top of dsa-v7.patch.
>>
>> Here's a version to go with dsa-v8.patch.
>
> Thomas has spent a fair amount of time beating me up off-list about
> the fact that we have no way to recycle LWLock tranche IDs, and I've
> spent a fair amount of time trying to defend that decision, but it's
> really a problem here.  This is all well and good the way it's written
> provided that there's only one parallel context in existence at a
> time, but should that number ever exceed 1, which it can, then the
> tranche's array_base will be incorrect for LWLocks in one or the other
> tranche.
>
> That *almost* doesn't matter.  If you're not compiling with dtrace or
> LOCK_DEBUG or LWLOCK_STATS, you'll be fine.  And if you are compiling
> with one of those things, I believe the consequences will be no worse
> than an occasional nonsensical lock ID.  It's halfway tempting to just
> accept that as a known wart, but, uggh, that sucks.
>
> It's a bit hard to come up with a better alternative.

After thinking about this some more, I realized that the problem here
boils down to T_ID() not being smart enough to cope with multiple
instances of the same tranche at different base addresses.  We can
either make it more complicated so that it can do that, or (drum roll,
please!) get rid of it altogether.  I don't think making it more
complicated is very desirable, because I think that we end up
computing T_ID() for every lock acquisition and release whenever
--enable-dtrace is used, even if dtrace is not actually in use.  And
the usefulness of T_ID() for debugging is pretty marginal, with one
important exception, which is that currently T_ID() is used to
distinguish between individual LWLocks in the main tranche.  So here's
my proposal:

1. Give every LWLock in the main tranche a separate tranche ID.  This
has been previously proposed, so it's not a revolutionary concept.

2. Always identify LWLocks in pg_stat_activity only by tranche ID,
never offset within the tranche, not even for the main tranche.  This
results in pg_stat_activity saying "LWLock" rather than either
"LWLockNamed" or "LWLockTranche", which is a user-visible behavior
change but not AFAICS a very upsetting one.

3. Change the dtrace probes that currently pass both T_NAME() and
T_ID() to pass only T_NAME().  This is a minor loss of information for
dtrace, but in view of (1) it's not a very significant loss.

4. Change LOCK_DEBUG and LWLOCK_STATS output that identifies locks
using T_NAME() and T_ID() to instead identify them by T_NAME() and the
pointer address.  Since these are only developer debugging facilities
not intended for production builds, I think it's OK to expose the
pointer address, and it's arguably MORE useful to do so than to expose
an offset into an array with an unknown base address.

5. Remove T_ID() from the can't-happen elog() in LWLockRelease().

6. Remove T_ID() itself.  And then, since that's the only client of
array_base/array_stride, remove those too.  And then, since there's
nothing left in LWLockTranche except for the tranche name, get rid of
the whole structure, simplifying a whole bunch of code all over the
system.

Patch implementing all of this attached.  There's further
simplification that could be done here -- with array_stride gone, we
could do more at compile time rather than run-time -- but I think that
can be left for another day.

The overall result of this is a considerable savings of code:

 doc/src/sgml/monitoring.sgml             |  52 ++++-----
 src/backend/access/transam/slru.c        |   6 +-
 src/backend/access/transam/xlog.c        |   9 +-
 src/backend/postmaster/pgstat.c          |  10 +-
 src/backend/replication/logical/origin.c |   8 +-
 src/backend/replication/slot.c           |   8 +-
 src/backend/storage/buffer/buf_init.c    |  16 +--
 src/backend/storage/ipc/procarray.c      |   9 +-
 src/backend/storage/lmgr/lwlock.c        | 175 ++++++++++---------------------
 src/backend/utils/mmgr/dsa.c             |  15 +--
 src/backend/utils/probes.d               |  16 +--
 src/include/access/slru.h                |   1 -
 src/include/pgstat.h                     |   3 +-
 src/include/storage/lwlock.h             |  45 ++------
 14 files changed, 112 insertions(+), 261 deletions(-)

It also noticeably reduces the number of bytes of machine code
generated for lwlock.c:

[rhaas pgsql]$ size src/backend/storage/lmgr/lwlock.o # echo master
__TEXT    __DATA    __OBJC    others    dec    hex
11815    3360    0    46430    61605    f0a5
[rhaas pgsql]$ size src/backend/storage/lmgr/lwlock.o # echo lwlock
__TEXT    __DATA    __OBJC    others    dec    hex
11199    3264    0    45487    59950    ea2e

That's better than a 5% reduction in the code size of a very hot
module just by removing something that almost nobody uses or cares
about.  That's with --enable-dtrace but without --enable-cassert.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Linear vs. nonlinear planner cost estimates