Re: [PATCH] Identify LWLocks in tracepoints - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [PATCH] Identify LWLocks in tracepoints
Date
Msg-id CAGRY4nxkU1HksXnRA9g+SmQ82xaX-an+Ax+yfkTi8m3DiHfFZQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Identify LWLocks in tracepoints  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
On Tue, 13 Apr 2021 at 21:40, Craig Ringer
<craig.ringer@enterprisedb.com> wrote:

> Findings:
>
> * A probe without arguments or with simple arguments is just a 'nop' instruction
> * Probes that require function calls, pointer chasing, other
> expression evaluation etc may impose a fixed cost to collect up
> arguments even if the probe is disabled.
> * SDT semaphores can avoid that cost but add a branch, so should
> probably be avoided unless preparing probe arguments is likely to be
> expensive.

Back to the topic directly at hand.

Attached a disassembly of what LWLockAcquire looks like now on my
current build of git master @ 5fe83adad9efd5e3929f0465b44e786dc23c7b55
. This is an --enable-debug --enable-cassert --enable-dtrace build
with -Og -ggdb3.

The three tracepoints in it are all of the form:

   movzwl 0x0(%r13),%edi
   call   0x801c49 <GetLWTrancheName>
   nop

so it's clear we're doing redundant calls to GetLWTrancheName(), as
expected. Not ideal.

Now if I patch it to add the _ENABLED() guards on all the tracepoints,
the probes look like this:

   0x0000000000803176 <+200>:   cmpw   $0x0,0x462da8(%rip)        #
0xc65f26 <postgresql_lwlock__acquire_semaphore>
   0x000000000080317e <+208>:   jne    0x80328b <LWLockAcquire+477>
   .... other interleaved code ...
   0x000000000080328b <+477>:   movzwl 0x0(%r13),%edi
   0x0000000000803290 <+482>:   call   0x801c49 <GetLWTrancheName>
   0x0000000000803295 <+487>:   nop
   0x0000000000803296 <+488>:   jmp    0x803184 <LWLockAcquire+214>

so we avoid the GetLWTrancheName() call at the cost of a test and
possible branch, and a small expansion in total function size. Without
the semaphores, LWLockAcquire is 463 bytes. With them, it's 524 bytes,
which is nothing to sneeze at for code that doesn't do anything
99.999% of the time, but we avoid a bunch of GetLWTrancheName() calls.

If I instead replace T_NAME() with NULL for all tracepoints in
LWLockAcquire, the disassembly shows that the tracepoints now become a
simple

   0x0000000000803176 <+200>:   nop

which is pretty hard to be concerned about.

So at the very least we should be calling GetLWTrancheName() once at
the start of the function if built with dtrace support and remembering
the value, instead of calling it for each tracepoint.

And omitting the tranche name looks like it might be sensible for the
LWLock code. In most places it won't matter, but LWLocks are hot
enough that it possibly might. A simple pg_regress run hits
LWLockAcquire 25 million times, so that's 75 million calls to
GetLWTrancheName().



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Old Postgresql version on i7-1165g7
Next
From: Fujii Masao
Date:
Subject: Re: TRUNCATE on foreign table