Thread: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

[HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Andreas Seltenreich
Date:
Hi,

testing master as of fe591f8bf6 produced a crash reading
pg_stat_activity (backtrace below).  Digging around with with gdb
revealed that pgstat_get_wait_event() returned an invalid pointer for a
classId PG_WAIT_LWLOCK.

I think the culprit is dsa.c passing a pointer to memory that goes away
on dsa_free() as a name to LWLockRegisterTranche.

regards,
Andreas

Program terminated with signal SIGSEGV, Segmentation fault.
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
(gdb) bt
#1  0x00000000007e03c9 in cstring_to_text (s=0x7fab18d1f954 <error: Cannot access memory at address 0x7fab18d1f954>) at
varlena.c:152
#2  0x0000000000792d7c in pg_stat_get_activity (fcinfo=<optimized out>) at pgstatfuncs.c:805
#3  0x00000000005f0af5 in ExecMakeTableFunctionResult (funcexpr=0x5469f90, econtext=0x5469c80, argContext=<optimized
out>,expectedDesc=0x387b2b0, randomAccess=0 '\000') at execQual.c:2216
 
#4  0x0000000000608633 in FunctionNext (node=node@entry=0x5469b68) at nodeFunctionscan.c:94
#5  0x00000000005f2c22 in ExecScanFetch (recheckMtd=0x608390 <FunctionRecheck>, accessMtd=0x6083a0 <FunctionNext>,
node=0x5469b68)at execScan.c:95
 
#6  ExecScan (node=node@entry=0x5469b68, accessMtd=accessMtd@entry=0x6083a0 <FunctionNext>,
recheckMtd=recheckMtd@entry=0x608390<FunctionRecheck>) at execScan.c:180
 
#7  0x000000000060867f in ExecFunctionScan (node=node@entry=0x5469b68) at nodeFunctionscan.c:268
#8  0x00000000005eb4c8 in ExecProcNode (node=node@entry=0x5469b68) at execProcnode.c:449
#9  0x0000000000602cd0 in ExecLimit (node=node@entry=0x54697f0) at nodeLimit.c:91
#10 0x00000000005eb368 in ExecProcNode (node=node@entry=0x54697f0) at execProcnode.c:531
[...]



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Thomas Munro
Date:
On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich
<seltenreich@gmx.de> wrote:
> testing master as of fe591f8bf6 produced a crash reading
> pg_stat_activity (backtrace below).  Digging around with with gdb
> revealed that pgstat_get_wait_event() returned an invalid pointer for a
> classId PG_WAIT_LWLOCK.
>
> I think the culprit is dsa.c passing a pointer to memory that goes away
> on dsa_free() as a name to LWLockRegisterTranche.

I see, nice detective work!

dsa.c's create_internal and attach_internal call LWLockRegisterTranche
with a pointer to a name in a DSM segment (copied from an argument to
dsa_create), which doesn't have the right extent in this rare race
case: the DSM segment goes away when the last backend detaches, but
after you've detached you might still see others waiting for a lock
that references that tranche ID in pg_stat_activity.  This could be
fixed by copying the tranche name to somewhere with backend lifetime
extent in each backend, but then that would need cleanup.  Maybe we
should replace it with another value when the DSA area is detached,
using a constant string.  Something like
LWLockRegisterTranche(tranche_id, "detached_dsa").  That would be a
new use for the LWLockRegisterTranche function, for
re-registering/renaming and existing tranche ID.  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Thomas Munro
Date:
On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich
> <seltenreich@gmx.de> wrote:
>> testing master as of fe591f8bf6 produced a crash reading
>> pg_stat_activity (backtrace below).  Digging around with with gdb
>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>> classId PG_WAIT_LWLOCK.
>>
>> I think the culprit is dsa.c passing a pointer to memory that goes away
>> on dsa_free() as a name to LWLockRegisterTranche.
>
> I see, nice detective work!
>
> dsa.c's create_internal and attach_internal call LWLockRegisterTranche
> with a pointer to a name in a DSM segment (copied from an argument to
> dsa_create), which doesn't have the right extent in this rare race
> case: the DSM segment goes away when the last backend detaches, but
> after you've detached you might still see others waiting for a lock
> that references that tranche ID in pg_stat_activity.  This could be
> fixed by copying the tranche name to somewhere with backend lifetime
> extent in each backend, but then that would need cleanup.  Maybe we
> should replace it with another value when the DSA area is detached,
> using a constant string.  Something like
> LWLockRegisterTranche(tranche_id, "detached_dsa").  That would be a
> new use for the LWLockRegisterTranche function, for
> re-registering/renaming and existing tranche ID.  Thoughts?

Or we could introduce a new function
UnregisterLWLockTranche(tranche_id) that is equivalent to
RegisterLWLockTranche(tranche_id, NULL), and call that when detaching
(ie in the hook that runs before we detach the control segment that
holds the name).  It looks like NULL would result in
GetLWLockIdentifier returning "extension".  I guess before DSA came
along only extensions could create lock tranches not known to every
backend, but perhaps now it should return "unknown"?

It does seem a little odd that we are using a well known tranche ID
defined in lwlock.h (as opposed to an ID obtained at runtime by
LWLockNewTrancheId()) but the tranche name has to be registered at
runtime in each backend and is unknown to other backends.  That line
of thinking suggests another potential solution: go and register the
name in RegisterLWLockTranches, and stop registering it in dsa.c.  For
other potential uses of DSA including extensions that call
LWLockNewTrancheId() we'd have to decide whether to make the name an
optional argument, or require those users to register it themselves.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Andreas Seltenreich
Date:
Thomas Munro writes:

> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
>>> testing master as of fe591f8bf6 produced a crash reading
>>> pg_stat_activity (backtrace below).  Digging around with with gdb
>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>>> classId PG_WAIT_LWLOCK.
>>>
>>> I think the culprit is dsa.c passing a pointer to memory that goes away
>>> on dsa_free() as a name to LWLockRegisterTranche.
[..]
>> Maybe we should replace it with another value when the DSA area is
>> detached, using a constant string.  Something like

I'm wondering: Is it safe to pass a pointer into a DSA at all?  If I
understand the comments correctly, they are not necessarily mapped (at
the same address) in an unrelated backend looking into pg_stat_activity,
and in this case a dsa_free() is not actually needed to trigger a crash.

> That line of thinking suggests another potential solution: go and
> register the name in RegisterLWLockTranches, and stop registering it
> in dsa.c.  For other potential uses of DSA including extensions that
> call LWLockNewTrancheId() we'd have to decide whether to make the name
> an optional argument, or require those users to register it
> themselves.

Maybe instead of copying the name, just put the passed pointer itself
into the area?  Extensions using LWLockNewTrancheId need to use
shared_preload_libraries anyway, so static strings would be mapped in
all backends.

regards,
Andreas



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Thomas Munro
Date:
On Wed, Dec 28, 2016 at 11:38 AM, Andreas Seltenreich
<seltenreich@gmx.de> wrote:
> Thomas Munro writes:
>
>> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
>>>> testing master as of fe591f8bf6 produced a crash reading
>>>> pg_stat_activity (backtrace below).  Digging around with with gdb
>>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>>>> classId PG_WAIT_LWLOCK.
>>>>
>>>> I think the culprit is dsa.c passing a pointer to memory that goes away
>>>> on dsa_free() as a name to LWLockRegisterTranche.
> [..]
>>> Maybe we should replace it with another value when the DSA area is
>>> detached, using a constant string.  Something like
>
> I'm wondering: Is it safe to pass a pointer into a DSA at all?  If I
> understand the comments correctly, they are not necessarily mapped (at
> the same address) in an unrelated backend looking into pg_stat_activity,
> and in this case a dsa_free() is not actually needed to trigger a crash.

It is safe, as long as the segment remains mapped.  Each backend that
attaches calls LWLockRegisterTranche giving it the address of the name
in its virtual address space.  The problem is that after the segment
is unmapped, that address is now garbage, which is clearly a bug.
That's why I started talking about how to 'unregister' it (or register
a replacement constant name).  We have some code that always runs
before the control segment containing the name is detached, so that'd
be the perfect place to do that.

>> That line of thinking suggests another potential solution: go and
>> register the name in RegisterLWLockTranches, and stop registering it
>> in dsa.c.  For other potential uses of DSA including extensions that
>> call LWLockNewTrancheId() we'd have to decide whether to make the name
>> an optional argument, or require those users to register it
>> themselves.
>
> Maybe instead of copying the name, just put the passed pointer itself
> into the area?  Extensions using LWLockNewTrancheId need to use
> shared_preload_libraries anyway, so static strings would be mapped in
> all backends.

Yeah that would be another way.  I had this idea that only the process
that creates a DSA area should name it, and then processes attaching
would see the existing tranche ID and name, so could use a narrower
interface.  We could instead do as you say and make processes that
attach provide a pointer to the name too, and make it the caller's
problem to ensure that the pointers remain valid long enough; or go
one step further and make them register/unregister it themselves.

But I'm starting to think that the best way might be to do BOTH of the
things I said in my previous message: make dsa.c register on
create/attach and also unregister before detaching iff the name was
supplied at creation time for the benefit of extension writers, but
make it not do anything at all about tranche name
registration/unregistration if NULL was passed in at create time.
Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA)
in every process in RegisterLWLockTranches.  That way, you'd get a
useful name in pg_stat_activity for other backends that are running
parallel queries if they are ever waiting for these locks (unlikely
but interesting to know abotu if it happens).

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Andreas Seltenreich
Date:
Thomas Munro writes:

> On Wed, Dec 28, 2016 at 11:38 AM, Andreas Seltenreich
> <seltenreich@gmx.de> wrote:
>> Thomas Munro writes:
>>
>>> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
>>>>> testing master as of fe591f8bf6 produced a crash reading
>>>>> pg_stat_activity (backtrace below).  Digging around with with gdb
>>>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>>>>> classId PG_WAIT_LWLOCK.
>>>>>
>>>>> I think the culprit is dsa.c passing a pointer to memory that goes away
>>>>> on dsa_free() as a name to LWLockRegisterTranche.
>> [..]
>>>> Maybe we should replace it with another value when the DSA area is
>>>> detached, using a constant string.  Something like
>>
>> I'm wondering: Is it safe to pass a pointer into a DSA at all?  If I
>> understand the comments correctly, they are not necessarily mapped (at
>> the same address) in an unrelated backend looking into pg_stat_activity,
>> and in this case a dsa_free() is not actually needed to trigger a crash.
>
> It is safe, as long as the segment remains mapped.  Each backend that
> attaches calls LWLockRegisterTranche giving it the address of the name
> in its virtual address space.

Hmok, I was under the impression only backends participating in the IPC
call the attach function, not necessarily the ones that could possible
want to resolve the wait_event_info they found in the procArray via
pgstat_get_wait_event().

>> Maybe instead of copying the name, just put the passed pointer itself
>> into the area?  Extensions using LWLockNewTrancheId need to use
>> shared_preload_libraries anyway, so static strings would be mapped in
>> all backends.
>
> Yeah that would be another way.  I had this idea that only the process
> that creates a DSA area should name it, and then processes attaching
> would see the existing tranche ID and name, so could use a narrower
> interface.  We could instead do as you say and make processes that
> attach provide a pointer to the name too, and make it the caller's
> problem to ensure that the pointers remain valid long enough; or go
> one step further and make them register/unregister it themselves.

Hmm, turning the member of the control struct   char lwlock_tranche_name[DSA_MAXLEN];
into   const char *lwlock_tranche_name;
and initializing it with the passed static const char * instead of
copying wouldn't require a change of the interface, would it?

But I really feel like I need to study the code a bit more before
commenting further…

regards,
Andreas



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Andreas Seltenreich
Date:
Andreas Seltenreich writes:

> Thomas Munro writes:
>
>> It is safe, as long as the segment remains mapped.  Each backend that
>> attaches calls LWLockRegisterTranche giving it the address of the name
>> in its virtual address space.
>
> Hmok, I was under the impression only backends participating in the IPC
> call the attach function, not necessarily the ones that could possible
> want to resolve the wait_event_info they found in the procArray via
> pgstat_get_wait_event().

Erm, ignore that question: They'll find a NULL in their
LWLockTrancheArray and run into the "extension" case you mentioned.

> But I really feel like I need to study the code a bit more before
> commenting further…

Following this advise now :-)

regards,
Andreas



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Thomas Munro
Date:
On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> But I'm starting to think that the best way might be to do BOTH of the
> things I said in my previous message: make dsa.c register on
> create/attach and also unregister before detaching iff the name was
> supplied at creation time for the benefit of extension writers, but
> make it not do anything at all about tranche name
> registration/unregistration if NULL was passed in at create time.
> Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA)
> in every process in RegisterLWLockTranches.  That way, you'd get a
> useful name in pg_stat_activity for other backends that are running
> parallel queries if they are ever waiting for these locks (unlikely
> but interesting to know abotu if it happens).

Maybe something like the attached.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Andreas Seltenreich
Date:
Thomas Munro writes:

> [2. text/x-diff; fix-dsa-tranche-registration.patch]

Fuzzing with the patch applied hasn't triggered the crash so far.  It
did happen 5 times with the same amount of fuzzing before patching.

regards,
Andreas



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Robert Haas
Date:
On Tue, Dec 27, 2016 at 9:28 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> But I'm starting to think that the best way might be to do BOTH of the
>> things I said in my previous message: make dsa.c register on
>> create/attach and also unregister before detaching iff the name was
>> supplied at creation time for the benefit of extension writers, but
>> make it not do anything at all about tranche name
>> registration/unregistration if NULL was passed in at create time.
>> Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA)
>> in every process in RegisterLWLockTranches.  That way, you'd get a
>> useful name in pg_stat_activity for other backends that are running
>> parallel queries if they are ever waiting for these locks (unlikely
>> but interesting to know abotu if it happens).
>
> Maybe something like the attached.

Now that array_base and array_stride are gone, I don't see any reason
why the DSA machinery needs to be aware of tranche names at all.  So I
propose to rip all that out, as in the attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Thomas Munro
Date:
On Thu, Jan 5, 2017 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 27, 2016 at 9:28 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> But I'm starting to think that the best way might be to do BOTH of the
>>> things I said in my previous message: make dsa.c register on
>>> create/attach and also unregister before detaching iff the name was
>>> supplied at creation time for the benefit of extension writers, but
>>> make it not do anything at all about tranche name
>>> registration/unregistration if NULL was passed in at create time.
>>> Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA)
>>> in every process in RegisterLWLockTranches.  That way, you'd get a
>>> useful name in pg_stat_activity for other backends that are running
>>> parallel queries if they are ever waiting for these locks (unlikely
>>> but interesting to know abotu if it happens).
>>
>> Maybe something like the attached.
>
> Now that array_base and array_stride are gone, I don't see any reason
> why the DSA machinery needs to be aware of tranche names at all.  So I
> propose to rip all that out, as in the attached.

The way I proposed makes it a lot easier to work with dynamic names so
you can differentiate variable numbers of areas; the names would have
exactly the right extent and they'd get unregistered in each backend
at just the right time.  On the other hand, I don't really like seeing
lock tranche stuff leaking into other APIs like this, and using
tranche IDs in any way other than allocating a small number of them at
startup isn't really supported anyway, so +1 for doing it your way.

At one stage I had an idea that that argument was naming the DSA area,
not the lock tranche, and then the lock tranche happened to use a name
that was built from that name, but that doesn't make any sense if it's
optional depending on whether you already registered the lock
tranche...

- char lwlock_tranche_name[DSA_MAXLEN];

You can remove the now-unused DSA_MAXLEN macro.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

From
Robert Haas
Date:
On Wed, Jan 4, 2017 at 5:03 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> The way I proposed makes it a lot easier to work with dynamic names so
> you can differentiate variable numbers of areas; the names would have
> exactly the right extent and they'd get unregistered in each backend
> at just the right time.

Only from the perspective of the backend that's using DSA.  From the
perspective of some other backend reading pg_stat_activity, it's all
wrong.  There, you want the name to registered as early as possible -
either at system startup time for builtin things or at module load
time for extensions.  With the way you have it, you'd only be able to
recognize a lock wait as being related to parallel_query_dsa if your
session had previously executed a parallel query.  That's clearly not
desirable.  With this approach, _PG_init can do LWLockRegisterTranche
and then if you stick the library into shared_preload_libraries or
session_preload_libraries *all* backends have that tranche whether
they use the library or not.  If the tranche registry were shared,
then your approach would be fine, but it isn't.

> On the other hand, I don't really like seeing
> lock tranche stuff leaking into other APIs like this, and using
> tranche IDs in any way other than allocating a small number of them at
> startup isn't really supported anyway, so +1 for doing it your way.

OK.

> At one stage I had an idea that that argument was naming the DSA area,
> not the lock tranche, and then the lock tranche happened to use a name
> that was built from that name, but that doesn't make any sense if it's
> optional depending on whether you already registered the lock
> tranche...
>
> - char lwlock_tranche_name[DSA_MAXLEN];
>
> You can remove the now-unused DSA_MAXLEN macro.

Ah, thanks.  Committed with that change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company