Thread: [HACKERS] [sqlsmith] Crash reading pg_stat_activity
Hi, testing master as of fe591f8bf6 produced a crash reading pg_stat_activity (backtrace below). Digging around with with gdb revealed that pgstat_get_wait_event() returned an invalid pointer for a classId PG_WAIT_LWLOCK. I think the culprit is dsa.c passing a pointer to memory that goes away on dsa_free() as a name to LWLockRegisterTranche. regards, Andreas Program terminated with signal SIGSEGV, Segmentation fault. #0 strlen () at ../sysdeps/x86_64/strlen.S:106 (gdb) bt #1 0x00000000007e03c9 in cstring_to_text (s=0x7fab18d1f954 <error: Cannot access memory at address 0x7fab18d1f954>) at varlena.c:152 #2 0x0000000000792d7c in pg_stat_get_activity (fcinfo=<optimized out>) at pgstatfuncs.c:805 #3 0x00000000005f0af5 in ExecMakeTableFunctionResult (funcexpr=0x5469f90, econtext=0x5469c80, argContext=<optimized out>,expectedDesc=0x387b2b0, randomAccess=0 '\000') at execQual.c:2216 #4 0x0000000000608633 in FunctionNext (node=node@entry=0x5469b68) at nodeFunctionscan.c:94 #5 0x00000000005f2c22 in ExecScanFetch (recheckMtd=0x608390 <FunctionRecheck>, accessMtd=0x6083a0 <FunctionNext>, node=0x5469b68)at execScan.c:95 #6 ExecScan (node=node@entry=0x5469b68, accessMtd=accessMtd@entry=0x6083a0 <FunctionNext>, recheckMtd=recheckMtd@entry=0x608390<FunctionRecheck>) at execScan.c:180 #7 0x000000000060867f in ExecFunctionScan (node=node@entry=0x5469b68) at nodeFunctionscan.c:268 #8 0x00000000005eb4c8 in ExecProcNode (node=node@entry=0x5469b68) at execProcnode.c:449 #9 0x0000000000602cd0 in ExecLimit (node=node@entry=0x54697f0) at nodeLimit.c:91 #10 0x00000000005eb368 in ExecProcNode (node=node@entry=0x54697f0) at execProcnode.c:531 [...]
On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > testing master as of fe591f8bf6 produced a crash reading > pg_stat_activity (backtrace below). Digging around with with gdb > revealed that pgstat_get_wait_event() returned an invalid pointer for a > classId PG_WAIT_LWLOCK. > > I think the culprit is dsa.c passing a pointer to memory that goes away > on dsa_free() as a name to LWLockRegisterTranche. I see, nice detective work! dsa.c's create_internal and attach_internal call LWLockRegisterTranche with a pointer to a name in a DSM segment (copied from an argument to dsa_create), which doesn't have the right extent in this rare race case: the DSM segment goes away when the last backend detaches, but after you've detached you might still see others waiting for a lock that references that tranche ID in pg_stat_activity. This could be fixed by copying the tranche name to somewhere with backend lifetime extent in each backend, but then that would need cleanup. Maybe we should replace it with another value when the DSA area is detached, using a constant string. Something like LWLockRegisterTranche(tranche_id, "detached_dsa"). That would be a new use for the LWLockRegisterTranche function, for re-registering/renaming and existing tranche ID. Thoughts? -- Thomas Munro http://www.enterprisedb.com
On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich > <seltenreich@gmx.de> wrote: >> testing master as of fe591f8bf6 produced a crash reading >> pg_stat_activity (backtrace below). Digging around with with gdb >> revealed that pgstat_get_wait_event() returned an invalid pointer for a >> classId PG_WAIT_LWLOCK. >> >> I think the culprit is dsa.c passing a pointer to memory that goes away >> on dsa_free() as a name to LWLockRegisterTranche. > > I see, nice detective work! > > dsa.c's create_internal and attach_internal call LWLockRegisterTranche > with a pointer to a name in a DSM segment (copied from an argument to > dsa_create), which doesn't have the right extent in this rare race > case: the DSM segment goes away when the last backend detaches, but > after you've detached you might still see others waiting for a lock > that references that tranche ID in pg_stat_activity. This could be > fixed by copying the tranche name to somewhere with backend lifetime > extent in each backend, but then that would need cleanup. Maybe we > should replace it with another value when the DSA area is detached, > using a constant string. Something like > LWLockRegisterTranche(tranche_id, "detached_dsa"). That would be a > new use for the LWLockRegisterTranche function, for > re-registering/renaming and existing tranche ID. Thoughts? Or we could introduce a new function UnregisterLWLockTranche(tranche_id) that is equivalent to RegisterLWLockTranche(tranche_id, NULL), and call that when detaching (ie in the hook that runs before we detach the control segment that holds the name). It looks like NULL would result in GetLWLockIdentifier returning "extension". I guess before DSA came along only extensions could create lock tranches not known to every backend, but perhaps now it should return "unknown"? It does seem a little odd that we are using a well known tranche ID defined in lwlock.h (as opposed to an ID obtained at runtime by LWLockNewTrancheId()) but the tranche name has to be registered at runtime in each backend and is unknown to other backends. That line of thinking suggests another potential solution: go and register the name in RegisterLWLockTranches, and stop registering it in dsa.c. For other potential uses of DSA including extensions that call LWLockNewTrancheId() we'd have to decide whether to make the name an optional argument, or require those users to register it themselves. -- Thomas Munro http://www.enterprisedb.com
Thomas Munro writes: > On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: >>> testing master as of fe591f8bf6 produced a crash reading >>> pg_stat_activity (backtrace below). Digging around with with gdb >>> revealed that pgstat_get_wait_event() returned an invalid pointer for a >>> classId PG_WAIT_LWLOCK. >>> >>> I think the culprit is dsa.c passing a pointer to memory that goes away >>> on dsa_free() as a name to LWLockRegisterTranche. [..] >> Maybe we should replace it with another value when the DSA area is >> detached, using a constant string. Something like I'm wondering: Is it safe to pass a pointer into a DSA at all? If I understand the comments correctly, they are not necessarily mapped (at the same address) in an unrelated backend looking into pg_stat_activity, and in this case a dsa_free() is not actually needed to trigger a crash. > That line of thinking suggests another potential solution: go and > register the name in RegisterLWLockTranches, and stop registering it > in dsa.c. For other potential uses of DSA including extensions that > call LWLockNewTrancheId() we'd have to decide whether to make the name > an optional argument, or require those users to register it > themselves. Maybe instead of copying the name, just put the passed pointer itself into the area? Extensions using LWLockNewTrancheId need to use shared_preload_libraries anyway, so static strings would be mapped in all backends. regards, Andreas
On Wed, Dec 28, 2016 at 11:38 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > Thomas Munro writes: > >> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: >>>> testing master as of fe591f8bf6 produced a crash reading >>>> pg_stat_activity (backtrace below). Digging around with with gdb >>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a >>>> classId PG_WAIT_LWLOCK. >>>> >>>> I think the culprit is dsa.c passing a pointer to memory that goes away >>>> on dsa_free() as a name to LWLockRegisterTranche. > [..] >>> Maybe we should replace it with another value when the DSA area is >>> detached, using a constant string. Something like > > I'm wondering: Is it safe to pass a pointer into a DSA at all? If I > understand the comments correctly, they are not necessarily mapped (at > the same address) in an unrelated backend looking into pg_stat_activity, > and in this case a dsa_free() is not actually needed to trigger a crash. It is safe, as long as the segment remains mapped. Each backend that attaches calls LWLockRegisterTranche giving it the address of the name in its virtual address space. The problem is that after the segment is unmapped, that address is now garbage, which is clearly a bug. That's why I started talking about how to 'unregister' it (or register a replacement constant name). We have some code that always runs before the control segment containing the name is detached, so that'd be the perfect place to do that. >> That line of thinking suggests another potential solution: go and >> register the name in RegisterLWLockTranches, and stop registering it >> in dsa.c. For other potential uses of DSA including extensions that >> call LWLockNewTrancheId() we'd have to decide whether to make the name >> an optional argument, or require those users to register it >> themselves. > > Maybe instead of copying the name, just put the passed pointer itself > into the area? Extensions using LWLockNewTrancheId need to use > shared_preload_libraries anyway, so static strings would be mapped in > all backends. Yeah that would be another way. I had this idea that only the process that creates a DSA area should name it, and then processes attaching would see the existing tranche ID and name, so could use a narrower interface. We could instead do as you say and make processes that attach provide a pointer to the name too, and make it the caller's problem to ensure that the pointers remain valid long enough; or go one step further and make them register/unregister it themselves. But I'm starting to think that the best way might be to do BOTH of the things I said in my previous message: make dsa.c register on create/attach and also unregister before detaching iff the name was supplied at creation time for the benefit of extension writers, but make it not do anything at all about tranche name registration/unregistration if NULL was passed in at create time. Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA) in every process in RegisterLWLockTranches. That way, you'd get a useful name in pg_stat_activity for other backends that are running parallel queries if they are ever waiting for these locks (unlikely but interesting to know abotu if it happens). -- Thomas Munro http://www.enterprisedb.com
Thomas Munro writes: > On Wed, Dec 28, 2016 at 11:38 AM, Andreas Seltenreich > <seltenreich@gmx.de> wrote: >> Thomas Munro writes: >> >>> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: >>>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: >>>>> testing master as of fe591f8bf6 produced a crash reading >>>>> pg_stat_activity (backtrace below). Digging around with with gdb >>>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a >>>>> classId PG_WAIT_LWLOCK. >>>>> >>>>> I think the culprit is dsa.c passing a pointer to memory that goes away >>>>> on dsa_free() as a name to LWLockRegisterTranche. >> [..] >>>> Maybe we should replace it with another value when the DSA area is >>>> detached, using a constant string. Something like >> >> I'm wondering: Is it safe to pass a pointer into a DSA at all? If I >> understand the comments correctly, they are not necessarily mapped (at >> the same address) in an unrelated backend looking into pg_stat_activity, >> and in this case a dsa_free() is not actually needed to trigger a crash. > > It is safe, as long as the segment remains mapped. Each backend that > attaches calls LWLockRegisterTranche giving it the address of the name > in its virtual address space. Hmok, I was under the impression only backends participating in the IPC call the attach function, not necessarily the ones that could possible want to resolve the wait_event_info they found in the procArray via pgstat_get_wait_event(). >> Maybe instead of copying the name, just put the passed pointer itself >> into the area? Extensions using LWLockNewTrancheId need to use >> shared_preload_libraries anyway, so static strings would be mapped in >> all backends. > > Yeah that would be another way. I had this idea that only the process > that creates a DSA area should name it, and then processes attaching > would see the existing tranche ID and name, so could use a narrower > interface. We could instead do as you say and make processes that > attach provide a pointer to the name too, and make it the caller's > problem to ensure that the pointers remain valid long enough; or go > one step further and make them register/unregister it themselves. Hmm, turning the member of the control struct char lwlock_tranche_name[DSA_MAXLEN]; into const char *lwlock_tranche_name; and initializing it with the passed static const char * instead of copying wouldn't require a change of the interface, would it? But I really feel like I need to study the code a bit more before commenting further… regards, Andreas
Andreas Seltenreich writes: > Thomas Munro writes: > >> It is safe, as long as the segment remains mapped. Each backend that >> attaches calls LWLockRegisterTranche giving it the address of the name >> in its virtual address space. > > Hmok, I was under the impression only backends participating in the IPC > call the attach function, not necessarily the ones that could possible > want to resolve the wait_event_info they found in the procArray via > pgstat_get_wait_event(). Erm, ignore that question: They'll find a NULL in their LWLockTrancheArray and run into the "extension" case you mentioned. > But I really feel like I need to study the code a bit more before > commenting further… Following this advise now :-) regards, Andreas
On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > But I'm starting to think that the best way might be to do BOTH of the > things I said in my previous message: make dsa.c register on > create/attach and also unregister before detaching iff the name was > supplied at creation time for the benefit of extension writers, but > make it not do anything at all about tranche name > registration/unregistration if NULL was passed in at create time. > Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA) > in every process in RegisterLWLockTranches. That way, you'd get a > useful name in pg_stat_activity for other backends that are running > parallel queries if they are ever waiting for these locks (unlikely > but interesting to know abotu if it happens). Maybe something like the attached. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Thomas Munro writes: > [2. text/x-diff; fix-dsa-tranche-registration.patch] Fuzzing with the patch applied hasn't triggered the crash so far. It did happen 5 times with the same amount of fuzzing before patching. regards, Andreas
On Tue, Dec 27, 2016 at 9:28 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> But I'm starting to think that the best way might be to do BOTH of the >> things I said in my previous message: make dsa.c register on >> create/attach and also unregister before detaching iff the name was >> supplied at creation time for the benefit of extension writers, but >> make it not do anything at all about tranche name >> registration/unregistration if NULL was passed in at create time. >> Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA) >> in every process in RegisterLWLockTranches. That way, you'd get a >> useful name in pg_stat_activity for other backends that are running >> parallel queries if they are ever waiting for these locks (unlikely >> but interesting to know abotu if it happens). > > Maybe something like the attached. Now that array_base and array_stride are gone, I don't see any reason why the DSA machinery needs to be aware of tranche names at all. So I propose to rip all that out, as in the attached. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Thu, Jan 5, 2017 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Dec 27, 2016 at 9:28 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> On Wed, Dec 28, 2016 at 11:57 AM, Thomas Munro >> <thomas.munro@enterprisedb.com> wrote: >>> But I'm starting to think that the best way might be to do BOTH of the >>> things I said in my previous message: make dsa.c register on >>> create/attach and also unregister before detaching iff the name was >>> supplied at creation time for the benefit of extension writers, but >>> make it not do anything at all about tranche name >>> registration/unregistration if NULL was passed in at create time. >>> Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA) >>> in every process in RegisterLWLockTranches. That way, you'd get a >>> useful name in pg_stat_activity for other backends that are running >>> parallel queries if they are ever waiting for these locks (unlikely >>> but interesting to know abotu if it happens). >> >> Maybe something like the attached. > > Now that array_base and array_stride are gone, I don't see any reason > why the DSA machinery needs to be aware of tranche names at all. So I > propose to rip all that out, as in the attached. The way I proposed makes it a lot easier to work with dynamic names so you can differentiate variable numbers of areas; the names would have exactly the right extent and they'd get unregistered in each backend at just the right time. On the other hand, I don't really like seeing lock tranche stuff leaking into other APIs like this, and using tranche IDs in any way other than allocating a small number of them at startup isn't really supported anyway, so +1 for doing it your way. At one stage I had an idea that that argument was naming the DSA area, not the lock tranche, and then the lock tranche happened to use a name that was built from that name, but that doesn't make any sense if it's optional depending on whether you already registered the lock tranche... - char lwlock_tranche_name[DSA_MAXLEN]; You can remove the now-unused DSA_MAXLEN macro. -- Thomas Munro http://www.enterprisedb.com
On Wed, Jan 4, 2017 at 5:03 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > The way I proposed makes it a lot easier to work with dynamic names so > you can differentiate variable numbers of areas; the names would have > exactly the right extent and they'd get unregistered in each backend > at just the right time. Only from the perspective of the backend that's using DSA. From the perspective of some other backend reading pg_stat_activity, it's all wrong. There, you want the name to registered as early as possible - either at system startup time for builtin things or at module load time for extensions. With the way you have it, you'd only be able to recognize a lock wait as being related to parallel_query_dsa if your session had previously executed a parallel query. That's clearly not desirable. With this approach, _PG_init can do LWLockRegisterTranche and then if you stick the library into shared_preload_libraries or session_preload_libraries *all* backends have that tranche whether they use the library or not. If the tranche registry were shared, then your approach would be fine, but it isn't. > On the other hand, I don't really like seeing > lock tranche stuff leaking into other APIs like this, and using > tranche IDs in any way other than allocating a small number of them at > startup isn't really supported anyway, so +1 for doing it your way. OK. > At one stage I had an idea that that argument was naming the DSA area, > not the lock tranche, and then the lock tranche happened to use a name > that was built from that name, but that doesn't make any sense if it's > optional depending on whether you already registered the lock > tranche... > > - char lwlock_tranche_name[DSA_MAXLEN]; > > You can remove the now-unused DSA_MAXLEN macro. Ah, thanks. Committed with that change. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company