Thread: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
[HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Gaddam Sai Ram
Date:
Hello everyone,
Based on the discussion in the thread below, I built an extension using DSA (PostgreSQL 10 beta 3, on a Linux machine).
> Use _PG_init and the shmem hook to reserve a little bit of traditional shared memory and initialise it to zero. This will be used just to share the DSA handle, but you can't actually create the DSA area in postmaster. In other words, this little bit of shared memory is for "discovery", since it can be looked up by name from any backend.
Yes, I have reserved shared memory for the DSA handle, but not for the actual DSA area.
> In each backend that wants to use your new in-memory index system, you need to be able to attach or create the DSA area on-demand. Perhaps you could have a get_my_shared_state() function (insert better name) that uses a static local variable to hold a pointer to some state. If it's NULL, you know you need to create the state. That should happen only once in each backend, the first time through the function. In that case you need to create or attach to the DSA area as appropriate, which you should wrap in LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE)/LWLockRelease(AddinShmemInitLock) to serialise the code block. First, look up the bit of traditional shared memory to see if there is a DSA handle published in it already. If there is you can attach. If there isn't, you are the first so you need to create, and publish the handle for others to attach to. Remember whatever state you need to remember, such as the dsa_area, in static local variables so that all future calls to get_my_shared_state() in that backend will be fast.
Yes, that code is in gstore_shmem.c (please find attached). The first process to use DSA creates the area; every later process either attaches to it or, if it has already attached, reuses the area it already has pinned.
==> I have created a bgworker in _PG_init. When it starts, it is the first process to access DSA, so it creates the DSA area.
==> I have a small UDF (simple_udf_func) which I call in a new backend (process), so it attaches to the DSA area that was already created.
==> When I call the same UDF again in the same process, the area is already attached and pinned, so I reuse the area that I stored in a global variable while attaching/creating. This is where I get the problem...
Error details: dsa_area could not attach to a segment that has been freed
While examining this in detail, I found the following.
I used dsa_dump() for debugging, and in the error case I get this log:
dsa_area handle 1:
max_total_segment_size: 0
total_segment_size: 0
refcnt: 0
pinned: f
segment bins:
segment bin 0 (at least -2147483648 contiguous pages free):
Clearly, the data in my DSA area has been corrupted in the latter case, but my bgworker continues to work properly with the same dsa_area handle.
At this stage, the dsa_dump() in my bgworker is as below:
dsa_area handle 1814e630:
max_total_segment_size: 18446744073709551615
total_segment_size: 1048576
refcnt: 3
pinned: t
segment bins:
segment bin 8 (at least 128 contiguous pages free):
segment index 0, usable_pages = 253, contiguous_pages = 220, mapped at 0x7f0abbd58000
As I'm pinning the DSA mapping after attach, it should stay throughout the backend session, but I'm not sure why it is freed/corrupted.
Kindly help me in fixing this issue. I have attached a copy of my extension, which reproduces the issue.
Regards
G. Sai Ram
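For readers following along, the create-or-attach flow quoted in the message above can be modelled in plain C. This is only a sketch under stated assumptions, not the code from the attached extension: `discovery_slot`, `toy_area`, `backend_state` and `next_handle` are hypothetical stand-ins for the bit of traditional shared memory, the backend-local dsa_area object, and each backend's static variables. The comments mark where real code would take and release AddinShmemInitLock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the zero-initialised bit of traditional shared memory. */
typedef struct
{
    unsigned published_handle;  /* 0 means "no area created yet" */
} discovery_slot;

/* Toy stand-in for the backend-local object returned by create/attach. */
typedef struct
{
    unsigned handle;
    bool     created_here;      /* true only in the backend that created it */
} toy_area;

/* Toy stand-in for one backend's static local state. */
typedef struct
{
    toy_area *area;             /* NULL until the first call in this backend */
} backend_state;

static unsigned next_handle = 1;    /* hypothetical handle generator */

/* Return this backend's area, creating or attaching on first use. */
static toy_area *
get_my_shared_state(backend_state *me, discovery_slot *slot)
{
    toy_area *area;

    if (me->area != NULL)
        return me->area;        /* fast path: already set up in this backend */

    /* Real code would take AddinShmemInitLock exclusively here. */
    area = malloc(sizeof(*area));
    if (area == NULL)
        return NULL;

    if (slot->published_handle == 0)
    {
        /* First backend: create, then publish the handle for others. */
        area->handle = next_handle++;
        area->created_here = true;
        slot->published_handle = area->handle;
    }
    else
    {
        /* Later backend: attach using the published handle. */
        area->handle = slot->published_handle;
        area->created_here = false;
    }
    /* Real code would release AddinShmemInitLock here. */

    me->area = area;            /* remembered for all later calls */
    return area;
}
```

In this model, the first caller (the bgworker in the report) creates and publishes, a second `backend_state` attaches to the same handle, and a repeated call from the same backend returns the cached pointer without touching the slot.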
[HACKERS] Re: Error: dsa_area could not attach to a segment that has been freed
From
Gaddam Sai Ram
Date:
Kindly help me with the issue described above.
Thanks
G. Sai Ram
---- On Fri, 15 Sep 2017 13:21:33 +0530 Gaddam Sai Ram <gaddamsairam.n@zohocorp.com> wrote ----
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Thomas Munro
Date:
On Fri, Sep 15, 2017 at 7:51 PM, Gaddam Sai Ram <gaddamsairam.n@zohocorp.com> wrote:
> As i'm pinning the dsa mapping after attach, it has to stay through out the
> backend session. But not sure why its freed/corrupted.
>
> Kindly help me in fixing this issue. Attached the copy of my extension,
> which will reproduce the same issue.

Your DSA area is pinned and the mapping is pinned, but there is one more thing that goes away automatically unless you nail it to the table: the backend-local dsa_area object which dsa_create() and dsa_attach() return. That's allocated in the "current memory context", so if you do it from your procedure simple_udf_func without making special arrangements it gets automatically freed at end of transaction. If you're going to cache it for the whole life of the backend, you'd better make sure it's allocated in memory context that lives long enough. Where you have dsa_create() and dsa_attach() calls, try coding like this:

    MemoryContext old_context;

    old_context = MemoryContextSwitchTo(TopMemoryContext);
    area = dsa_create(...);
    MemoryContextSwitchTo(old_context);

    old_context = MemoryContextSwitchTo(TopMemoryContext);
    area = dsa_attach(...);
    MemoryContextSwitchTo(old_context);

You'll need to #include "utils/memutils.h".

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
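The lifetime problem described in this reply can be seen in a small standalone C model. It is a sketch under loud assumptions: `toy_context`, `toy_switch_to`, `toy_attach` and `toy_end_of_transaction` are hypothetical stand-ins and not PostgreSQL APIs, and the model zeroes the short-lived context instead of freeing it so that the stale state can be observed safely (in a real backend, reading the freed object is undefined behaviour).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOTS_PER_CONTEXT 4

/* Toy stand-in for the backend-local dsa_area object. */
typedef struct
{
    int refcnt;
    int pinned;
} toy_area;

/* Toy stand-in for a memory context: a few slots plus a bump counter. */
typedef struct
{
    toy_area slots[SLOTS_PER_CONTEXT];
    size_t   used;
} toy_context;

static toy_context top_context;          /* lives as long as the process */
static toy_context transaction_context;  /* wiped at end of transaction */
static toy_context *current_context = &transaction_context;

/* Switch the current context and return the previous one. */
static toy_context *
toy_switch_to(toy_context *cxt)
{
    toy_context *old = current_context;

    current_context = cxt;
    return old;
}

/* "Attach": the returned object is carved out of the current context. */
static toy_area *
toy_attach(void)
{
    toy_area *area;

    if (current_context->used >= SLOTS_PER_CONTEXT)
        return NULL;
    area = &current_context->slots[current_context->used++];
    area->refcnt = 3;
    area->pinned = 1;
    return area;
}

/* End of transaction: everything in the short-lived context goes away. */
static void
toy_end_of_transaction(void)
{
    memset(&transaction_context, 0, sizeof(transaction_context));
}
```

A pointer returned by `toy_attach()` while `transaction_context` is current reads as all zeros after `toy_end_of_transaction()`, while one obtained between `toy_switch_to(&top_context)` and the matching switch back keeps its contents. That is the same shape as wrapping dsa_create()/dsa_attach() in MemoryContextSwitchTo(TopMemoryContext).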
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Gaddam Sai Ram
Date:
Thank you very much! That fixed my issue! :)
I had assumed that pinning the area would extend its lifetime, but after taking the memory context into account it is working fine!
Regards
G. Sai Ram
---- On Wed, 20 Sep 2017 11:16:19 +0530 Thomas Munro <thomas.munro@enterprisedb.com> wrote ----
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Thomas Munro
Date:
On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram <gaddamsairam.n@zohocorp.com> wrote:
> Thank you very much! That fixed my issue! :)
> I was in an assumption that pinning the area will increase its lifetime but
> yeah after taking memory context into consideration its working fine!

So far the success rate in confusing people who first try to make long-lived DSA areas and DSM segments is 100%. Basically, this is all designed to ensure automatic cleanup of resources in short-lived scopes.

Good luck for your graph project. I think you're going to have to expend a lot of energy trying to avoid memory leaks if your DSA lives as long as the database cluster, since error paths won't automatically free any memory you allocated in it. Right now I don't have any particularly good ideas for mechanisms to deal with that. PostgreSQL C has exception-like error handling, but doesn't (and probably can't) have a language feature like scoped destructors from C++. IMHO exceptions need either destructors or garbage collection to keep you sane. There is a kind of garbage collection for palloc'd memory and also for other resources like file handles, but if you're using a big long lived DSA area you have nothing like that. You can use PG_TRY/PG_CATCH very carefully to clean up, or (probably better) you can try to make sure that all your interaction with shared memory is no-throw (note that that means using dsa_allocate_extended(x, DSA_ALLOC_NO_OOM), because dsa_allocate itself can raise errors). The first thing I'd try would probably be to keep all shmem-allocating code in as few routines as possible, and use only no-throw operations in the 'critical' regions of them, and maybe look into some kind of undo log of things to free or undo in case of error to manage multi-allocation operations if that turned out to be necessary.

--
Thomas Munro
http://www.enterprisedb.com
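The "undo log" idea for multi-allocation operations can be sketched in standalone C. This is an illustration only: `undo_log`, `alloc_no_throw`, `alloc_logged`, `undo_all`, `toy_node` and the `fail_after` fault-injection knob are all hypothetical names, and `malloc` stands in for a no-throw shared-memory allocation such as dsa_allocate_extended() with DSA_ALLOC_NO_OOM. The point shown is that an allocator which returns NULL, rather than raising an error, lets one routine roll back everything it allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define UNDO_MAX 8

/* Hypothetical undo log: remembers allocations made by one operation. */
typedef struct
{
    void  *ptrs[UNDO_MAX];
    size_t count;
} undo_log;

/* Toy structure that needs three allocations to build. */
typedef struct
{
    void *edges[2];
} toy_node;

static int live_allocations = 0;    /* model-only leak counter */
static int fail_after = -1;         /* model-only fault injection; -1 = never */

/* No-throw allocator: returns NULL on failure instead of raising an error. */
static void *
alloc_no_throw(size_t size)
{
    void *p;

    if (fail_after == 0)
        return NULL;
    if (fail_after > 0)
        fail_after--;
    p = malloc(size);
    if (p != NULL)
        live_allocations++;
    return p;
}

/* Allocate and record the result; NULL if allocation fails or log is full. */
static void *
alloc_logged(undo_log *log, size_t size)
{
    void *p;

    if (log->count == UNDO_MAX)
        return NULL;
    p = alloc_no_throw(size);
    if (p != NULL)
        log->ptrs[log->count++] = p;
    return p;
}

/* Roll back: free everything the failed operation allocated, newest first. */
static void
undo_all(undo_log *log)
{
    while (log->count > 0)
    {
        free(log->ptrs[--log->count]);
        live_allocations--;
    }
}

/* Either builds the whole node or leaves no allocation behind. */
static bool
insert_node_with_two_edges(toy_node **out)
{
    undo_log  log = {{NULL}, 0};
    toy_node *node = alloc_logged(&log, sizeof(*node));
    void     *edge1 = alloc_logged(&log, 32);
    void     *edge2 = alloc_logged(&log, 32);

    if (node == NULL || edge1 == NULL || edge2 == NULL)
    {
        undo_all(&log);
        return false;
    }
    node->edges[0] = edge1;
    node->edges[1] = edge2;
    *out = node;            /* success: ownership passes to the caller */
    return true;
}
```

Setting `fail_after = 2` makes the third allocation fail; the function then returns false and `live_allocations` drops back to zero, which is the property a long-lived shared area needs because nothing else will reclaim the memory.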
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Craig Ringer
Date:
On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram
> <gaddamsairam.n@zohocorp.com> wrote:
> > Thank you very much! That fixed my issue! :)
> > I was in an assumption that pinning the area will increase its lifetime but
> > yeah after taking memory context into consideration its working fine!
>
> So far the success rate in confusing people who first try to make
> long-lived DSA areas and DSM segments is 100%. Basically, this is all
> designed to ensure automatic cleanup of resources in short-lived
> scopes.
90% ;)
I got it working with no significant issues for a long lived segment used to store a pool of shm_mq pairs used for a sort of "connection listener" bgworker. Though I only used DSM+ToC, not DSA. But TBH that may well be luck, as I tend to routinely use memory contexts scoped to the operational lifetime of a subsystem, making most problems like this just vanish without my realising they were there in the first place. Usually.
I pretty much shamelessly cribbed from test_shm_mq for the ToC stuff though. It's simple enough when you read it in use, but I'd be lucky to do it without an example.
I had lots more problems with shm_mq than DSM. shm_mq is very obviously designed for short-lived scopes, and falls down badly if you have a pool of queues you want to re-use after the peer detaches. You have to track "in use" flags separately to the shm_mq's own, because it doesn't clear its stored PGPROC entries for receiver/sender on detach. Once you know neither sender nor receiver is still attached, you can memset() the area and create a new queue in it.
You can't just reset the queue for a new peer; you have to do quite a dance to make sure it's safe to detach from, overwrite, re-create and re-attach to.
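The bookkeeping described here (separate "in use" flags, and wiping a slot only once both peers are gone) can be sketched in standalone C. The names `pool_slot`, `toy_queue`, `pool_acquire` and the detach helpers are hypothetical, the model uses no shm_mq calls, and it omits the locking a real shared pool would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4

/* Toy stand-in for the memory that holds one queue. */
typedef struct
{
    int  generation;        /* bumped each time a queue is created here */
    char payload[32];
} toy_queue;

/* Pool bookkeeping kept separately from the queue's own state. */
typedef struct
{
    bool      sender_attached;
    bool      receiver_attached;
    toy_queue queue;
} pool_slot;

static pool_slot pool[POOL_SIZE];

/* A slot may be recycled only once neither peer is attached. */
static bool
slot_is_free(const pool_slot *slot)
{
    return !slot->sender_attached && !slot->receiver_attached;
}

/* Claim a free slot: wipe the old queue memory, then create a fresh queue. */
static pool_slot *
pool_acquire(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
    {
        pool_slot *slot = &pool[i];

        if (slot_is_free(slot))
        {
            int next_generation = slot->queue.generation + 1;

            memset(&slot->queue, 0, sizeof(slot->queue));
            slot->queue.generation = next_generation;
            slot->sender_attached = true;
            slot->receiver_attached = true;
            return slot;
        }
    }
    return NULL;            /* every slot still has at least one peer */
}

static void
pool_detach_sender(pool_slot *slot)
{
    slot->sender_attached = false;
}

static void
pool_detach_receiver(pool_slot *slot)
{
    slot->receiver_attached = false;
}
```

In this model a slot whose sender has detached but whose receiver has not is still unavailable; only after both detach calls does `pool_acquire()` hand it out again, with a new generation number.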
> Good luck for your graph project. I think you're going to have to
> expend a lot of energy trying to avoid memory leaks if your DSA lives
> as long as the database cluster, since error paths won't automatically
> free any memory you allocated in it.
Yeah, that's going to be hard. You might land up having lots and lots of little DSM segments.
> There is a kind of garbage collection for palloc'd memory and
> also for other resources like file handles, but if you're using a big
> long lived DSA area you have nothing like that.
We need, IMO, a DSA-backed hierarchical MemoryContext system.
We can't use the exact MemoryContext API as-is due to the need for far pointers though :(
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Craig Ringer
Date:
On 20 September 2017 at 17:52, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> > On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram
> > <gaddamsairam.n@zohocorp.com> wrote:
> > > Thank you very much! That fixed my issue! :)
> > > I was in an assumption that pinning the area will increase its lifetime but
> > > yeah after taking memory context into consideration its working fine!
> >
> > So far the success rate in confusing people who first try to make
> > long-lived DSA areas and DSM segments is 100%. Basically, this is all
> > designed to ensure automatic cleanup of resources in short-lived
> > scopes.
>
> 90% ;)
>
> I got it working with no significant issues for a long lived segment used to store a pool of shm_mq pairs used for a sort of "connection listener" bgworker. Though I only used DSM+ToC, not DSA.
By the way, dsa.c really needs a cross-reference to shm_toc.c and vice versa. With a hint as to when each is appropriate.
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Robert Haas
Date:
On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
> versa. With a hint as to when each is appropriate.

/me blinks.

Aren't those almost-entirely-unrelated facilities?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com>
> wrote:
>> There is a kind of garbage collection for palloc'd memory and
>> also for other resources like file handles, but if you're using a big
>> long lived DSA area you have nothing like that.

> We need, IMO, a DSA-backed hierarchical MemoryContext system.

Perhaps the ResourceManager subsystem would help here.

regards, tom lane
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Thomas Munro
Date:
On Thu, Sep 21, 2017 at 12:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
>> versa. With a hint as to when each is appropriate.
>
> /me blinks.
>
> Aren't those almost-entirely-unrelated facilities?

I think I see what Craig means.

1. A DSM segment works if you know how much space you'll need up front so that you can size it. shm_toc provides a way to exchange pointers into it with other backends in the form of shm_toc keys (perhaps implicitly, in the form of well known keys or a convention like executor node ID -> shm_toc key). Examples: Fixed sized state for parallel-aware executor nodes, and fixed size parallel executor infrastructure.

2. A DSA area is good if you don't know how much space you'll need yet. dsa_pointer provides a way to exchange pointers into it with other backends. Examples: A shared cache, an in-memory database object like Gaddam Sai Ram's graph index extension, variable sized state for parallel-aware executor nodes, the shared record typmod registry stuff.

Perhaps confusingly we also support DSA areas inside DSM segments, there are DSM segments inside DSA areas. We also use DSM segments as a kind of shared resource cleanup mechanism, and don't yet provide an equivalent for DSA. I haven't proposed anything like that because I feel like there may be a better abstraction of reliable scoped cleanup waiting to be discovered (as I think Craig was also getting at).

--
Thomas Munro
http://www.enterprisedb.com
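Why backends exchange a dsa_pointer or a shm_toc key rather than a raw address can be shown with a small standalone C model. It rests on assumptions: `toy_pointer`, `toy_mapping` and the helper functions are hypothetical, and two separate buffers holding the same bytes stand in for one shared segment mapped at different addresses in two backends. In PostgreSQL itself, the conversion from a dsa_pointer to a backend-local address is done by dsa_get_address().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEGMENT_SIZE 128

/* Relative pointer: an offset from the segment start; 0 means invalid. */
typedef uint64_t toy_pointer;

/* Toy stand-in for one backend's mapping of a shared segment. */
typedef struct
{
    unsigned char *base;    /* where this backend happens to map the segment */
} toy_mapping;

/* Convert a relative pointer to an address valid in this mapping only. */
static void *
toy_get_address(const toy_mapping *map, toy_pointer p)
{
    return (p == 0) ? NULL : (void *) (map->base + p);
}

/* Store an int at the location named by a relative pointer. */
static void
toy_store_int(const toy_mapping *map, toy_pointer p, int value)
{
    memcpy(toy_get_address(map, p), &value, sizeof(value));
}

/* Load an int from the location named by a relative pointer. */
static int
toy_load_int(const toy_mapping *map, toy_pointer p)
{
    int value;

    memcpy(&value, toy_get_address(map, p), sizeof(value));
    return value;
}
```

If one mapping stores 42 at offset 16 and the bytes are then made visible through a second mapping at a different base address, the same `toy_pointer` loads 42 there too, even though the two raw addresses differ. A raw pointer saved by the first backend would be meaningless in the second.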
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Craig Ringer
Date:
On 21 September 2017 at 05:50, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Thu, Sep 21, 2017 at 12:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> >> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
> >> versa. With a hint as to when each is appropriate.
> >
> > /me blinks.
> >
> > Aren't those almost-entirely-unrelated facilities?
>
> I think I see what Craig means.
>
> 1. A DSM segment works if you know how much space you'll need up
> front so that you can size it. shm_toc provides a way to exchange
> pointers into it with other backends in the form of shm_toc keys
> (perhaps implicitly, in the form of well known keys or a convention
> like executor node ID -> shm_toc key). Examples: Fixed sized state
> for parallel-aware executor nodes, and fixed size parallel executor
> infrastructure.
>
> 2. A DSA area is good if you don't know how much space you'll need
> yet. dsa_pointer provides a way to exchange pointers into it with
> other backends. Examples: A shared cache, an in-memory database
> object like Gaddam Sai Ram's graph index extension, variable sized
> state for parallel-aware executor nodes, the shared record typmod
> registry stuff.
>
> Perhaps confusingly we also support DSA areas inside DSM segments,
> there are DSM segments inside DSA areas. We also use DSM segments as
> a kind of shared resource cleanup mechanism, and don't yet provide an
> equivalent for DSA. I haven't proposed anything like that because I
> feel like there may be a better abstraction of reliable scoped cleanup
> waiting to be discovered (as I think Craig was also getting at).
Well said, and what I would've wanted to say if I could've figured it out well enough to express it.
Hence needing some kind of README or cross reference to help people know which facility/facilities are suitable for their needs... and actually discover them.
(A hint on RequestAddinShmemSpace etc pointing to DSM + DSA would be good too)
Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed
From
Gaddam Sai Ram
Date:
Hi Thomas,
Thanks for cautioning us about possible memory leaks (during error cases) in case of long-lived DSA segments.
Actually, we are following an approach to avoid these DSA memory leaks. Let me explain our implementation; please validate it and correct us in case we have missed anything.
Implementation:
Basically, we have to put our index data (index column value vs. ctid), which we get in the aminsert callback function, into memory.
Coming to the implementation, in the aminsert callback function:
- We switch to CurTransactionContext.
- We cache the DMLs of a transaction in a dlist (global per process).
- Even if different clients work in parallel, it won't be a problem, because every client gets its own dlist in a separate process, with its own CurTransactionContext.
- We have registered a transaction callback (using the RegisterXactCallback() function). During the pre-commit event (XACT_EVENT_PRE_COMMIT), we populate all the transaction-specific DMLs (from the dlist) into our in-memory index (DSA), inside a PG_TRY/PG_CATCH block.
- If we get an error (because of dsa_allocate() or something else) while processing the dlist (while populating the in-memory index), we clean up, in the PG_CATCH block, the DSA memory allocated or used up to that point.
- In other error cases, the transaction typically gets aborted and the PRE_COMMIT event is not called, so we don't touch DSA at that time and there are no leaks to worry about.
- Even the subtransaction case is handled, with subtransaction callbacks.
- CurTransactionContext (basically the dlist) is automatically cleared after that particular transaction.
I want to know whether this approach is good and works well in all cases. Kindly provide your feedback on this.
Regards
G. Sai Ram
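The buffering scheme listed in the message above can be modelled in standalone C. This is a sketch, not the extension's code: `pending`, `shared_index`, `remember_insert`, `apply_at_pre_commit` and `discard_at_abort` are hypothetical names, fixed-size arrays stand in for the dlist and the DSA-backed index, and a full index stands in for an allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING    8
#define INDEX_CAPACITY 4

typedef struct
{
    int key;    /* index column value */
    int ctid;   /* stand-in for the tuple identifier */
} index_entry;

/* Per-backend, per-transaction buffer (the role the dlist plays above). */
static index_entry pending[MAX_PENDING];
static size_t      pending_count = 0;

/* Toy stand-in for the shared in-memory index. */
static index_entry shared_index[INDEX_CAPACITY];
static size_t      shared_count = 0;

/* Insert time: only remember the change, never touch the shared index. */
static bool
remember_insert(int key, int ctid)
{
    if (pending_count == MAX_PENDING)
        return false;
    pending[pending_count].key = key;
    pending[pending_count].ctid = ctid;
    pending_count++;
    return true;
}

/* Pre-commit: apply everything, or undo the partial work and fail. */
static bool
apply_at_pre_commit(void)
{
    size_t before = shared_count;

    for (size_t i = 0; i < pending_count; i++)
    {
        if (shared_count == INDEX_CAPACITY)
        {
            shared_count = before;  /* roll back entries added so far */
            pending_count = 0;
            return false;
        }
        shared_index[shared_count++] = pending[i];
    }
    pending_count = 0;
    return true;
}

/* Abort: drop the buffered changes; the shared index was never touched. */
static void
discard_at_abort(void)
{
    pending_count = 0;
}
```

In the model, an aborted transaction leaves `shared_count` unchanged, a committed one adds all of its entries, and a transaction that fails partway through the apply step leaves the index exactly as it was before the step began.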
---- On Wed, 20 Sep 2017 14:25:43 +0530 Thomas Munro <thomas.munro@enterprisedb.com> wrote ----