RE: Copy data to DSA area - Mailing list pgsql-hackers

From	Ideriha, Takeshi
Subject	RE: Copy data to DSA area
Date	November 9, 2018 01:19:25
Msg-id	4E72940DA2BF16479384A86D54D0988A6F1EFEC3@G01JPEXMBKW04
In response to	Re: Copy data to DSA area (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses	Re: Copy data to DSA area
List	pgsql-hackers
Hi, thank you for all the comments.
They are really helpful.

>From: Thomas Munro [mailto:thomas.munro@enterprisedb.com]
>Sent: Wednesday, November 7, 2018 1:35 PM
>
>On Wed, Nov 7, 2018 at 3:34 PM Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>
>wrote:
>> Related to my development (putting relcache and catcache onto shared
>> memory)[1],
>> I have some questions about how to copy variables into shared memory,
>> especially a DSA area, and how to implement it.
>>
>> Under the current architecture, when initializing some data, we
>> sometimes copy certain data using specific functions
>> such as CreateTupleDescCopyConstr(), datumCopy(), pstrdup() and so on.
>> These copy functions call palloc() and allocate the
>> copied data in the current memory context.
>
>Yeah, I faced this problem in typcache.c, and you can see the function
>share_tupledesc() which copies TupleDesc objects into a DSA area.
>This doesn't really address the fundamental problem here though... see below.

I checked share_tupledesc(). My original motivation was to copy a tupledesc
together with its constraints, as CreateTupleDescCopyConstr() does.
But yes, as you state here and below, this method makes sense when the
tupledesc constraints do not have to be copied.

>
>> But on the shared memory, palloc() cannot be used. Now I'm trying to
>> use DSA (and dshash) to handle data on the shared memory,
>> so for example dsa_allocate() is needed instead of palloc(). I hit upon
>> three ways to implement this.
>>
>> A. Copy existing functions and write equivalent DSA-version copy
>>    functions like CreateDSATupleDescCopyConstr(),
>>    datumCopyDSA(), dsa_strdup().
>>
>>    In these functions the logic is the same as the current one, but
>>    palloc() would be replaced with dsa_allocate().
>>
>>    But this way looks too straightforward, and the code becomes less
>>    readable and maintainable.
>>
>> B. Use the current functions to copy data into a local memory context
>>    temporarily, and copy it again to the DSA area.
>>
>>    This method looks better than plan A because we don't need to
>>    write clone functions with copy-paste.
>>
>>    But copying twice would degrade performance.
>
>It's nice when you can construct objects directly at an address supplied by the caller.
>In other words, if allocation and object initialization are two separate steps, you can
>put the object anywhere you like without copying.  That could be on the stack, in an
>array, inside another object, in regular heap memory, in traditional shared memory, in
>a DSM segment or in DSA memory.  I asked for an alloc/init separation in the Bloom
>filter code for the same reason.  But this still isn't the real problem here...

Yes, actually I tried to create a new function TupleDescCopyConstr(), which is almost the same
as TupleDescCopy() except that it also copies constraints. It was supposed to separate allocation
from initialization. But as you point out below, I had to manage an object graph with pointers,
and that is where I ran into difficulty.
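
To make the alloc/init separation concrete, here is a minimal standalone sketch. It is not PostgreSQL code: FlatDesc, flat_desc_size() and flat_desc_copy_into() are hypothetical names, and a malloc'd block stands in for a shared segment. It only works because the object is flat, which is exactly what breaks down once constraints with their own pointers are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat object: a header followed by a variable-length array. */
typedef struct FlatDesc
{
	int			natts;
	int			atttypid[];		/* flexible array member, no pointers */
} FlatDesc;

/* Step 1: report how many bytes a descriptor with natts attributes needs. */
size_t
flat_desc_size(int natts)
{
	return offsetof(FlatDesc, atttypid) + sizeof(int) * (size_t) natts;
}

/* Step 2: initialize a copy at an address chosen by the caller. */
void
flat_desc_copy_into(FlatDesc *dst, const FlatDesc *src)
{
	memcpy(dst, src, flat_desc_size(src->natts));
}

/* Copies a descriptor into the middle of a larger block; returns 1 on success. */
int
flat_desc_demo(void)
{
	int			ok = 0;
	FlatDesc   *src = malloc(flat_desc_size(3));
	char	   *segment = malloc(256);	/* stands in for shared memory */

	if (src != NULL && segment != NULL)
	{
		FlatDesc   *dst;

		src->natts = 3;
		src->atttypid[0] = 10;	/* arbitrary ids for illustration */
		src->atttypid[1] = 20;
		src->atttypid[2] = 30;

		/* The caller decides where the copy lives; no allocation happens here. */
		assert(64 + flat_desc_size(src->natts) <= 256);
		dst = (FlatDesc *) (segment + 64);
		flat_desc_copy_into(dst, src);

		ok = (dst->natts == 3 && dst->atttypid[2] == 30);
	}
	free(src);
	free(segment);
	return ok;
}
```

The destination could equally be a stack buffer or memory obtained from dsa_allocate(); the copy function does not need to know.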

>> C. Enhance the feature of palloc() and MemoryContext.
>>
>>    This is a rough idea but, for instance, make a new global flag to
>>    tell palloc() to use a DSA area instead of the normal MemoryContext.
>>
>>    MemoryContextSwitchToDSA(dsa_area *area) tells the following palloc()
>>    calls to allocate memory in DSA.
>>
>>    And MemoryContextSwitchBack(dsa_area) tells palloc() to behave as
>>    normal again.
>>
>>    MemoryContextSwitchToDSA(dsa_area);
>>    palloc(size);
>>    MemoryContextSwitchBack(dsa_area);
>>
>> Plan C seems a handy way for DSA users because people can take
>> advantage of existing functions.
>
>The problem with plan C is that palloc() has to return a native pointer, but in order to
>do anything useful with this memory (ie to share it) you also need to get the
>dsa_pointer, but the palloc() interface doesn't allow for that.  Any object that holds a
>pointer allocated with DSA-hiding-behind-palloc() will be useless for another process.

Agreed. I had not given this point much consideration.

>> What do you think about these ideas?
>
>The real problem is object graphs with pointers.  I solved that problem for TupleDesc
>by making them *almost* entirely flat, in commit c6293249.  I say 'almost' because it
>won't work for constraints or defaults, but that didn't matter for the typcache.c case
>because it doesn't use those.  In other words I danced carefully around the edge of
>the problem.
>
>In theory, object graphs, trees, lists etc could be implemented in a way that allows for
>"flat" storage, if they can be allocated in contiguous memory and refer to sub-objects
>within that space using offsets from the start of the space, and then they could be used
>without having to know whether they are in DSM/DSA memory or regular memory.
>That seems like a huge project though.  Otherwise they have to hold dsa_pointer, and
>deal with that in many places.  You can see this in the Parallel Hash code.  I had to
>make the hash table data structure able to deal with raw pointers OR dsa_pointer.
>That would be theoretically doable, but really quite painful, for the whole universe of
>PostgreSQL node types and data structures.

Yeah, thank you for summarizing this point. That is one of the difficulties I faced
but failed to state in my previous email.
I agree with your point.
Making everything "flat" is not a realistic choice. Holding dsa_pointer here and there
and switching between a native pointer and a dsa_pointer with a union type or something similar can be done, but it is still a huge task.
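
For reference, the "flat" storage idea can be sketched for a single data structure as follows. This is a toy, not PostgreSQL code: FlatList, FlatNode and the flat_* functions are hypothetical, and copying the buffer to a second heap block stands in for another process mapping the same memory at a different address. Nodes refer to each other by offset from the start of the buffer, so the list stays valid wherever the buffer ends up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FLAT_NIL ((size_t) 0)	/* offset 0 holds the header, so it can mean "none" */

typedef struct FlatList
{
	size_t		head;			/* offset of the first node, or FLAT_NIL */
	size_t		used;			/* bytes in use, including this header */
	size_t		capacity;		/* total bytes in the buffer */
} FlatList;

typedef struct FlatNode
{
	size_t		next;			/* offset of the next node, or FLAT_NIL */
	int			value;
} FlatNode;

void
flat_init(void *buf, size_t capacity)
{
	FlatList   *list = buf;

	assert(capacity >= sizeof(FlatList));
	list->head = FLAT_NIL;
	list->used = sizeof(FlatList);
	list->capacity = capacity;
}

/* Prepend a value; returns 0 on success, -1 if the buffer is full. */
int
flat_push(void *buf, int value)
{
	FlatList   *list = buf;
	size_t		off = list->used;
	FlatNode   *node;

	if (list->capacity - off < sizeof(FlatNode))
		return -1;
	node = (FlatNode *) ((char *) buf + off);
	node->value = value;
	node->next = list->head;	/* an offset, never a raw pointer */
	list->head = off;
	list->used = off + sizeof(FlatNode);
	return 0;
}

/* Walk the list using only the buffer's own base address. */
int
flat_sum(const void *buf)
{
	const FlatList *list = buf;
	int			sum = 0;

	for (size_t off = list->head; off != FLAT_NIL;)
	{
		const FlatNode *node = (const FlatNode *) ((const char *) buf + off);

		sum += node->value;
		off = node->next;
	}
	return sum;
}

/* Build a list in one buffer, move it elsewhere, and read it back. */
int
flat_demo(void)
{
	void	   *a = calloc(1, 256);
	void	   *b = calloc(1, 256);
	int			sum = -1;

	if (a != NULL && b != NULL)
	{
		flat_init(a, 256);
		flat_push(a, 1);
		flat_push(a, 2);
		flat_push(a, 3);
		memcpy(b, a, 256);		/* same bytes, different address */
		memset(a, 0, 256);		/* the copy must not depend on the original */
		sum = flat_sum(b);
	}
	free(a);
	free(b);
	return sum;
}
```

Even this small list needs its own offset-aware push and traversal code, which gives a feel for why converting every node type and data structure this way would be such a large task.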


>I know of 3 ideas that would make your idea C work, so that you could share
>something as complicated as a query plan directly without having to deserialise it to
>use it:
>
>1.  Allow the creation of DSA areas inside the traditional fixed memory segment
>(instead of DSM), in a fixed-sized space reserved by the postmaster.  That is, use
>dsa.c's ability to allocate and free memory, and possibly free a whole area at once to
>avoid leaking memory in some cases (like MemoryContexts), but in this mode
>dsa_pointer would be directly castable to a raw pointer.  Then you could provide a
>regular MemoryContext interface for it, and use it via palloc(), as you said, and all the
>code that knows how to construct lists and trees and plan nodes etc would All Just
>Work.  It would be your plan C, and all the pointers would be usable in every process,
>but limited in total size at start-up time.
>
>2.  Allow regular DSA in DSM to use raw pointers into DSM segments, by mapping
>segments at the same address in every backend.  This involves reserving a large
>virtual address range up front in the postmaster, and then managing the space,
>trapping SEGV to map/unmap segments into parts of that address space as necessary
>(instead of doing that in dsa_get_address()).  AFAIK that would work, but it seems to
>be a bit weird to go to such lengths.  It would be a kind of home-made simulation of
>threads.  On the other hand, that is what we're already doing in dsa.c, except more
>slowly due to extra software address translation from funky pseudo-addresses.
>
>3.  Something something threads.

I'm thinking of going with plan 1. Not having to think about address translation
is tempting. Plan 2 (as well as plan 3) looks like a big project.
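
As a rough illustration of what plan 1 buys, here is a single-process toy, not a design for the real thing. FixedArena and the fixed_arena_* functions are hypothetical, the allocator is a simple bump allocator rather than dsa.c, and a malloc'd block stands in for the space reserved by the postmaster. The point is that the allocator hands out ordinary pointers, so structures can link to each other directly, the total size is fixed up front, and the whole area can be released at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN ((size_t) 8)	/* assumed alignment for this sketch */

/* Hypothetical arena over a region whose size is fixed at start-up. */
typedef struct FixedArena
{
	char	   *base;
	size_t		size;
	size_t		used;
} FixedArena;

void
fixed_arena_init(FixedArena *arena, void *region, size_t size)
{
	arena->base = region;
	arena->size = size;
	arena->used = 0;
}

/* Returns an ordinary pointer into the region, or NULL when it is full. */
void *
fixed_arena_alloc(FixedArena *arena, size_t nbytes)
{
	size_t		rounded = (nbytes + ARENA_ALIGN - 1) & ~(ARENA_ALIGN - 1);
	void	   *result;

	if (rounded < nbytes || rounded > arena->size - arena->used)
		return NULL;
	result = arena->base + arena->used;
	arena->used += rounded;
	return result;
}

/* Free everything at once, as a memory context reset would. */
void
fixed_arena_reset(FixedArena *arena)
{
	arena->used = 0;
}

typedef struct PlainNode
{
	struct PlainNode *next;		/* a raw pointer, no translation needed */
	int			value;
} PlainNode;

/* Returns 1 if allocation, exhaustion and reset behave as described. */
int
fixed_arena_demo(void)
{
	FixedArena	arena;
	void	   *region = malloc(64);
	PlainNode  *first;
	PlainNode  *second;
	int			ok;

	if (region == NULL)
		return 0;
	fixed_arena_init(&arena, region, 64);

	first = fixed_arena_alloc(&arena, sizeof(PlainNode));
	second = fixed_arena_alloc(&arena, sizeof(PlainNode));
	assert(first != NULL && second != NULL);
	first->value = 1;
	first->next = second;
	second->value = 2;
	second->next = NULL;
	ok = (first->next->value == 2);

	/* The region cannot grow: a request for all 64 bytes now fails. */
	ok = ok && fixed_arena_alloc(&arena, 64) == NULL;

	/* After a reset the whole region is available again. */
	fixed_arena_reset(&arena);
	ok = ok && fixed_arena_alloc(&arena, 64) == region;

	free(region);
	return ok;
}
```

In the real plan the raw pointers would be usable in every process only because the fixed segment is mapped at the same address everywhere; the sketch does not model that part.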

Regards,
Takeshi Ideriha