Re: Experimental dynamic memory allocation of postgresql shared memory - Mailing list pgsql-hackers

From Aleksey Demakov
Subject Re: Experimental dynamic memory allocation of postgresql shared memory
Date
Msg-id CAFCwUrCZfbSCxbFZv3GiNRPZTTtEmNqvd6H=hA6xkxKyCJY1hQ@mail.gmail.com
In response to Re: Experimental dynamic memory allocation of postgresql shared memory  (Aleksey Demakov <ademakov@gmail.com>)
List pgsql-hackers
Sorry for the unclear language. It is late Friday evening here, and that is to blame.

On Sat, Jun 18, 2016 at 12:23 AM, Aleksey Demakov <ademakov@gmail.com> wrote:
> On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov <ademakov@gmail.com> wrote:
>>>> I expect that to be useful for parallel query and anything else where
>>>> processes need to share variable-size data.  However, that's different
>>>> from this because ours can grow to arbitrary size and shrink again by
>>>> allocating and freeing with DSM segments.  We also do everything with
>>>> relative pointers since DSM segments can be mapped at different
>>>> addresses in different processes, whereas this would only work with
>>>> memory carved out of the main shared memory segment (or some new DSM
>>>> facility that guaranteed identical placement in every address space).
>>>>
>>>
>>> I believe it would be perfectly okay to allocate a huge amount of address
>>> space with mmap on startup.  If the pages are not touched, the OS VM
>>> subsystem will not commit them.
>>
>> In my opinion, that's not going to fly.  If I thought otherwise, I
>> would not have developed the DSM facility in the first place.
>>
>> First, the behavior in this area is highly dependent on choice of
>> operating system and configuration parameters.  We've had plenty of
>> experience with requiring non-default configuration parameters to run
>> PostgreSQL, and it's all bad.  I don't really want to have to tell
>> users that they must run with a particular value of
>> vm.overcommit_memory in order to run the server.  Nor do I want to
>> tell users of other operating systems that their ability to run
>> PostgreSQL is dependent on the behavior their OS has in this area.  I
>> had a MacBook Pro up until a year or two ago where a sufficiently
>> large shared memory request would cause a kernel panic.  That bug will
>> probably be fixed at some point if it hasn't been already, but
>> probably by returning an error rather than making it work.
>>
>> Second, there's no way to give memory back once you've touched it.  If
>> you decide to do a hash join on a 250GB inner table using a shared
>> hash table, you're going to have 250GB in swap-backed pages floating
>> around when you're done.  If the user has swap configured (and more
>> and more people don't), the operating system will eventually page
>> those out, but until that happens those pages are reducing the amount
>> of page cache that's available, and after it happens they're using up
>> swap.  In either case, the space consumed is consumed to no purpose.
>> You don't care about that hash table any more once the query
>> completes; there's just no way to tell the operating system that.  If
>> your workload follows an entirely predictable pattern and you always
>> have about the same amount of usage of this facility then you can just
>> reuse the same pages and everything is fine.  But if your usage
>> fluctuates I believe it will be a big problem.  With DSM, we can and
>> do explicitly free the memory back to the OS as soon as we don't need
>> it any more - and that's a big benefit.
>>
>
> Essentially this is pessimizing for the lowest common denominator
> among OSes. Having a contiguous address space makes things so
> much simpler that considering this case, IMHO, is well worth it.
>
> You are right that this might highly depend on the OS. But you are
> only partially right that it's impossible to give the memory back once
> you've touched it. It is possible in many cases with additional
> measures, that is, with additional control over memory mapping.
> Surprisingly, in this case Windows has the most straightforward
> solution: VirtualAlloc has separate MEM_RESERVE and MEM_COMMIT
> flags. On various Unix flavours it is possible to play with the mmap
> MAP_NORESERVE flag and the madvise syscall. Finally, it's possible to
> repeatedly mmap and munmap portions of a contiguous address space,
> providing a given addr argument for both of them. The last option is,
> of course, susceptible to hijacking of this portion of the address
> space by an inadvertent caller of mmap with a NULL addr argument. But
> probably this could be avoided by imposing a disciplined use of mmap
> in PostgreSQL core and extensions.
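
To make the Unix option above concrete, here is a minimal sketch of the reserve / commit / release cycle. It assumes Linux semantics for MAP_NORESERVE and MADV_DONTNEED, and it uses a private anonymous mapping only to keep the example small. The helper names reserve_region, commit_range and release_range are made up for illustration and are not PostgreSQL functions. A mapping shared between backends would need a different release step (on Linux, MADV_REMOVE is one candidate), so this shows the idea rather than a working design.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve a range of address space without asking the kernel to back it.
 * PROT_NONE means nothing can be touched until commit_range() is called.
 * (Hypothetical helper; Linux behaviour assumed.)
 */
static void *
reserve_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Make part of the reserved range usable. Physical pages are still
 * allocated lazily, on first touch. addr and len must be page-aligned.
 */
static int
commit_range(void *addr, size_t len)
{
    return mprotect(addr, len, PROT_READ | PROT_WRITE);
}

/*
 * Hand the pages back to the OS but keep the address range reserved.
 * On Linux, MADV_DONTNEED drops the pages of a private anonymous
 * mapping; a later touch (after a new commit) sees zero-filled pages.
 */
static int
release_range(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_DONTNEED) != 0)
        return -1;
    return mprotect(addr, len, PROT_NONE);
}
```

The address range stays owned by the process throughout, so nothing else can land in it between a release and the next commit. That is the difference from the mmap/munmap variant, where the hole left by munmap can be taken by any mmap call with a NULL addr.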
>
> Thus providing a single contiguous shared address space is doable.
> The other question is how much it would buy. As for development
> time of an allocator, it is a clear win. In terms of easily passing
> direct memory pointers between backends, this is a clear win again.
>
> In terms of resulting performance, I don't know. Relative pointers
> would cost a few cycles on every step. Say you have a shared hash
> table. You cannot keep raw pointers there; you need to store offsets
> against the base address, so any reference involves additional
> arithmetic. When these things add up, the net effect might become
> noticeable.
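
A small sketch of what "offsets against the base address" means in practice. The rel_ptr type and the helpers to_rel, from_rel, chain_sum and demo_relocated_sum are hypothetical names chosen for this example; they are not the relative-pointer code used with DSM. Offset 0 is reserved to mean NULL here, which is an arbitrary choice for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative pointer: byte offset from the region base; 0 = NULL. */
typedef uint64_t rel_ptr;

static rel_ptr
to_rel(const void *base, const void *p)
{
    return p ? (rel_ptr) ((const char *) p - (const char *) base) : 0;
}

static void *
from_rel(void *base, rel_ptr off)
{
    return off ? (char *) base + off : NULL;
}

/* A hash-bucket style chain entry that links by offset, not by address. */
typedef struct node
{
    int     key;
    rel_ptr next;
} node;

/* Walk a chain. Every hop pays one extra addition to rebuild the address. */
static int
chain_sum(void *base, rel_ptr head)
{
    int sum = 0;

    for (node *n = from_rel(base, head); n != NULL; n = from_rel(base, n->next))
        sum += n->key;
    return sum;
}

/*
 * Build a chain in one buffer, copy the bytes to a buffer at a different
 * address (standing in for a second backend's mapping), and walk the copy.
 */
static int
demo_relocated_sum(void)
{
    enum { REGION_SIZE = 256 };
    char   *a = malloc(REGION_SIZE);
    char   *b = malloc(REGION_SIZE);
    int     sum;

    if (a == NULL || b == NULL)
    {
        free(a);
        free(b);
        return -1;
    }
    memset(a, 0, REGION_SIZE);

    /* The first 16 bytes stay unused so that offset 0 can mean NULL. */
    node *n1 = (node *) (a + 16);
    node *n2 = (node *) (a + 32);
    node *n3 = (node *) (a + 48);

    n1->key = 10; n1->next = to_rel(a, n2);
    n2->key = 20; n2->next = to_rel(a, n3);
    n3->key = 30; n3->next = 0;
    rel_ptr head = to_rel(a, n1);

    memcpy(b, a, REGION_SIZE);
    sum = chain_sum(b, head);   /* same offsets, different base address */

    free(a);
    free(b);
    return sum;
}
```

With a region mapped at the same address in every backend, from_rel would disappear and next could be a plain node pointer. Whether the saved addition per dereference is measurable is exactly the open question.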
>
> Regards,
> Aleksey


