Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
Msg-id CAGTBQpZ2tYj9XkZS8DeYQRX2BS-fRCNS9JvVWmCZquBByC+yqA@mail.gmail.com
In response to Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
List pgsql-hackers
On Fri, Jan 10, 2014 at 3:23 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jan 9, 2014 at 12:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> It would be nice to have better operating system support for this.
>>> For example, IIUC, 64-bit Linux has 128TB of address space available
>>> for user processes.  When you clone(), it can either share the entire
>>> address space (i.e. it's a thread) or none of it (i.e. it's a
>>> process).  There's no option to, say, share 64TB and not the other
>>> 64TB, which would be ideal for us.  We could then map dynamic shared
>>> memory segments into the shared portion of the address space and do
>>> backend-private allocations in the unshared part.  Of course, even if
>>> we had that, it wouldn't be portable, so who knows how much good it
>>> would do.  But it would be awfully nice to have the option.
>>
>> You can map a segment at fork time, and unmap it after forking. That
>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>> but I don't think that's portable).
>>
>> That guarantees it's free.
>
> It guarantees that it is free as of the moment you unmap it, but it
> doesn't guarantee that future memory allocations or shared library
> loads couldn't stomp on the space.

You would unmap only immediately before remapping, and only the
portion that is about to be mapped, so I don't see a problem.
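
To make the reserve-then-remap idea concrete, here is a minimal sketch,
assuming Linux mmap semantics (MAP_NORESERVE and MAP_ANONYMOUS are not
guaranteed everywhere). The names reserve_region, map_segment_at and
RESERVED_SIZE are hypothetical and chosen only for illustration; this is
not code from IMCS or PostgreSQL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustrative size only: 1 GB of address space, not of RAM. */
#define RESERVED_SIZE ((size_t) 1 << 30)

/*
 * Reserve a range of addresses without committing memory.  Done before
 * fork(), every child inherits the same placeholder at the same address.
 */
static void *reserve_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Map a shared segment over part of the reservation.  fd is a shared
 * memory object (e.g. from shm_open); pass -1 for an anonymous mapping.
 * MAP_FIXED replaces whatever was mapped in [addr, addr + len), so only
 * the to-be-mapped portion of the placeholder goes away.
 */
static void *map_segment_at(void *base, size_t offset, size_t len, int fd)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    int flags = MAP_SHARED | MAP_FIXED;
    void *p;

    assert(offset % page == 0);     /* MAP_FIXED needs a page-aligned address */
    if (fd < 0)
        flags |= MAP_ANONYMOUS;
    p = mmap((char *) base + offset, len, PROT_READ | PROT_WRITE,
             flags, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Because MAP_FIXED swaps the mapping in a single call, the sketch never
leaves the target range unmapped, so nothing else can be placed there
between the unmap and the remap. Whether other platforms offer an
equivalent is a separate question.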

> Also, that not-portable thing is a bit of a problem.  I've got no
> problem with the idea that third-party code may be platform-specific,
> but I think the stuff we ship in core has got to work on more or less
> all reasonably modern systems.
>
>> Next, you can map shared memory at explicit addresses (linux's mmap
>> has support for that, and I seem to recall Windows did too).
>>
>> All you have to do, is some book-keeping in shared memory (so all
>> processes can coordinate new mappings).
>
> I did something like this back in 1998 or 1999 at the operating system
> level, and it turned out not to work very well.  I was working on an
> experimental research operating system kernel, and we wanted to add
> support for mmap(), so we set aside a portion of the virtual address
> space for file mappings.  That region was shared across all processes
> in the system.  One problem is that there's no guarantee the space is
> big enough for whatever you want to map; and the other problem is that
> it can easily get fragmented.  Now, 64-bit address spaces go some way
> to ameliorating these concerns so maybe it can be made to work, but I
> would be a teeny bit cautious about using the word "just" to describe
> the complexity involved.

OK, yes, fragmentation could be an issue if the address range is not
"humongous enough".
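
As an illustration of the simplest possible book-keeping, the sketch
below hands out page-aligned offsets within the reserved range and never
reuses them. That sidesteps fragmentation entirely, but only works if
the range is large enough to outlast the workload. The region_book type
and its functions are hypothetical names, and the sketch assumes the
struct lives in memory shared by all processes and that size_t atomics
are lock-free on the platform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ALLOC_FAILED SIZE_MAX

typedef struct {
    size_t region_size;          /* total bytes of reserved address space */
    size_t page_size;            /* must be a power of two */
    _Atomic size_t next_offset;  /* first offset not yet handed out */
} region_book;

static void region_book_init(region_book *b, size_t region_size,
                             size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    b->region_size = region_size;
    b->page_size = page_size;
    atomic_init(&b->next_offset, 0);
}

/*
 * Claim len bytes (rounded up to whole pages) and return the offset of
 * the claimed range, or ALLOC_FAILED if the region is exhausted.
 */
static size_t region_book_alloc(region_book *b, size_t len)
{
    size_t rounded, cur;

    if (len == 0 || len > b->region_size)
        return ALLOC_FAILED;
    rounded = (len + b->page_size - 1) & ~(b->page_size - 1);
    cur = atomic_load(&b->next_offset);
    for (;;) {
        if (rounded > b->region_size - cur)
            return ALLOC_FAILED;
        /* On failure, cur is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&b->next_offset, &cur,
                                         cur + rounded))
            return cur;
    }
}
```

A process that wins an offset would publish it alongside the segment's
identity so the other processes can map the same segment at the same
address. Reusing freed ranges would need a real free-list, which is
where the fragmentation concern comes back.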


