Re: [HACKERS] Memory mapping (Was: Safe/Fast I/O ...) - Mailing list pgsql-hackers

From ocie@paracel.com
Subject Re: [HACKERS] Memory mapping (Was: Safe/Fast I/O ...)
Msg-id 9804150143.AA26661@dolomite.paracel.com
In response to Memory mapping (Was: Safe/Fast I/O ...)  (Michal Mosiewicz <mimo@interdata.com.pl>)
List pgsql-hackers
Michal Mosiewicz wrote:
>
> While having some spare two hours I was just looking at the current code
> of postgres. I was trying to estimate how would it fit to the current
> postgres guts.
>
> Finally I've found more proofs that memory mapping would do a lot to
> current performance, but I must admit that current storage manager is
> pretty read/write oriented. It would be easier to integrate memory
> mapping into buffer manager. Actually buffer manager role is to map some
> parts of files into memory buffers. However it takes a lot to get
> through several layers (smgr and finally md).
>
> I noticed that one of the very important features of mmapping is that you
> can sync the buffer (even some part of it), not the whole file. So if
> there would be some kind of page level locking, it would be absolutely
> necessary to make sure that only committed pages are synced and we don't
> overload the IO with unfinished things.
>
> Also, I think that there is no need to create buffers in shared memory.
> I have just tested that if you map files with MAP_SHARED attribute set,
> then each process is working on exactly the same copy of memory.

This means that the processes can share the memory, but these pages
must be explicitly mapped in the other process before it can get to
them and must be explicitly unmapped from all processes before the
memory is freed up.
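
A minimal sketch of that point, assuming a POSIX system: the child and
the parent each call mmap() on the same file themselves, and a write
made through one mapping shows up through the other. The function name
shared_write_visible and the scratch-file path are made up for
illustration; nothing here is taken from the postgres source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a byte written through the child's own
 * MAP_SHARED mapping is visible through a mapping the parent creates
 * separately, 0 if it is not, and -1 on error. The scratch file at `path`
 * is created, truncated to one page, and removed again. */
int shared_write_visible(const char *path)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)page) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Child: map the page explicitly, write to it, unmap it. */
        char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            _exit(1);
        p[0] = 'X';
        munmap(p, (size_t)page);
        _exit(0);
    }

    int status = 0;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* Parent: it has no mapping yet, so it must create its own. */
        char *q = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
        if (q != MAP_FAILED) {
            seen = (q[0] == 'X');
            munmap(q, (size_t)page);
        }
    }

    close(fd);
    unlink(path);
    return seen;
}
```

Both sides call munmap() on their own mapping; neither call affects the
mapping held by the other process.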

It seems like there are basically two ways we could use this.

1) mmap in all files that might be used and just access them directly.

2) mmap in pages from files as they are needed and munmap the pages
out when they are no longer needed.

#1 seems easier, but it does limit us to 2 GB databases on 32-bit
machines, since everything mapped has to fit in the process's address
space at once.
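
Option #2 might look roughly like the sketch below, assuming POSIX
mmap() and using the system page size as the unit, because the file
offset passed to mmap() has to be a multiple of it. The names
open_scratch_file, map_file_page and unmap_file_page are hypothetical,
and a real buffer manager would use its own block size and bookkeeping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) a scratch file that is
 * n_pages system pages long. Returns its descriptor, or -1 on error. */
int open_scratch_file(const char *path, long n_pages)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)(n_pages * page)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map only page number page_no of the file, shared with any other
 * process that maps the same page. Returns NULL on failure. */
void *map_file_page(int fd, off_t page_no)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, page_no * (off_t)page);
    return p == MAP_FAILED ? NULL : p;
}

/* Drop a mapping created by map_file_page. Returns 0 on success. */
int unmap_file_page(void *p)
{
    return munmap(p, (size_t)sysconf(_SC_PAGESIZE));
}
```

Only the pages currently mapped take up address space this way, so the
file itself can be larger than what a 32-bit process could map in one
piece.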

#2 could be done by having a sort of mmap helper.  As soon as process
A knows that it will need (might need?) a given page from a given
file, it communicates this to another process B, which attempts to
create a shared mmap for that page.  When process A actually needs to
use the page, it uses the real mmap, which should be fast if process B
has already mapped this page into memory.

Other processes could make use of this mapping (following proper
locking etiquette), each making their request to B, which simply
increments a counter on that mapping for each request after the first
one.  When a process is done with one of these mappings, it unmaps the
page itself, and then tells B that it is done with the page.  When B
sees that the count on this page has gone to zero, it can either
remove its own map, or retain it in some sort of cache in case it is
requested again in the near future.  Either way, once B decides the
page is no longer needed, it unmaps the page itself.
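
The counting that B does could be sketched as follows. This is only the
bookkeeping, with no mmap calls and no communication between A and B,
and it assumes a small fixed-size table. All names (ref_table,
ref_acquire, ref_release, MAX_MAPPINGS) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAPPINGS 64   /* arbitrary table size for the sketch */

/* One entry per (file, page) that some process has asked B to map. */
struct mapping_ref {
    int  in_use;
    int  file_id;
    long page_no;
    int  count;
};

struct ref_table {
    struct mapping_ref slots[MAX_MAPPINGS];
};

static struct mapping_ref *ref_find(struct ref_table *t, int file_id,
                                    long page_no)
{
    for (size_t i = 0; i < MAX_MAPPINGS; i++) {
        struct mapping_ref *r = &t->slots[i];
        if (r->in_use && r->file_id == file_id && r->page_no == page_no)
            return r;
    }
    return NULL;
}

/* A process requests a page. Returns the new count: 1 means this is the
 * first request and B has to create the mapping; anything higher means
 * the mapping already exists. Returns -1 if the table is full. */
int ref_acquire(struct ref_table *t, int file_id, long page_no)
{
    struct mapping_ref *r = ref_find(t, file_id, page_no);
    if (r != NULL)
        return ++r->count;

    for (size_t i = 0; i < MAX_MAPPINGS; i++) {
        r = &t->slots[i];
        if (!r->in_use) {
            r->in_use = 1;
            r->file_id = file_id;
            r->page_no = page_no;
            r->count = 1;
            return 1;
        }
    }
    return -1;
}

/* A process is done with a page. Returns the remaining count: 0 means B
 * may unmap the page now or keep it cached. Returns -1 if the page was
 * never acquired. */
int ref_release(struct ref_table *t, int file_id, long page_no)
{
    struct mapping_ref *r = ref_find(t, file_id, page_no);
    if (r == NULL)
        return -1;

    assert(r->count > 0);
    if (--r->count == 0)
        r->in_use = 0;   /* slot is free for reuse */
    return r->count;
}
```

In this version the slot is freed as soon as the count reaches zero; a
caching B would instead keep the entry around and evict it later.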

This mapping might get synced by the OS at unknown intervals, but
processes can sync the pages themselves, say at the end of a
transaction.
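
Syncing a page explicitly could be done with msync(), as in this rough
sketch. It assumes POSIX and a one-page scratch file; the names
sync_mapped_page and write_and_commit are hypothetical, and the
"transaction" is nothing more than a single byte being written.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the OS to write one mapped page back to its file and wait until
 * that has completed (MS_SYNC). Returns 0 on success, -1 on error. */
int sync_mapped_page(void *page_addr, size_t page_len)
{
    return msync(page_addr, page_len, MS_SYNC);
}

/* Hypothetical end-of-transaction step: change a page through the
 * mapping, sync just that page, then read the byte back with read().
 * Returns the byte read from the file, or -1 on error. */
int write_and_commit(const char *path, char value)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int result = -1;
    if (ftruncate(fd, (off_t)page) == 0) {
        char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = value;                        /* the "transaction" */
            int rc = sync_mapped_page(p, (size_t)page);
            munmap(p, (size_t)page);

            char back = 0;
            if (rc == 0 && lseek(fd, 0, SEEK_SET) == 0 &&
                read(fd, &back, 1) == 1)
                result = (unsigned char)back;
        }
    }

    close(fd);
    unlink(path);
    return result;
}
```

The address passed to msync() has to be page-aligned, which holds here
because it comes straight from mmap().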

Ocie
