Re: Linux max on shared buffers? - Mailing list pgsql-general

From Curt Sampson
Subject Re: Linux max on shared buffers?
Date
Msg-id Pine.NEB.4.44.0207290141350.28234-100000@angelic.cynic.net
In response to Re: Linux max on shared buffers?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Sun, 28 Jul 2002, Tom Lane wrote:

> Hm.  What's the particular syscall being used for reference here?

It's a one-byte write() to /dev/null.
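For anyone who wants to reproduce the reference operation, here is a minimal sketch of timing a one-byte write() to /dev/null. It is not lmbench's code; the function name `time_null_write` and the iteration count are made up for illustration, and in Python the interpreter overhead will dominate, so the absolute figure is not comparable with the lmbench numbers.

```python
import os
import time


def time_null_write(iters=100_000):
    """Return the average cost in microseconds of a one-byte write() to the null device.

    Hypothetical helper, not part of lmbench. The loop overhead of the
    interpreter is included in the result.
    """
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        t0 = time.perf_counter()
        for _ in range(iters):
            os.write(fd, b"x")  # one write() syscall per iteration
        elapsed = time.perf_counter() - t0
    finally:
        os.close(fd)
    return elapsed / iters * 1e6


if __name__ == "__main__":
    print(f"one-byte write to {os.devnull}: {time_null_write():.3f} us per call")
```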

> And how does it compare to the sorts of activities we'd actually be
> concerned about (open, close, mmap)?

Well, I don't see that open and close are relevant, since that part of
the file handling would be exactly the same if you continued to use the
same file handle caching code we use now.

lmbench does have a test for mmap latency which tells you how long it
takes, on average, to mmap the first given number of bytes of a file.
Unfortunately, it's not giving me output for anything smaller than about
half a megabyte (perhaps because it's too fast to measure accurately?),
but here are the times, in microseconds, for sizes from that to 1 GB on
my 1533 MHz Athlon:

    Size (MB)      Time (us)
    0.524288       7.688
    1.048576       15
    2.097152       22
    4.194304       40
    16.777216      169
    33.554432      358
    67.108864      740
    134.217728     2245
    268.435456     5080
    536.870912     9971
    805.306368     14927
    1073.741824    19898

It seems roughly linear, so I'm guessing that an 8k mmap would be
around 0.1-0.2 microseconds, or ten times the cost of a syscall.
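The extrapolation can be checked with a few lines of arithmetic. This is only a sketch: it assumes the cost really does scale linearly down to 8k, which the data above cannot confirm, and it takes the sizes to be in units of 10^6 bytes. The names `SAMPLES` and `extrapolate` are made up for illustration.

```python
# (size in MB, i.e. 10**6 bytes; mmap latency in microseconds), copied from the table above.
SAMPLES = [
    (0.524288, 7.688),
    (1.048576, 15),
    (2.097152, 22),
    (4.194304, 40),
    (16.777216, 169),
    (33.554432, 358),
    (67.108864, 740),
    (134.217728, 2245),
    (268.435456, 5080),
    (536.870912, 9971),
    (805.306368, 14927),
    (1073.741824, 19898),
]


def extrapolate(size_bytes, samples=SAMPLES):
    """Scale the cheapest and the most expensive per-MB rate down to size_bytes.

    Assumes strict linearity through the origin, which is a guess.
    """
    mb = size_bytes / 1e6
    rates = [us / size for size, us in samples]  # microseconds per MB, one per row
    return min(rates) * mb, max(rates) * mb


if __name__ == "__main__":
    low, high = extrapolate(8192)
    print(f"8k mmap estimate: {low:.3f} to {high:.3f} microseconds")
```

Depending on which row sets the rate, the estimate comes out at roughly 0.08 to 0.16 microseconds, in line with the guess above.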

Really, I need to write a better benchmark for this. I'm a bit busy
this week, but I'll try to find time to do that.

Keep in mind, though, that mmap is generally quite heavily optimized,
because it's so heavily used. Almost all programs in the system are
dynamically linked (on some systems, such as Linux and Solaris, they
essentially all are), and thus they all use mmap to map in their
libraries.

> I'm not convinced that futzing with a process' memory mapping tables
> is free, however ... especially not if you're creating a large number
> of separate small mappings.

It's not free, no. But then again, memory copies are really, really
expensive.

In NetBSD, at least, you probably don't want to keep a huge number of
mappings around because they're stored as a linked list (ordered by
address) that's searched linearly when you need to add or delete a
mapping (though there's a hint for the most recent entry).
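To make the cost concrete, here is a toy model of an address-ordered list with a hint. It is emphatically not NetBSD's code: the class names, the `steps` counter, and the exact hint policy are all assumptions for illustration, and it does not check for overlapping ranges or implement deletion.

```python
class Entry:
    """One mapping covering the half-open address range [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.next = None


class MapList:
    """Toy address-ordered singly linked list of mappings (hypothetical model)."""

    def __init__(self):
        self.head = None
        self.hint = None  # most recently inserted entry
        self.steps = 0    # entries walked so far, to make the linear cost visible

    def insert(self, start, end):
        new = Entry(start, end)
        # Start the search at the hint when it sits at or below the new address,
        # otherwise fall back to a walk from the head of the list.
        prev = self.hint if self.hint is not None and self.hint.start <= start else None
        node = prev.next if prev is not None else self.head
        while node is not None and node.start < start:
            prev, node = node, node.next
            self.steps += 1
        new.next = node
        if prev is None:
            self.head = new
        else:
            prev.next = new
        self.hint = new
        return new

    def starts(self):
        """Return the start addresses in list order."""
        out, node = [], self.head
        while node is not None:
            out.append(node.start)
            node = node.next
        return out
```

In this model, mappings added in ascending address order always hit the hint and cost no walk at all, while a mapping added in the middle of N existing entries walks about N/2 of them, which is why a large number of small mappings could hurt.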

> If mmap provokes a TLB flush for your process, it's going to be
> expensive (just how expensive will be hard to measure, too, since most
> of the cycles will be expended after returning from mmap).

True enough, though blowing out your cache with copies is also not
cheap. But measuring this should not be hard; writing a little
program to do a bunch of copies versus a bunch of mmaps of random blocks
from a file should only be a couple of hours' work. I'll work on this in my
spare time and report the results.
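A rough sketch of what such a benchmark might look like follows. It is not the promised program, and its results are not reported here. The names `make_file` and `bench`, the block size, and the use of pread() for the "copy" side are all assumptions. It touches only the first byte of each block, so it understates the work a real buffer manager would do, and with a small file everything stays in the OS cache. Interpreter overhead is included in both timings.

```python
import mmap
import os
import random
import tempfile
import time

# 8k blocks where the platform allows it; mmap offsets must be a multiple of
# the allocation granularity, so fall back to that when it is larger.
BLOCK = max(8192, mmap.ALLOCATIONGRANULARITY)


def make_file(nblocks):
    """Create a temporary file of nblocks random blocks and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(BLOCK * nblocks))
    return path


def bench(path, nblocks, iters, seed=0):
    """Return (seconds per copy, seconds per mmap, whether both saw the same data)."""
    rng = random.Random(seed)
    offsets = [rng.randrange(nblocks) * BLOCK for _ in range(iters)]
    fd = os.open(path, os.O_RDONLY)
    try:
        t0 = time.perf_counter()
        copy_sum = 0
        for off in offsets:
            buf = os.pread(fd, BLOCK, off)  # copies the block into user space
            copy_sum += buf[0]
        t1 = time.perf_counter()
        map_sum = 0
        for off in offsets:
            # Map one block, touch its first byte to force the fault, unmap.
            with mmap.mmap(fd, BLOCK, access=mmap.ACCESS_READ, offset=off) as m:
                map_sum += m[0]
        t2 = time.perf_counter()
    finally:
        os.close(fd)
    return (t1 - t0) / iters, (t2 - t1) / iters, copy_sum == map_sum


if __name__ == "__main__":
    path = make_file(1024)
    try:
        copy_t, map_t, same = bench(path, 1024, 20000)
        print(f"copy: {copy_t * 1e6:.2f} us/block  mmap: {map_t * 1e6:.2f} us/block  same data: {same}")
    finally:
        os.remove(path)
```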

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

