Thread: can shared cache be swapped to disk?

can shared cache be swapped to disk?

From

flyusa2010 fly

Date:

19 December 2010, 02:11:35

hi, folks!

I see that shared cache is implemented by system v shared memory. I wonder whether data in this area can be swapped out to disk.

Isn't it bad that we read data from disk, put data in shared cache, and finally data in shared cache is swapped to disk again!

Why not use shmctl(..SHM_LOCK..) to pin data in main memory?

Thanks!

Re: can shared cache be swapped to disk?

From

Jeff Janes

Date:

19 December 2010, 03:59:41

On Sat, Dec 18, 2010 at 10:11 PM, flyusa2010 fly <flyusa2010@gmail.com> wrote:
> hi, folks!
> I see that shared cache is implemented by system v shared memory. I wonder
> whether data in this area can be swapped out to disk.
> Isn't it bad that we read data from disk, put data in shared cache, and
> finally data in shared cache is swapped to disk again!
> Why not use shmctl(..SHM_LOCK..) to pin data in main memory?
> Thanks!

I've tried that on a recent linux kernel, to see if it would allow
shared_buffers to usefully be a large fraction of total memory.  It
didn't help.  So either swapping wasn't the problem in the first
place, or the kernel ignores the order.
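
For anyone who wants to repeat the check, here is a minimal sketch of asking the kernel to lock a System V segment. It is not the patch used above and not PostgreSQL code. It assumes Linux (the numeric constants are the Linux values and differ elsewhere), calls libc through ctypes, and uses a hypothetical helper name, try_lock_segment. It only reports whether shmctl(SHM_LOCK) was accepted; it does not prove that pages stay resident.

```python
import ctypes
import ctypes.util
import os

# Linux values from <sys/ipc.h> and <sys/shm.h>; assumed, other platforms differ.
IPC_PRIVATE = 0
IPC_CREAT = 0o1000
IPC_RMID = 0
SHM_LOCK = 11
SHM_UNLOCK = 12


def try_lock_segment(size_bytes):
    """Create a private System V segment, try to SHM_LOCK it, then remove it.

    Returns (locked, message). Hypothetical helper for illustration only.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
    libc.shmget.restype = ctypes.c_int
    libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
    libc.shmctl.restype = ctypes.c_int

    shmid = libc.shmget(IPC_PRIVATE, size_bytes, IPC_CREAT | 0o600)
    if shmid < 0:
        return False, "shmget failed: " + os.strerror(ctypes.get_errno())
    try:
        if libc.shmctl(shmid, SHM_LOCK, None) != 0:
            return False, "SHM_LOCK refused: " + os.strerror(ctypes.get_errno())
        # Undo the lock before the segment is removed.
        libc.shmctl(shmid, SHM_UNLOCK, None)
        return True, "SHM_LOCK accepted"
    finally:
        # Always mark the segment for removal so nothing is leaked.
        libc.shmctl(shmid, IPC_RMID, None)


if __name__ == "__main__":
    print(try_lock_segment(1024 * 1024))
```

A refusal here usually points at permissions or limits rather than at swapping behaviour, so it is worth checking before drawing conclusions from a benchmark.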

Cheers,

Jeff

Re: can shared cache be swapped to disk?

From

Martijn van Oosterhout

Date:

19 December 2010, 10:14:37

On Sat, Dec 18, 2010 at 11:59:33PM -0800, Jeff Janes wrote:
> On Sat, Dec 18, 2010 at 10:11 PM, flyusa2010 fly <flyusa2010@gmail.com> wrote:
> > hi, folks!
> > I see that shared cache is implemented by system v shared memory. I wonder
> > whether data in this area can be swapped out to disk.
> > Isn't it bad that we read data from disk, put data in shared cache, and
> > finally data in shared cache is swapped to disk again!
> > Why not use shmctl(..SHM_LOCK..) to pin data in main memory?
> > Thanks!
>
> I've tried that on a recent linux kernel, to see if it would allow
> shared_buffers to usefully be a large fraction of total memory.  It
> didn't help.  So either swapping wasn't the problem in the first
> place, or the kernel ignores the order.

Correct. The kernel ignores locking requests because it's a great way
to DOS a machine. For example, mlock() of large blocks of memory is
also not permitted for similar reasons.

The way you make sure shared memory doesn't get swapped out is to make
sure it gets used. (i.e. don't give 2GB shared memory when your
database is 100MB). And don't make your shared memory so large that
you're creating significant memory pressure, otherwise the kernel might
choose to swap out your shared memory rather than, say, the webserver.

Your shared memory should be reasonably sized, but you should make sure
the kernel has enough "cache" memory it can throw away first.
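
A rough way to look at how much throw-away cache the kernel has is /proc/meminfo. The sketch below is an illustration under assumptions, not a tuning rule: the helper names are hypothetical, and it assumes that "Cached" includes shared-memory pages, which are subtracted using the "Shmem" field where the kernel reports one (older kernels do not).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def droppable_cache_kb(fields):
    """Estimate page cache the kernel could discard, in kB.

    Assumption: Cached counts shared memory too, so Shmem is subtracted
    when present. Treat the result as an approximation only.
    """
    return (fields.get("Buffers", 0)
            + fields.get("Cached", 0)
            - fields.get("Shmem", 0))


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(droppable_cache_kb(parse_meminfo(f.read())), "kB")
```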

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: can shared cache be swapped to disk?

From

Jeff Janes

Date:

04 January 2011, 13:51:14

On Sun, Dec 19, 2010 at 6:14 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Sat, Dec 18, 2010 at 11:59:33PM -0800, Jeff Janes wrote:
>> On Sat, Dec 18, 2010 at 10:11 PM, flyusa2010 fly <flyusa2010@gmail.com> wrote:
>> > hi, folks!
>> > I see that shared cache is implemented by system v shared memory. I wonder
>> > whether data in this area can be swapped out to disk.
>> > Isn't it bad that we read data from disk, put data in shared cache, and
>> > finally data in shared cache is swapped to disk again!
>> > Why not use shmctl(..SHM_LOCK..) to pin data in main memory?
>> > Thanks!
>>
>> I've tried that on a recent linux kernel, to see if it would allow
>> shared_buffers to usefully be a large fraction of total memory.  It
>> didn't help.  So either swapping wasn't the problem in the first
>> place, or the kernel ignores the order.
>
> Correct. The kernel ignores locking requests because it's a great way
> to DOS a machine. For example, mlock() of large blocks of memory is
> also not permitted for similar reasons.

Does it ignore such requests in general, or only under certain situations?

If the latter, do you know what those situations are?

If the former, that seems incredibly bogus.  There are plenty of ways
to DOS a machine.  The main way you prevent DOS by your own authorized
users (other than firing them) on linux is by "setrlimit", not by
claiming to implement a feature you haven't actually implemented, or
by implementing a feature but rendering it completely useless for the
purpose it was intended for.

RLIMIT_MEMLOCK exists, it has a small default hard limit, and only
root can increase that.  If root has gone out of its way to grant the
postgres user a higher limit, the kernel should respect that, at least
up until the situation becomes truly desperate.
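
As a sketch of how to see the limit in question, Python's standard resource module exposes RLIMIT_MEMLOCK. The helper name memlock_allows is hypothetical, and the check is deliberately simplified: it looks only at the soft limit of the current process and ignores privileged processes, which may be exempt.

```python
import resource


def memlock_allows(nbytes):
    """Return True if nbytes fits under this process's soft RLIMIT_MEMLOCK.

    Simplification: privileged processes may bypass the limit, and that
    is not modelled here.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft == resource.RLIM_INFINITY:
        return True
    return nbytes <= soft


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("soft:", soft, "hard:", hard)
    print("1200MB lockable:", memlock_allows(1200 * 1024 * 1024))
```

Run it as the postgres user to see what limit that account actually has.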

However, I don't have any evidence it is being ignored.  I just know
that locking the shared memory did not improve things, but I didn't
verify that shared memory getting swapped out was the problem in the
first place.

> The way you make sure shared memory doesn't get swapped out is to make
> sure it gets used. (i.e. don't give 2GB shared memory when your
> database is 100MB). And don't make your shared memory so large that
> you're creating significant memory pressure, otherwise the kernel might
> choose to swap out your shared memory rather than, say, the webserver.
>
> Your shared memory should be reasonably sized, but you should make sure
> the kernel has enough "cache" memory it can throw away first.

Unfortunately it is hard to know what the kernel considers to be
significant memory pressure.

My experience (from mostly non-pgsql work) is that the kernel has what I
would consider enough cache memory to throw away, but for some reason
doesn't throw it away but does more counterproductive things instead.

Cheers,

Jeff

Re: can shared cache be swapped to disk?

From

Martijn van Oosterhout

Date:

04 January 2011, 18:53:07

On Tue, Jan 04, 2011 at 09:51:05AM -0800, Jeff Janes wrote:
> > Correct. The kernel ignores locking requests because it's a great way
> > to DOS a machine. For example, mlock() of large blocks of memory is
> > also not permitted for similar reasons.
>
> Does it ignore such requests in general, or only under certain situations?
>
> If the latter, do you know what those situations are?

Well, not in general, but for shared memory it's ignored (not sure
what happens if you're root). It used to be that shared memory was always
locked, which sounds like a great idea, until people started abusing it.

So now shared memory is on the same footing as other memory. Not sure
where I read this, I know it came up several years ago. I think it
changed back in 2.0 times.

> RLIMIT_MEMLOCK exists, it has a small default hard limit, and only
> root can increase that.  If root has gone out of its way to grant the
> postgres user a higher limit, the kernel should respect that, at least
> up until the situation becomes truly desperate.

Like I said, not sure about how it works for root.

> Unfortunately it is hard to know what the kernel considers to be
> significant memory pressure.
>
> My experience (from mostly non-pgsql work) is that the kernel has what I
> would consider enough cache memory to throw away, but for some reason
> doesn't throw it away but does more counterproductive things instead.

Possibly. Everyone always considers their memory to be more important
than all other memory on the system, but the kernel has a much better
idea of what's going on than the user. That doesn't mean it's without
fault or couldn't be improved.

But if there's a bunch of shared memory not being accessed very often
and the kernel thinks it's better used somewhere else, it may be right.
Repeatable test cases in this area are really hard.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: can shared cache be swapped to disk?

From

Jeff Janes

Date:

04 January 2011, 21:41:02

On Tue, Jan 4, 2011 at 2:52 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Tue, Jan 04, 2011 at 09:51:05AM -0800, Jeff Janes wrote:
>> > Correct. The kernel ignores locking requests because it's a great way
>> > to DOS a machine. For example, mlock() of large blocks of memory is
>> > also not permitted for similar reasons.
>>
>> Does it ignore such requests in general, or only under certain situations?
>>
>> If the latter, do you know what those situations are?
>
> Well, not in general, but for shared memory it's ignored (not sure
> what happens if you're root). It used to be that shared memory was always
> locked, which sounds like a great idea, until people started abusing it.
>
> So now shared memory is on the same footing as other memory. Not sure
> where I read this, I know it came up several years ago. I think it
> changed back in 2.0 times.
>
>> RLIMIT_MEMLOCK exists, it has a small default hard limit, and only
>> root can increase that.  If root has gone out of its way to grant the
>> postgres user a higher limit, the kernel should respect that, at least
>> up until the situation becomes truly desperate.
>
> Like I said, not sure about how it works for root.

I mean that root can increase it for *other* users.

I've done the experiment on kernel 2.6.31.5, as a non-root user, and
it looks like the kernel is respecting the SHM_LOCK.

On a 2GB machine I set shared_buffers to 1200MB and run pgbench -S
with a scale of 80, and run it until it seems to be fully cached.

("top" doesn't distinguish between memory that has been requested but
never accessed, versus memory that has been accessed and then truly
swapped out to disk.  So unless you first let it run to steady-state
before applying pressure, it is hard to interpret the results.)
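
One possible way to separate resident from swapped-out memory is to sum the per-mapping "Rss" and "Swap" lines in /proc/<pid>/smaps. This is a sketch with hypothetical helper names, not what was used for the experiment described here. Whether a "Swap" line exists, and whether swapped-out shared-memory pages are counted in it, depends on the kernel version, so treat the numbers with care.

```python
def summarize_smaps(text):
    """Sum the Rss and Swap fields (in kB) over all mappings in smaps text."""
    totals = {"Rss": 0, "Swap": 0}
    for line in text.splitlines():
        parts = line.split()
        # Data lines look like "Rss:   1234 kB"; mapping headers do not.
        if len(parts) == 3 and parts[2] == "kB":
            key = parts[0].rstrip(":")
            if key in totals:
                totals[key] += int(parts[1])
    return totals


def smaps_for_pid(pid):
    """Read and summarize /proc/<pid>/smaps (Linux only, needs permission)."""
    with open("/proc/%d/smaps" % pid) as f:
        return summarize_smaps(f.read())
```

Memory that was requested but never touched shows up in neither total, which is the distinction "top" does not make.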

Then I start up a Perl program that just perpetually loops through
~1.1 GB of memory.

If I SHM_LOCK postgres's memory, then only perl starts swapping.  If I
don't lock it, then both perl and postgres start swapping.
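
The memory-pressure side of this experiment can be approximated with a few lines. The sketch below is a Python stand-in, not the Perl program that was actually used, and its function names are hypothetical. It allocates a buffer and writes one byte per page on every pass, so the whole buffer keeps being referenced. The 4096-byte page size is an assumption.

```python
def touch_pages(buf, page_size=4096):
    """Write one byte in every page of buf; return the number of pages touched."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = (buf[offset] + 1) % 256
        touched += 1
    return touched


def apply_pressure(size_bytes, passes):
    """Allocate size_bytes and sweep over it `passes` times."""
    buf = bytearray(size_bytes)
    total = 0
    for _ in range(passes):
        total += touch_pages(buf)
    return total


if __name__ == "__main__":
    # Roughly 1.1 GB, as in the experiment; raise `passes` to keep it running.
    print(apply_pressure(1100 * 1024 * 1024, passes=10))
```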

Obviously there is a lot of territory not covered here, but it looks
like locking memory is respected in general.  It still doesn't let you
benefit from using shared_buffers that are a large portion of RAM
(other than in silly test cases), and I don't know why that is, but
I'm now pretty sure it isn't due to swapping out the shared memory.

Cheers,

Jeff

Re: can shared cache be swapped to disk?

From

Dimitri Fontaine

Date:

05 January 2011, 17:25:42

Jeff Janes <jeff.janes@gmail.com> writes:
> ("top" doesn't distinguish between memory that has been requested but
> never accessed, versus memory that has been accessed and then truly
> swapped out to disk.  So unless you first let it run to steady-state
> before applying pressure, it is hard to interpret the results.)

Would exmap help you here?
 http://www.berthels.co.uk/exmap/

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support