Thread: high shared buffer and swap

From:
Laurent Laborde
Date:

Friendly greetings!
I found something "odd" (something that I can't explain) this weekend.

The machine is an eight-core server with 32GB of RAM, running
PostgreSQL 8.3.6. It runs only PostgreSQL, Slony-I and pgbouncer.

Just for testing purposes, I tried setting shared_buffers to 26GB.

I quickly noticed that performance wasn't very good and that the
server started to swap, slowly but surely
(while still handling up to 2000 queries/second, as reported by pgfouine).

It used all 2GB of swap.
I removed the server from production, added 10GB of swap and left it
for the weekend with only Slony and PostgreSQL running, to keep it in
sync with the master database.

This morning I found that the whole 12GB of swap was used:
Mem:  32892008k total, 32714728k used,   177280k free,    70872k buffers
Swap: 12582896k total, 12531812k used,    51084k free, 27047696k cached

# cat /proc/meminfo
MemTotal:     32892008 kB
MemFree:        171140 kB
Buffers:         70852 kB
Cached:       27065208 kB
SwapCached:    4752492 kB
Active:       24362168 kB
Inactive:      7806884 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     32892008 kB
LowFree:        171140 kB
SwapTotal:    12582896 kB
SwapFree:        53064 kB
Dirty:          122636 kB
Writeback:           0 kB
AnonPages:      280336 kB
Mapped:       14118588 kB
Slab:           224632 kB
PageTables:     235120 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  29028900 kB
Committed_AS: 28730620 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     12916 kB
VmallocChunk: 34359725307 kB

While I understand that a very high shared_buffers setting wasn't a
good idea, I don't understand this behaviour.
Any thoughts?

I tried this setup because having two levels of data caching doesn't
make sense to me (one in the OS filesystem cache and one in shared
memory, i.e. shared_buffers).

I'd love to understand what's happening here! Thank you :)
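
For anyone who wants to redo the arithmetic on the figures above, here
is a minimal sketch in Python. The values are copied from the
/proc/meminfo output in this message; the helper names (parse_meminfo,
summarize) are only illustrative, and the 26GB figure is assumed to
mean 26 * 1024 * 1024 kB.

```python
# Figures copied from the /proc/meminfo output quoted in the message (kB).
MEMINFO = """\
MemTotal:     32892008 kB
MemFree:        171140 kB
Cached:       27065208 kB
SwapCached:    4752492 kB
SwapTotal:    12582896 kB
SwapFree:        53064 kB
AnonPages:      280336 kB
Mapped:       14118588 kB
"""

KB_PER_GIB = 1024 * 1024

# Assumption: "26GB of shared_buffers" means 26 GiB.
SHARED_BUFFERS_KB = 26 * KB_PER_GIB


def parse_meminfo(text):
    """Return a dict mapping each meminfo field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def summarize(fields, shared_buffers_kb=SHARED_BUFFERS_KB):
    """Derive a few headline numbers (all in kB) from the parsed fields."""
    return {
        "swap_used_kb": fields["SwapTotal"] - fields["SwapFree"],
        "ram_used_kb": fields["MemTotal"] - fields["MemFree"],
        "ram_left_after_shared_buffers_kb":
            fields["MemTotal"] - shared_buffers_kb,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_meminfo(MEMINFO)).items():
        print(f"{key}: {value} kB ({value / KB_PER_GIB:.1f} GiB)")
```

With these inputs, swap in use comes to 12529832 kB, and a 26GB
shared_buffers leaves about 5.4 GiB of RAM for the kernel, the
filesystem cache and every other process.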

--
F4FQM
Kerunix Flan
Laurent Laborde

From:
Greg Stark
Date:

Sorry for top-posting - the iPhone mail client sucks.

I think what's happening is that the system sees that some pages of
shared memory haven't been used recently. Because there's more shared
memory than filesystem cache, those pages have been used less recently
than the filesystem cache pages, so the kernel pages out the shared
memory. This is really awful because we use a kind of LRU algorithm
for shared memory, so the pages that it's paging out are precisely the
pages likely to be used soon.
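
One way to read this argument, as a toy model in Python: if the kernel
swaps out the least recently touched buffers, and the buffer manager
picks its next replacement victims by the same "least recently used"
ordering, then the buffers about to be recycled are the ones that must
first be read back from swap. This is a deliberately simplified sketch.
The function name and the sample data are made up, and the real
PostgreSQL replacement policy only approximates LRU.

```python
def count_swap_ins(last_use, n_swapped, n_evictions):
    """Count how many upcoming buffer replacements hit a swapped-out page.

    last_use maps a buffer id to the logical time of its last access.
    Both the kernel and the buffer manager are modelled as strict LRU,
    which is an assumption made to keep the illustration small.
    """
    by_age = sorted(last_use, key=last_use.get)  # oldest access first
    swapped_out = set(by_age[:n_swapped])        # kernel pages these out
    victims = by_age[:n_evictions]               # buffers recycled next
    return sum(1 for buf in victims if buf in swapped_out)


if __name__ == "__main__":
    # Hypothetical access times for six buffers.
    last_use = {0: 5, 1: 1, 2: 9, 3: 2, 4: 7, 5: 3}
    hits = count_swap_ins(last_use, n_swapped=3, n_evictions=3)
    print(f"{hits} of the next 3 replacements need a swap-in")
```

In this model every one of the next replacements touches a page the
kernel has just swapped out, which is the worst case described above.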

I wonder if we should try to mlock shared buffers.

--
Greg


On 4 May 2009, at 10:10, Laurent Laborde wrote:

> [...]

From:
Martijn van Oosterhout
Date:

On Mon, May 04, 2009 at 10:57:47AM +0200, Greg Stark wrote:
> I think what's happening is that the system sees that some pages of
> shared memory haven't been used recently. Because there's more shared
> memory than filesystem cache, those pages have been used less recently
> than the filesystem cache pages, so the kernel pages out the shared
> memory. This is really awful because we use a kind of LRU algorithm
> for shared memory, so the pages that it's paging out are precisely the
> pages likely to be used soon.
>
> I wonder if we should try to mlock shared buffers.

You can try, but it probably won't work. You often need to be root to
lock pages, and even on Linux 2.6.9+, where you don't need to be root,
there's a limit of 32KB (at least on my machine). Sure, that can be
changed, if you're root.
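
To see what the limit is on a given machine, a small Python sketch
using the standard resource module can help. The helper names are
illustrative, and the 32KB and 26GB figures are simply the ones
mentioned in this thread.

```python
import resource


def memlock_limit():
    """Return (soft, hard) RLIMIT_MEMLOCK in bytes; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def normalize(value):
        return None if value == resource.RLIM_INFINITY else value

    return normalize(soft), normalize(hard)


def can_lock(n_bytes, soft_limit):
    """Check whether n_bytes fits under the soft limit.

    This only compares against the rlimit. A sufficiently privileged
    process may not be bound by the limit at all.
    """
    return soft_limit is None or n_bytes <= soft_limit


if __name__ == "__main__":
    soft, hard = memlock_limit()
    print("RLIMIT_MEMLOCK soft:", soft, "hard:", hard)
    wanted = 26 * 1024 ** 3  # the 26GB shared_buffers from this thread
    print("26GB lockable under the soft limit:", can_lock(wanted, soft))
```

With a 32KB soft limit, can_lock() reports that a 26GB segment does
not fit, which matches the point above: an unprivileged postmaster
could not lock its shared buffers without the limit being raised.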

Actually, locking the shared buffers seems like a footgun to me. People
occasionally give PostgreSQL masses of memory, leaving not enough to
run the rest of the system. Locking the memory would make that
situation worse.

Personally, I've never seen a benefit from setting shared_buffers above
the expected working set size. I generally let the kernel share the
remaining memory between the disk cache for PostgreSQL and any other
processes I might be running. On a NUMA machine you want to keep your
memory on the local node and let the kernel copy data from elsewhere
into your local memory when you need it.

Have a nice day,
--
Martijn van Oosterhout   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

From:
Scott Marlowe
Date:

On Mon, May 4, 2009 at 2:10 AM, Laurent Laborde wrote:
> Friendly greetings!
> I found something "odd" (something that I can't explain) this weekend.
>
> The machine is an eight-core server with 32GB of RAM, running
> PostgreSQL 8.3.6. It runs only PostgreSQL, Slony-I and pgbouncer.
>
> Just for testing purposes, I tried setting shared_buffers to 26GB.
>
> I quickly noticed that performance wasn't very good and that the
> server started to swap, slowly but surely
> (while still handling up to 2000 queries/second, as reported by pgfouine).
>
> It used all 2GB of swap.
> I removed the server from production, added 10GB of swap and left it
> for the weekend with only Slony and PostgreSQL running, to keep it in
> sync with the master database.
>
> This morning I found that the whole 12GB of swap was used:
> Mem:  32892008k total, 32714728k used,   177280k free,    70872k buffers
> Swap: 12582896k total, 12531812k used,    51084k free, 27047696k cached

Try setting vm.swappiness = 0.
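
To check the current value first, here is a minimal Python sketch. It
assumes a Linux system exposing the setting at /proc/sys/vm/swappiness;
the function name is illustrative.

```python
from pathlib import Path

# Linux-specific location of the vm.swappiness setting.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    print("vm.swappiness =", read_swappiness())
```

Changing the value needs root, for example with
"sysctl -w vm.swappiness=0", plus an entry in /etc/sysctl.conf to make
it persist across reboots. Note that a value of 0 makes the kernel
much more reluctant to swap but is not a guarantee that it never will.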

But as someone else mentioned, I've always had better luck letting the
OS do most of the caching anyway.