Re: Caching (was Re: choosing the right platform) - Mailing list pgsql-performance

From Matthew Nuzum
Subject Re: Caching (was Re: choosing the right platform)
Date
Msg-id 000501c2fefe$9f82c150$6900a8c0@mattspc
Whole thread Raw
In response to Caching (was Re: choosing the right platform)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Thanks for all the feedback, this is very informative.

My current issues that I'm still not clear on, are:
* Is the ia32 architecture going to impose uncomfortable limits on my
application?  I'm seeing lots of confirmation that this platform, regardless
of the OS is going to limit me to less the 4GB of memory allocated to a
single application (i.e. http://www.spack.org/index.cgi/LinuxRamLimits).
This may or may not be an issue because: (note that these are questions, not
statements)
** Postgres is multi-process, not multi-threaded (?)
** It's better to not use huge amount of sort-mem but instead let the OS do
the caching (?)
** My needs are really not going to be as big as I think they are if I
manage the application/environment correctly (?)

Here are some of the performance suggestions I've heard, please, if I
mis-understood, could you help me get clarity?
* It's better to run fewer apache children and turn off persistent
connections (I had suggested 200 children per server, someone else suggested
40)
* FreeBSD is going to provide a better file system than Linux (because Linux
only supports large files on journaling filesystems which impose extra over
head) (this gleaned from this conversation and previous threads in archives)
* Running Linux or *BSD on a 64 bit platform can alleviate some potential
RAM limitations (if there are truly going to be limitations).  If this is
so, I've heard suggestions for Itanium, Sparc and RS/6000.  Maybe someone
can give some more info on these, here are my immediate thoughts: I've heard
that the industry as a whole has not yet warmed up to Itanium.  I can't
afford the newest Sparc Servers, so I'd need to settle with a previous
generation if I went that route, any problems with that?  I know nothing
about the RS/6000 servers (I did see one once though :-), does linux|*BSD
run well on them and any suggestions for series/models I should look at?

Finally, some specific questions,
What's the max number of connections someone has seen on a database server?
What type of hardware was it?  How much RAM did postgres use?

Thanks again,

--
Matthew Nuzum
www.bearfruit.org
cobalt@bearfruit.org

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-
> owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Wednesday, April 09, 2003 8:21 PM
> To: jim@nasby.net
> Cc: scott.marlowe; Matthew Nuzum; 'Josh Berkus'; 'Pgsql-Performance'
> Subject: Caching (was Re: [PERFORM] choosing the right platform)
>
> "Jim C. Nasby" <jim@nasby.net> writes:
> > That seems odd... shouldn't pgsql be able to cache information better
> > since it would be cached in whatever format is best for it, rather than
> > the raw page format (or maybe that is the best format). There's also the
> > issue of having to go through more layers of software if you're relying
> > on the OS caching. All the tuning info I've seen for every other
> > database I've worked with specifically recommends giving the database as
> > much memory as you possibly can, the theory being that it will do a much
> > better job of caching than the OS will.
>
> There are a number of reasons why that's a dubious policy for PG (I
> won't take a position on whether these apply to other databases...)
>
> One is that because we sit on top of the OS' filesystem, we can't
> (portably) prevent the OS from caching blocks.  So it's quite easy to
> get into a situation where the same data is cached twice, once in PG
> buffers and once in kernel disk cache.  That's clearly a waste of RAM
> however you slice it, and it's worst when you set the PG shared buffer
> size to be about half of available RAM.  You can minimize the
> duplication by skewing the allocation one way or the other: either set
> PG's allocation relatively small, relying heavily on the OS to do the
> caching; or make PG's allocation most of RAM and hope to squeeze out
> the OS' cache.  There are partisans for both approaches on this list.
> I lean towards the first policy because I think that starving the kernel
> for RAM is a bad idea.  (Especially if you run on Linux, where this
> policy tempts the kernel to start kill -9'ing random processes ...)
>
> Another reason is that PG uses a simplistic fixed-number-of-buffers
> internal cache, and therefore it can't adapt on-the-fly to varying
> memory pressure, whereas the kernel can and will give up disk cache
> space to make room when it's needed for processes.  Since PG isn't
> even aware of the total memory pressure on the system as a whole,
> it couldn't do as good a job of trading off cache vs process workspace
> as the kernel can do, even if we had a variable-size cache scheme.
>
> A third reason is that on many (most?) Unixen, SysV shared memory is
> subject to swapping, and the bigger you make the shared_buffer arena,
> the more likely it gets that some of the arena will be touched seldom
> enough to make it a candidate for swapping.  A disk buffer that gets
> swapped to disk is worse than useless (if it's dirty, the swapping
> is downright counterproductive, since an extra read and write cycle
> will be needed before the data can make it to its rightful place).
>
> PG is *not* any smarter about the usage patterns of its disk buffers
> than the kernel is; it uses a simple LRU algorithm that is surely no
> brighter than what the kernel uses.  (We have looked at smarter buffer
> recycling rules, but failed to see any performance improvement.)  So the
> notion that PG can do a better job of cache management than the kernel
> is really illusory.  About the only advantage you gain from having data
> directly in PG buffers rather than kernel buffers is saving the CPU
> effort needed to move data across the userspace boundary --- which is
> not zero, but it's sure a lot less than the time spent for actual I/O.
>
> So my take on it is that you want shared_buffers fairly small, and let
> the kernel do the bulk of the heavy lifting for disk cache.  That's what
> it does for a living, so let it do what it does best.  You only want
> shared_buffers big enough so you don't spend too many CPU cycles shoving
> data back and forth between PG buffers and kernel disk cache.  The
> default shared_buffers setting of 64 is surely too small :-(, but my
> feeling is that values in the low thousands are enough to get past the
> knee of that curve in most cases.
>
>             regards, tom lane
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Help analyzing 7.2.4 EXPLAIN
Next
From: "Matthew Nuzum"
Date:
Subject: Caching (was Re: choosing the right platform)