Caching (was Re: choosing the right platform) - Mailing list pgsql-performance
From | Matthew Nuzum |
---|---|
Subject | Caching (was Re: choosing the right platform) |
Date | |
Msg-id | 000601c2fefe$d4b07e80$6900a8c0@mattspc Whole thread Raw |
Responses |
Re: Caching (was Re: choosing the right platform)
Re: Caching (was Re: choosing the right platform) Re: Caching (was Re: choosing the right platform) |
List | pgsql-performance |
Thanks for all the feedback, this is very informative. My current issues that I'm still not clear on, are: * Is the ia32 architecture going to impose uncomfortable limits on my application? I'm seeing lots of confirmation that this platform, regardless of the OS is going to limit me to less the 4GB of memory allocated to a single application (i.e. http://www.spack.org/index.cgi/LinuxRamLimits). This may or may not be an issue because: (note that these are questions, not statements) ** Postgres is multi-process, not multi-threaded (?) ** It's better to not use huge amount of sort-mem but instead let the OS do the caching (?) ** My needs are really not going to be as big as I think they are if I manage the application/environment correctly (?) Here are some of the performance suggestions I've heard, please, if I mis-understood, could you help me get clarity? * It's better to run fewer apache children and turn off persistent connections (I had suggested 200 children per server, someone else suggested 40) * FreeBSD is going to provide a better file system than Linux (because Linux only supports large files on journaling filesystems which impose extra over head) (this gleaned from this conversation and previous threads in archives) * Running Linux or *BSD on a 64 bit platform can alleviate some potential RAM limitations (if there are truly going to be limitations). If this is so, I've heard suggestions for Itanium, Sparc and RS/6000. Maybe someone can give some more info on these, here are my immediate thoughts: I've heard that the industry as a whole has not yet warmed up to Itanium. I can't afford the newest Sparc Servers, so I'd need to settle with a previous generation if I went that route, any problems with that? I know nothing about the RS/6000 servers (I did see one once though :-), does linux|*BSD run well on them and any suggestions for series/models I should look at? Finally, some specific questions, What's the max number of connections someone has seen on a database server? What type of hardware was it? How much RAM did postgres use? Thanks again, -- Matthew Nuzum www.bearfruit.org cobalt@bearfruit.org > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Tom Lane > Sent: Wednesday, April 09, 2003 8:21 PM > To: jim@nasby.net > Cc: scott.marlowe; Matthew Nuzum; 'Josh Berkus'; 'Pgsql-Performance' > Subject: Caching (was Re: [PERFORM] choosing the right platform) > > "Jim C. Nasby" <jim@nasby.net> writes: > > That seems odd... shouldn't pgsql be able to cache information better > > since it would be cached in whatever format is best for it, rather than > > the raw page format (or maybe that is the best format). There's also the > > issue of having to go through more layers of software if you're relying > > on the OS caching. All the tuning info I've seen for every other > > database I've worked with specifically recommends giving the database as > > much memory as you possibly can, the theory being that it will do a much > > better job of caching than the OS will. > > There are a number of reasons why that's a dubious policy for PG (I > won't take a position on whether these apply to other databases...) > > One is that because we sit on top of the OS' filesystem, we can't > (portably) prevent the OS from caching blocks. So it's quite easy to > get into a situation where the same data is cached twice, once in PG > buffers and once in kernel disk cache. That's clearly a waste of RAM > however you slice it, and it's worst when you set the PG shared buffer > size to be about half of available RAM. You can minimize the > duplication by skewing the allocation one way or the other: either set > PG's allocation relatively small, relying heavily on the OS to do the > caching; or make PG's allocation most of RAM and hope to squeeze out > the OS' cache. There are partisans for both approaches on this list. > I lean towards the first policy because I think that starving the kernel > for RAM is a bad idea. (Especially if you run on Linux, where this > policy tempts the kernel to start kill -9'ing random processes ...) > > Another reason is that PG uses a simplistic fixed-number-of-buffers > internal cache, and therefore it can't adapt on-the-fly to varying > memory pressure, whereas the kernel can and will give up disk cache > space to make room when it's needed for processes. Since PG isn't > even aware of the total memory pressure on the system as a whole, > it couldn't do as good a job of trading off cache vs process workspace > as the kernel can do, even if we had a variable-size cache scheme. > > A third reason is that on many (most?) Unixen, SysV shared memory is > subject to swapping, and the bigger you make the shared_buffer arena, > the more likely it gets that some of the arena will be touched seldom > enough to make it a candidate for swapping. A disk buffer that gets > swapped to disk is worse than useless (if it's dirty, the swapping > is downright counterproductive, since an extra read and write cycle > will be needed before the data can make it to its rightful place). > > PG is *not* any smarter about the usage patterns of its disk buffers > than the kernel is; it uses a simple LRU algorithm that is surely no > brighter than what the kernel uses. (We have looked at smarter buffer > recycling rules, but failed to see any performance improvement.) So the > notion that PG can do a better job of cache management than the kernel > is really illusory. About the only advantage you gain from having data > directly in PG buffers rather than kernel buffers is saving the CPU > effort needed to move data across the userspace boundary --- which is > not zero, but it's sure a lot less than the time spent for actual I/O. > > So my take on it is that you want shared_buffers fairly small, and let > the kernel do the bulk of the heavy lifting for disk cache. That's what > it does for a living, so let it do what it does best. You only want > shared_buffers big enough so you don't spend too many CPU cycles shoving > data back and forth between PG buffers and kernel disk cache. The > default shared_buffers setting of 64 is surely too small :-(, but my > feeling is that values in the low thousands are enough to get past the > knee of that curve in most cases. > > regards, tom lane > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
pgsql-performance by date: