Thread: Linux vs. Mac OS X Performance
Our developers run on MacBook Pros w/ 2G memory and our production hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5. The Macs perform common and complex Postgres operations in about half the time of our unloaded production hardware. We've compared configurations and the production hardware is running a much bigger configuration and faster disk. What are we missing? Is there a trick to making AMDs perform? Does Linux suck compared to BSD? Thanks.
On Nov 9, 2007 10:55 PM, Mark Niedzielski <min@epictechnologies.com> wrote: > > Our developers run on MacBook Pros w/ 2G memory and our production > hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5. The > Macs perform common and complex Postgres operations in about half the > time of our unloaded production hardware. We've compared configurations > and the production hardware is running a much bigger configuration and > faster disk. > > What are we missing? Is there a trick to making AMDs perform? Does > Linux suck compared to BSD? It's quite possible that either you've got some issue with poor hardware / OS integration (think RAID controllers that have bad drivers, etc) or that you've de-tuned postgresql on your CentOS machines when you thought you were tuning it. A common mistake is to set work_mem or shared_buffers so high that they are slower than they would be if they were smaller. Also, if your data sets in production are hundreds of millions of rows, and the test set on your laptop is 100,000 rows, then of course the laptop is going to be faster; it has less data to wade through. So, the key question is what, exactly, is different between your dev laptops and your production machines.
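[Editor's note: the over-tuning mistake Scott describes can be sanity-checked mechanically. The sketch below is illustrative only — the 25% shared_buffers rule of thumb and the work_mem * max_connections bound are common community heuristics, not official PostgreSQL guidance, and the helper itself is invented for this example.]

```python
# Rough sanity check for PostgreSQL memory settings vs. physical RAM.
# Thresholds are heuristics, not hard limits -- adjust to taste.

def check_memory_settings(ram_mb, shared_buffers_mb, work_mem_mb, max_connections):
    """Return a list of warnings about likely de-tuned settings."""
    warnings = []
    if shared_buffers_mb > ram_mb * 0.25:
        warnings.append("shared_buffers exceeds ~25% of RAM; very large "
                        "values can perform worse than moderate ones")
    # Every sort/hash step in every backend may use up to work_mem,
    # so a pessimistic upper bound is work_mem * max_connections.
    if work_mem_mb * max_connections > ram_mb:
        warnings.append("work_mem * max_connections exceeds RAM; risk of swapping")
    return warnings

# Example: an 8 GB server tuned far too aggressively.
print(check_memory_settings(ram_mb=8192, shared_buffers_mb=6144,
                            work_mem_mb=256, max_connections=100))
# -> two warnings: both settings look de-tuned
```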
On Fri, 9 Nov 2007, Mark Niedzielski wrote: > The Macs perform common and complex Postgres operations in about half > the time of our unloaded production hardware. Are they write intensive? If so, it may be possible that the Macs are buffering disk writes while the production server isn't. It's often the case that desktop systems will cheat at writes while servers don't. > Is there a trick to making AMDs perform? One problem you can run into is that the default configuration on some Linux+AMD systems will include aggressive power management that throttles the CPU clock down. Take a look at /proc/cpuinfo on your server and see what the "cpu MHz" reads; if it's 1000.00 or otherwise doesn't match what you expect, you may need to turn off or otherwise tune power management to keep the system running at full speed. My home AMD dual-core system was positively sluggish until I fixed that. > Does Linux suck compared to BSD? Not the Mac OS BSD. Last time I looked into this OS X was still dramatically slower than Linux on things like process creation. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
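[Editor's note: Greg's /proc/cpuinfo check is easy to script. This sketch parses cpuinfo-style text and flags cores reporting well below an expected clock; the function and the 10% tolerance are assumptions for illustration, not a standard tool.]

```python
# Flag CPU cores whose reported clock is far below the expected speed,
# e.g. because a powersave/ondemand governor has throttled them down.

def throttled_cores(cpuinfo_text, expected_mhz, tolerance=0.10):
    """Return (core_index, mhz) pairs running below expected_mhz * (1 - tolerance)."""
    slow = []
    core = -1
    for line in cpuinfo_text.splitlines():
        if line.startswith("processor"):
            core = int(line.split(":")[1])
        elif line.startswith("cpu MHz"):
            mhz = float(line.split(":")[1])
            if mhz < expected_mhz * (1 - tolerance):
                slow.append((core, mhz))
    return slow

# On a real Linux box: throttled_cores(open("/proc/cpuinfo").read(), 2400)
sample = "processor : 0\ncpu MHz : 1000.000\nprocessor : 1\ncpu MHz : 2394.012\n"
print(throttled_cores(sample, 2400))   # [(0, 1000.0)] -> core 0 is throttled
```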
On Fri, 9 Nov 2007, Mark Niedzielski wrote: > The Macs perform common and complex Postgres operations in about half > the time of our unloaded production hardware. Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem to support the 8GB) or a 64-bit? And which was PostgreSQL compiled for? -- Steve Wampler -- swampler@noao.edu The gods that smiled on your birth are now laughing out loud.
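[Editor's note: Steve's 32-bit-vs-64-bit question can be answered from a shell with `uname -m` for the kernel and `file $(which postgres)` for the binary. A minimal sketch that interprets the `uname -m` output — the machine-name table is an assumption covering only the common x86 cases:]

```python
# Classify `uname -m` output as 32- or 64-bit (common x86 names only;
# other architectures would need more table entries).

def kernel_bits(uname_m):
    if uname_m in ("x86_64", "amd64"):
        return 64
    if uname_m in ("i386", "i486", "i586", "i686"):
        return 32
    raise ValueError("unrecognized machine type: " + uname_m)

# On the server itself:
#   import platform; kernel_bits(platform.machine())
# and check what PostgreSQL was compiled for with: file $(which postgres)
print(kernel_bits("x86_64"))   # 64
print(kernel_bits("i686"))     # 32
```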
On Fri, 09 Nov 2007 23:55:59 -0500 Mark Niedzielski <min@epictechnologies.com> wrote: > > Our developers run on MacBook Pros w/ 2G memory and our production > hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5. > The Macs perform common and complex Postgres operations in about half > the time of our unloaded production hardware. We've compared > configurations and the production hardware is running a much bigger > configuration and faster disk. > > What are we missing? Likely a lot. Are you performing any maintenance? What are your postgresql.conf settings? Are you running 64bit on the Linux machine? > Is there a trick to making AMDs perform? Does > Linux suck compared to BSD? No. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
On Mon, Nov 12, 2007 at 10:14:46AM -0700, Steve Wampler wrote: > Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem > to support the 8GB) or a 64-bit? And which was PostgreSQL compiled for? You don't need a 64-bit kernel to support 8GB of memory, do you? As long as the kernel supports PAE that should be enough to make use of it. You only need a 64-bit address space when each process wants to see more than ~3GB of RAM. Sam
On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote: > On Mon, Nov 12, 2007 at 10:14:46AM -0700, Steve Wampler wrote: > > Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem > > to support the 8GB) or a 64-bit? And which was PostgreSQL compiled for? > > You don't need a 64-bit kernel to support 8GB of memory, do you? As > long as the kernel supports PAE that should be enough to make use of it. > You only need a 64-bit address space when each process wants to see more > than ~3GB of RAM. There's a performance hit for using PAE. Not sure what it is, but I recall it being in the 5 to 10% range.
On Mon, Nov 12, 2007 at 11:31:59AM -0600, Scott Marlowe wrote: > On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote: > > You don't need a 64-bit kernel to support 8GB of memory, do you? As > > long as the kernel supports PAE that should be enough to make use of it. > > You only need a 64-bit address space when each process wants to see more > > than ~3GB of RAM. > > There's a performance hit for using PAE. Not sure what it is, but I > recall it being in the 5 to 10% range. And what's the performance hit of using native 64bit code? I'd guess similar, moving twice as much data around with each pointer has got to affect things. Sam
Scott Marlowe wrote: > On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote: >> You don't need a 64-bit kernel to support 8GB of memory, do you? As >> long as the kernel supports PAE that should be enough to make use of it. >> You only need a 64-bit address space when each process wants to see more >> than ~3GB of RAM. > > There's a performance hit for using PAE. Not sure what it is, but I > recall it being in the 5 to 10% range. Also, using PAE *used* to require the (OS-internal) use of 'bounce-buffers' to copy data from processes high-up in memory down to i/o devices low-down in memory. I don't know if that's still an issue or not with 2.6 kernels, but I could see it still being the case and, if so, it seems like it would have a significant impact on I/O bound tasks (like most DB processing...) -- Steve Wampler -- swampler@noao.edu The gods that smiled on your birth are now laughing out loud.
On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote: > On Mon, Nov 12, 2007 at 11:31:59AM -0600, Scott Marlowe wrote: > > On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote: > > > You don't need a 64-bit kernel to support 8GB of memory, do you? As > > > long as the kernel supports PAE that should be enough to make use of it. > > > You only need a 64-bit address space when each process wants to see more > > > than ~3GB of RAM. > > > > There's a performance hit for using PAE. Not sure what it is, but I > > recall it being in the 5 to 10% range. > > And what's the performance hit of using native 64bit code? I'd guess > similar, moving twice as much data around with each pointer has got to > affect things. That's not been my experience. It's not as though everything you do requires moving 64 bits where 32 bit code moved only 32. The performance gain of the 64 bit machine doing 64 bit operations over the 32 bit machine doing them (i.e. floating point etc...) is so much more that it more than makes up for the overhead of running in 64 bit mode.
Sam Mason wrote: > And what's the performance hit of using native 64bit code? I'd guess > similar, moving twice as much data around with each pointer has got to > affect things. That's probably difficult to predict. Since the architecture is 64-bits, it shouldn't cost any more to move a 64-bit pointer around than a 32-bit one. (Plus, I *think* you get more registers in 64-bit mode.) However, a good optimizer might figure out it can move two 32-bit pointers with one 64-bit transfer. -- Steve Wampler -- swampler@noao.edu The gods that smiled on your birth are now laughing out loud.
"Scott Marlowe" <scott.marlowe@gmail.com> writes: > On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote: >> And what's the performance hit of using native 64bit code? I'd guess >> similar, moving twice as much data around with each pointer has got to >> affect things. > > That's not been my experience. It's not like everything you do > requires 64 bits to be moved where in 32 bit code only 32 were moved. > The performance gain of the 64 bit machine doing 64 bit operations > over the 32 bit machine doing them (i.e. floating point etc...) is so > much more that it more than makes up for the overhead of running in 64 > bit mode. Plus, 64-bit mode gives you twice as many CPU registers, which is a huge win for some algorithms, though in many cases it doesn't make much of a difference. -Doug
On Mon, Nov 12, 2007 at 11:46:12AM -0600, Scott Marlowe wrote: > On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote: > > And what's the performance hit of using native 64bit code? I'd guess > > similar, moving twice as much data around with each pointer has got to > > affect things. > > That's not been my experience. It's not like everything you do > requires 64 bits to be moved where in 32 bit code only 32 were moved. > The performance gain of the 64 bit machine doing 64 bit operations > over the 32 bit machine doing them (i.e. floating point etc...) is so > much more that it more than makes up for the overhead of running in 64 > bit mode. OK, I'm willing to believe you. It used to be a big misunderstanding that moving to 64bits automatically sped things up; things like this change, though. Sam
On Mon, 12 Nov 2007 10:47:29 -0700 Steve Wampler <swampler@noao.edu> wrote: > Sam Mason wrote: > > And what's the performance hit of using native 64bit code? I'd > > guess similar, moving twice as much data around with each pointer > > has got to affect things. > > That's probably difficult to predict. Since the architecture is > 64-bits, it shouldn't cost any more to move a 64-bit pointer around > than a 32-bit one. (Plus, I *think* you get more registers in 64-bit > mode.) It's all about the registers man... all extra 8 of them. Unless of course you are running with >8GB of ram, then it is all about the ability to use more than 2GB of shared memory. Joshua D. Drake
On Nov 12, 2007, at 12:29 PM, Sam Mason wrote: > You only need a 64bit address space when each process wants to see > more > than ~3GB of RAM. And how exactly do you get that on a 32-bit CPU? Even with PAE (shudders from memories of expanded/extended RAM in the DOS days), you still have a 32-bit address space per-process.
On Nov 12, 2007, at 12:01 PM, Greg Smith wrote: > Not the Mac OS BSD. Last time I looked into this OS X was still > dramatically slower than Linux on things like process creation. On MacOS X, that's the Mach kernel doing process creation, not anything BSD-ish at all. The BSD flavor of MacOS X is mostly just the userland experience.
On Mon, Nov 12, 2007 at 05:02:52PM -0500, Vivek Khera wrote: > On Nov 12, 2007, at 12:29 PM, Sam Mason wrote: > >You only need a 64bit address space when each process wants to see > >more than ~3GB of RAM. > > And how exactly do you get that on a 32-bit CPU? I didn't mean to suggest you could. You can actually hack around it by performing various kernel specific tricks (mmap()ing different parts of a large file works under some Unixes) but it's a lot of work and tends to be difficult and brittle. > Even with PAE > (shudders from memories of expanded/extended RAM in the DOS days), you > still have a 32-bit address space per-process. Yes, if you've got several clients connected they can each have their 3GB address space in RAM and not swapped out, or you can have lots of disk cache. Other people can probably comment on what life is actually like on a box like this; I've not had much experience. Sam
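[Editor's note: the numbers behind this PAE exchange are easy to check. A sketch of the arithmetic — the 3 GiB user / 1 GiB kernel split is the common 32-bit Linux default (it is configurable), and PAE's 36-bit physical addressing is the classic x86 figure:]

```python
# Back-of-envelope address-space arithmetic behind the PAE discussion.
# A 32-bit virtual address covers 4 GiB per process; Linux typically
# splits that ~3 GiB user / 1 GiB kernel. PAE widens only the *physical*
# address bus to 36 bits (64 GiB); per-process limits don't change.

GiB = 2 ** 30

virtual_32   = 2 ** 32          # per-process virtual address space
user_split   = 3 * GiB          # typical usable user portion on Linux
physical_pae = 2 ** 36          # PAE physical addressing limit

print(virtual_32 // GiB)        # 4  -> GiB visible to one 32-bit process
print(physical_pae // GiB)      # 64 -> GiB of RAM a PAE kernel can manage
```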
On Fri, 2007-11-09 at 23:55 -0500, Mark Niedzielski wrote: > Our developers run on MacBook Pros w/ 2G memory and our production > hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5. The > Macs perform common and complex Postgres operations in about half the > time of our unloaded production hardware. We've compared configurations > and the production hardware is running a much bigger configuration and > faster disk. > > What are we missing? Is there a trick to making AMDs perform? Does > Linux suck compared to BSD? ---- that was an awful lot of discussion without any empirical evidence to support the original claim. my understanding was that the lack of threading on OSX made it especially poor for a DB server (but if I recall correctly, that information was about MySQL). Do I smell a plant? Craig
> my understanding was that the lack of threading on OSX made it > especially poor for a DB server What you're referring to must be that the kernel was essentially single-threaded, with a single "kernel-funnel" lock. (Because the OS certainly supported threads, and it was certainly possible to write highly-threaded applications, and I don't know of any performance problems with threaded applications.) This has been getting progressively better, with each release adding more in-kernel concurrency. Which means that 10.5 probably obsoletes all prior postgres benchmarks on OS X. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Thanks to all for the help - and the sanity check. The problem was in the test and not in the configuration. We were using a particularly difficult query as a reference (and fully understanding that it is a two-dimensional alternative to a proper benchmark). On our test system each run was with empty caches. The test on the Mac was with caches loaded. Once we started running the tests with loaded caches, the tuning parameters started behaving as expected. In the end we took an 880-second query down to 3.4 seconds (compared to 95 seconds on the Mac). The key was the fact that large configuration changes drew no measurable change in performance. And that is when you know you are turning the wrong knobs!
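[Editor's note: Mark's closing observation generalizes into a cheap benchmarking sanity check: time the same query under several configurations and see whether the knobs move the needle at all. The helper and its 10% threshold are illustrative assumptions, and the sample timings echo the figures from the message above:]

```python
# The "wrong knobs" heuristic: if big configuration changes barely move
# the runtime, the bottleneck is elsewhere (e.g. cold caches, not tuning).
# The 10% significance threshold is an arbitrary assumption.

def knobs_matter(runtimes_sec, threshold=0.10):
    """Given runtimes for the same query under different configs,
    return True if tuning produced a measurable spread."""
    lo, hi = min(runtimes_sec), max(runtimes_sec)
    return (hi - lo) / hi > threshold

# Cold-cache runs: every configuration looks equally terrible.
print(knobs_matter([880.0, 875.0, 882.0]))   # False -> wrong knobs
# Warm-cache runs: the tuning finally shows up.
print(knobs_matter([95.0, 12.0, 3.4]))       # True
```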
On 11/13/07 10:02 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote: > What you're referring to must be that the kernel was essentially > single-threaded, with a single "kernel-funnel" lock. (Because the OS > certainly supported threads, and it was certainly possible to write > highly-threaded applications, and I don't know of any performance problems > with threaded applications.) > > This has been getting progressively better, with each release adding more > in-kernel concurrency. Which means that 10.5 probably obsoletes all prior > postgres benchmarks on OS X. While I've never seen this documented anywhere, it empirically looks like 10.5 also (finally) adds CPU affinity to better utilize instruction caching. On a dual CPU system under 10.4, one CPU bound process would use two CPU's at 50%. Under 10.5 it uses one CPU at 100%. I never saw any resolution to this thread - were the original tests on the Opteron and OS X identical, or were they two different workloads? Wes
On Mon, 2007-11-26 at 17:37 -0600, Wes wrote: > On 11/13/07 10:02 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote: > > > What you're referring to must be that the kernel was essentially > > single-threaded, with a single "kernel-funnel" lock. (Because the OS > > certainly supported threads, and it was certainly possible to write > > highly-threaded applications, and I don't know of any performance problems > > with threaded applications.) > > > > This has been getting progressively better, with each release adding more > > in-kernel concurrency. Which means that 10.5 probably obsoletes all prior > > postgres benchmarks on OS X. > > While I've never seen this documented anywhere, it empirically looks like > 10.5 also (finally) adds CPU affinity to better utilize instruction caching. > On a dual CPU system under 10.4, one CPU bound process would use two CPU's > at 50%. Under 10.5 it uses one CPU at 100%. > > I never saw any resolution to this thread - were the original tests on the > Opteron and OS X identical, or were they two different workloads? ---- resolution? http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php conclusion? Mac was still pretty slow in comparison Craig
Hello, sorry for "butting in", but I'm just curious... > resolution? > > http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php > > conclusion? > > Mac was still pretty slow in comparison Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000, XP, Vista etc.) on the same hardware? And Linux to (Free-/Net-/whatever) BSD? No flamebait, I'm just wondering whether the performance gain is worth the learning effort required for Linux or BSD compared to the Mac. Sincerely, Wolfgang Keller
On Tue, 2007-11-27 at 11:11 +0100, Wolfgang Keller wrote: > Hello, > > sorry for "butting in", but I'm just curious... > > > resolution? > > > > http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php > > > > conclusion? > > > > Mac was still pretty slow in comparison > > Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000, > XP, Vista etc.) on the same hardware? In general, you can expect any Unix based OS, which includes MacOS X, to perform noticeably better than Windows for PostgreSQL. //Magnus
> In general, you can expect any Unix based OS, which includes MacOS X, to > perform noticeably better than Windows for PostgreSQL. Is that really true of BSD UNIXen??? I've certainly heard it's true of Linux. But with BSD you have the "kernel funnel" which can severely limit multitasking, regardless of whether threads or processes were used. Apple has been working toward finer-grained locking precisely because that was a serious bottleneck which limited OS X server performance. Or have I misunderstood and this was only the design of one particular flavor of BSD, not BSDen in general? -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Tue, 27 Nov 2007 17:01:06 -0700 Scott Ribe <scott_ribe@killerbytes.com> wrote: > > In general, you can expect any Unix based OS, which includes MacOS > > X, to perform noticeably better than Windows for PostgreSQL. > > Is that really true of BSD UNIXen??? I've certainly heard it's true of > Linux. But with BSD you have the "kernel funnel" which can severely > limit multitasking, regardless of whether threads or processes were > used. Apple has been working toward finer-grained locking precisely > because that was a serious bottleneck which limited OS X server > performance. > > Or have I misunderstood and this was only the design of one particular > flavor of BSD, not BSDen in general? Not much of a kernel guy here but my understanding is that MacOSX is basically NeXT version 10, which means... Mach... which is entirely different than say FreeBSD at the kernel level. Joshua D. Drake
On 11/27/07 18:01, Scott Ribe wrote: >> In general, you can expect any Unix based OS, which includes MacOS X, to >> perform noticeably better than Windows for PostgreSQL. > > Is that really true of BSD UNIXen??? I've certainly heard it's true of > Linux. But with BSD you have the "kernel funnel" which can severely limit > multitasking, regardless of whether threads or processes were used. Apple > has been working toward finer-grained locking precisely because that was a > serious bottleneck which limited OS X server performance. > > Or have I misunderstood and this was only the design of one particular > flavor of BSD, not BSDen in general? IIRC, FreeBSD got rid of the Giant Lock back in v5.x. There was a benchmark in Feb 2007 which demonstrated that FBSD 7.0 scaled *better* than Linux 2.6 after 4 CPUs. http://jeffr-tech.livejournal.com/5705.html Turns out that there was/is a bug in glibc's malloc(). Don't know if it's been fixed yet. -- Ron Johnson, Jr. Jefferson LA USA %SYSTEM-F-FISH, my hovercraft is full of eels
On Tue, 27 Nov 2007, Wolfgang Keller wrote: > Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000, > XP, Vista etc.) on the same hardware? And Linux to (Free-/Net-/whatever) > BSD? Apple hardware gets so expensive for some types of database configurations that such a comparison doesn't even make a lot of sense. For example, if you have an application that needs high database write throughput, to make that work well with PostgreSQL you must have a controller with a battery backed cache. If I have a PC, the entry-level solution in that category can be a random sub-$1000 system that runs Linux plus around $400 for a RAID card with BBC, and you've got multiple vendors to consider there (3Ware, Areca, LSI Logic, etc.) To do something similar with Apple hardware, you can get a Mac Pro and add their RAID card, at $3500 (early reports suggest even that may have serious problems, see http://forums.macrumors.com/showthread.php?t=384459 ). Or you can pick up an XServe RAID, but now you're talking $6350 because the smallest configuration is 1TB. The amount of server you can buy for $3500+ running Linux is going to be much more powerful than its Apple equivalent. Sure, you can run a trivial workload that features minimal writes even on a Mac Mini, but I don't see a lot of value to considering a platform where the jump to the cheapest serious server configuration is so big. Also, in previous generations, the Mach kernel core of Mac OS had some serious performance issues for database use even in read-heavy workloads: http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=5 There are claims this is improved in current systems (Leopard + Intel), but the margin was so big before I would need some pretty hard proof to believe they've even achieved parity with Linux/FreeBSD on the same hardware, and even then the performance/dollar is unlikely to be competitive.
> I'm just wondering whether the performance gain is worth the learning > effort required for Linux or BSD compared to the Mac. On both Windows (where you get limitations like not being able to set a large value for shared_buffers) and Mac OS X, PostgreSQL has enough performance issues that I feel using those platforms can only be justified if platform compatibility is more important than performance to you. The minute performance becomes a serious concern, you'd be much better off with Linux, one of the BSDs that's not hobbled by using the Mach kernel, or one of the more serious UNIXes like Solaris. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Tue, 27 Nov 2007 17:01:06 -0700 > Scott Ribe <scott_ribe@killerbytes.com> wrote: > >> > In general, you can expect any Unix based OS, which includes MacOS >> > X, to perform noticeably better than Windows for PostgreSQL. >> >> Is that really true of BSD UNIXen??? I've certainly heard it's true of >> Linux. But with BSD you have the "kernel funnel" which can severely >> limit multitasking, regardless of whether threads or processes were >> used. Apple has been working toward finer-grained locking precisely >> because that was a serious bottleneck which limited OS X server >> performance. >> >> Or have I misunderstood and this was only the design of one particular >> flavor of BSD, not BSDen in general? That was true of the traditional BSD 4.3 and 4.4 design. However when people refer to "BSD" these days they're referring to one of the major derivatives which have all undergone extensive further development. FreeBSD has crowed a lot about their finer-grained kernel locks too for example. Other variants of BSD tend to focus on other areas (like portability for example) so they may not be as far ahead but they've still undoubtedly made significant progress compared to 1993. > Not much of a kernel guy here but my understanding is that MacOSX is > basically NeXT version 10, which means... Mach... which is entirely > different than say FreeBSD at the kernel level. I think (but I'm not sure) that the kernel in OSX comes from BSD. What they took from NeXT was the GUI design and object oriented application framework stuff. Basically all the stuff that Unix programmers still haven't quite figured out what it's good for. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
On Nov 27, 2007, at 8:36 PM, Gregory Stark wrote: > I think (but I'm not sure) that the kernel in OSX comes from BSD. Kind of. Mach is still running underneath (and a lot of the app APIs use it directly) but there is a BSD 'personality' above it which (AIUI) is big parts of FreeBSD ported to run on Mach. So when you use the Unix APIs you're going through that. -Doug
On 11/27/07 19:36, Gregory Stark wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: [snip] > > That was true of the traditional BSD 4.3 and 4.4 design. However when people > refer to "BSD" these days they're referring to one of the major derivatives > which have all undergone extensive further development. FreeBSD has crowed a > lot about their finer-grained kernel locks too for example. Other variants of > BSD tend to focus on other areas (like portability for example) so they may > not be as far ahead but they've still undoubtedly made significant progress > compared to 1993. NetBSD and OpenBSD are still pretty not-good at scaling up. But they're darned good at running on 68K Macs (NBSD) and semi-embedded stuff like low-end firewalling routers (OBSD). >> Not much of a kernel guy here but my understanding is that MacOSX is >> basically NeXT version 10, which means... Mach... which is entirely >> different than say FreeBSD at the kernel level. > > I think (but I'm not sure) that the kernel in OSX comes from BSD. What they > took from NeXT was the GUI design and object oriented application framework > stuff. Basically all the stuff that Unix programmers still haven't quite > figured out what it's good for. Even AfterStep is written in plain C... -- Ron Johnson, Jr. Jefferson LA USA %SYSTEM-F-FISH, my hovercraft is full of eels
On 11/27/07 19:35, Greg Smith wrote: [snip] > to you. The minute performance becomes a serious concern, you'd be much > better off with Linux, one of the BSDs that's not hobbled by using the > Mach kernel, or one of the more serious UNIXes like Solaris. Wasn't there a time (2 years ago?) when PG ran pretty dog-like on SPARC? -- Ron Johnson, Jr. Jefferson LA USA %SYSTEM-F-FISH, my hovercraft is full of eels
On Nov 27, 2007 8:05 PM, Ron Johnson <ron.l.johnson@cox.net> wrote: > On 11/27/07 19:35, Greg Smith wrote: > [snip] > > to you. The minute performance becomes a serious concern, you'd be much > > better off with Linux, one of the BSDs that's not hobbled by using the > > Mach kernel, or one of the more serious UNIXes like Solaris. > > Wasn't there a time (2 years ago?) when PG ran pretty dog-like on SPARC? Only under Solaris. With Linux or BSD on it it ran pretty well. I had a Sparc 20 running RH 7.2 back in the day (or whatever the last version of RH that ran on sparc was) that spanked an Ultra-2 running slowalrus with twice the memory and hard drives handily. Solaris has gotten much better since then, I'm sure.
> Only under Solaris. With Linux or BSD on it it ran pretty well. I > had a Sparc 20 running RH 7.2 back in the day (or whatever the last > version of RH that ran on sparc was) that spanked an Ultra-2 running > slowalrus with twice the memory and hard drives handily. > > Solaris has gotten much better since then, I'm sure. Ubuntu is supposed to run on a T1000/T2000, and Sun has come out with a magical beast called Solaris 10; in Sun's infinite wisdom they have decided to abandon the /etc/init.d/ and friends way of startup for some complex XML way of doing things. But otherwise it's quite good (ZFS and Cool Thread servers being among the other good things out of Sun's shop). Cheers, Aly. -- Aly Dharshi aly.dharshi@telus.net Got TELUS TV ? 310-MYTV or http://www.telus.com/tv "A good speech is like a good dress that's short enough to be interesting and long enough to cover the subject"
> Kind of. Mach is still running underneath (and a lot of the app APIs
> use it directly) but there is a BSD 'personality' above it which
> (AIUI) is big parts of FreeBSD ported to run on Mach.

Right. Also, to be clear, OS X is not a true microkernel architecture. They
took the "division of responsibilities" from the Mach microkernel design,
but Mach is compiled into the kernel and is not a separate process from the
kernel.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
> There are claims this
> is improved in current systems (Leopard + Intel), but the margin was so
> big before...

IIRC, it was later established that during those tests they had fsync
enabled on OS X and disabled on Linux.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
Doug McNaught <doug@mcnaught.org> writes:
> On Nov 27, 2007, at 8:36 PM, Gregory Stark wrote:
>> I think (but I'm not sure) that the kernel in OSX comes from BSD.

> Kind of. Mach is still running underneath (and a lot of the app APIs
> use it directly) but there is a BSD 'personality' above it which
> (AIUI) is big parts of FreeBSD ported to run on Mach. So when you use
> the Unix APIs you're going through that.

The one bit of the OSX userland code that I've really had my nose rubbed in
is libedit, and they definitely took that from NetBSD, not FreeBSD. You
sure you got your BSDen straight?

Some random poking around at
http://www.opensource.apple.com/darwinsource/10.5/
finds a whole lot of different-looking license headers. But it seems pretty
clear that their userland is BSD-derived, whereas I've always heard that
their kernel is Mach-based. I've not gone looking at the kernel though.

			regards, tom lane
On Tue, Nov 27, 2007 at 05:01:06PM -0700, Scott Ribe wrote:
> > In general, you can expect any Unix based OS, which includes MacOS X, to
> > perform noticeably better than Windows for PostgreSQL.
>
> Is that really true of BSD UNIXen??? I've certainly heard it's true of
> Linux. But with BSD you have the "kernel funnel" which can severely limit
> multitasking, regardless of whether threads or processes were used.

Yes, very much so. Windows lacks the fork() concept, which is what makes
PostgreSQL much slower there.

//Magnus
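[Editor's sketch of the fork-per-connection model the thread keeps coming
back to. This is an illustrative toy in plain Python, not PostgreSQL
source; the point it shows is that on Unix a new backend starts as a cheap
copy-on-write clone of the already-initialized parent, while Windows has to
create and re-initialize a whole new process.]

```python
import os

def serve_client(client_id):
    # A real backend would read queries from the client's socket here.
    return client_id

def handle_connection(client_id):
    """Postmaster-style parent: fork a child to service one client."""
    pid = os.fork()
    if pid == 0:                       # child: inherits parent state cheaply
        serve_client(client_id)
        os._exit(0)                    # child never returns to the loop
    _, status = os.waitpid(pid, 0)     # parent reaps the finished child
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(handle_connection(1))        # 0: child exited cleanly
```

(`handle_connection` and `serve_client` are hypothetical names for
illustration; the real postmaster loop is in C and far more involved.)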
On Tue, 27 Nov 2007, Scott Ribe wrote:

> IIRC, it was later established that during those tests they had fsync
> enabled on OS X and disabled on Linux.

You recall correctly, but I'm guessing you didn't keep up with the
investigation there; I was tempted to bring this up in that last message but
was already running too long. Presumably you're talking about
http://ridiculousfish.com/blog/?p=17 . The fsync theory was suggested by
them and possibly others after Anandtech's first benchmarking test of this
type. The second test that I linked to rebutted that at
http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=6 . This specific issue
is also addressed by a comment from Johan of Anandtech on the ridiculousfish
site. The short version is that the MySQL they were using had a MyISAM
configuration that doesn't do fsyncs, period, so it's impossible that fsyncs
were to blame. You only get fsync if you're running InnoDB. I think the
reason for this confusion is that at the time of the initial review, working
MySQL fsync under OS X was pretty new (January 2005, I think:
http://dev.mysql.com/doc/refman/4.1/en/news-4-1-9.html ).

Ultimately, the exact cause doesn't change how to clear the air here. As I
suggested, the only way to refute benchmarks showing awful performance is
not to theorize about the cause, but to produce new benchmarks that show the
first ones are no longer accurate. If you can point me to one of those, I'd
love to see it--this is actually one of the items on the relatively short
list of reasons why I'm typing this on a Thinkpad running Linux instead of a
Macbook (the other big one is Apple's string of Eclipse issues, which I
already ranted about recently at
http://slashdot.org/comments.pl?sid=342667&cid=21154137 ).

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
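[Editor's sketch: why an fsync mismatch dominates this kind of benchmark.
A hedged, minimal Python illustration comparing writes that just land in OS
cache against writes forced to stable storage; the absolute numbers depend
entirely on the drive, filesystem, and OS, so none are promised here.]

```python
import os
import tempfile
import time

def timed_writes(n, sync):
    """Time n small appends; with sync=True, fsync after each one."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n):
            os.write(fd, b"commit record\n")
            if sync:
                os.fsync(fd)      # wait for the data to reach stable storage
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    buffered = timed_writes(200, sync=False)
    synced = timed_writes(200, sync=True)
    print(f"buffered: {buffered:.4f}s  fsync'd: {synced:.4f}s")
```

On rotating disks of that era, the fsync'd loop is slower by orders of
magnitude, which is why a test with fsync enabled on one OS and disabled on
the other measures nothing about the operating systems themselves.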
On Tue, 27 Nov 2007, Ron Johnson wrote:

> There was a benchmark in Feb 2007 which demonstrated that FBSD 7.0
> scaled *better* than Linux 2.6 after 4 CPUs.
> http://jeffr-tech.livejournal.com/5705.html
> Turns out that there was/is a bug in glibc's malloc(). Don't know
> if it's been fixed yet.

Last I heard it was actually glibc combined with a kernel problem, and
changes to both would be required to resolve it:
http://kerneltrap.org/mailarchive/linux-kernel/2007/4/3/73000

I'm not aware of any resolution there, but I haven't been following this one
closely.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On 11/27/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Doug McNaught <doug@mcnaught.org> writes:
> > Kind of. Mach is still running underneath (and a lot of the app APIs
> > use it directly) but there is a BSD 'personality' above it which
> > (AIUI) is big parts of FreeBSD ported to run on Mach. So when you use
> > the Unix APIs you're going through that.
>
> The one bit of the OSX userland code that I've really had my nose rubbed
> in is libedit, and they definitely took that from NetBSD not FreeBSD.
> You sure you got your BSDen straight?
>
> Some random poking around at
> http://www.opensource.apple.com/darwinsource/10.5/
> finds a whole lot of different-looking license headers. But it seems
> pretty clear that their userland is BSD-derived, whereas I've always
> heard that their kernel is Mach-based. I've not gone looking at the
> kernel though.

The majority of the BSDness in the kernel is from FreeBSD, but it is very
much a hybrid, Mach being the other parent. Userland is a mixed bag:
FreeBSD, NetBSD, and OpenBSD are all visible in different places. In older
versions I've also seen 4.4BSD credited directly (as in not even caught up
with FreeBSD), but I believe most of that has been updated in newer versions
of the OS. Apple also has employees who are major developers for at least
FreeBSD and NetBSD, though I haven't kept up with who is doing what.

http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/Architecture/chapter_3_section_3.html
> Yes, very much so. Windows lacks the fork() concept, which is what makes
> PostgreSQL much slower there.

So grossly slower process creation would kill postgres connection times. But
what about the cases where persistent connections are used? Is it the case
also that Windows has a performance bottleneck for interprocess
communication?

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
> > Yes, very much so. Windows lacks the fork() concept, which is what makes
> > PostgreSQL much slower there.
>
> So grossly slower process creation would kill postgres connection times.
> But what about the cases where persistent connections are used? Is it the
> case also that Windows has a performance bottleneck for interprocess
> communication?

There is at least one other bottleneck, probably more than one. Context
switching between processes is a lot more expensive than on Unix (given that
win32 is optimized toward context switching between threads). NTFS isn't
optimized for having 100+ processes reading and writing to the same file.
Probably others...

//Magnus
On 11/28/07 11:13, Magnus Hagander wrote:
> On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
>>> Yes, very much so. Windows lacks the fork() concept, which is what makes
>>> PostgreSQL much slower there.
>> So grossly slower process creation would kill postgres connection times.
>> But what about the cases where persistent connections are used? Is it
>> the case also that Windows has a performance bottleneck for interprocess
>> communication?
>
> There is at least one other bottleneck, probably more than one. Context
> switching between processes is a lot more expensive than on Unix (given
> that win32 is optimized towards context switching between threads). NTFS

Isn't that why Apache2 has separate "thread mode" and 1.x-style pre-forked
mode?

> isn't optimized for having 100+ processes reading and writing to the
> same file. Probably others..

--
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
Ron Johnson wrote:
> On 11/28/07 11:13, Magnus Hagander wrote:
>> There is at least one other bottleneck, probably more than one. Context
>> switching between processes is a lot more expensive than on Unix (given
>> that win32 is optimized towards context switching between threads). NTFS
>
> Isn't that why Apache2 has separate "thread mode" and 1.x-style
> pre-forked mode?

I think it was a contributing reason for getting it in the first place, but
it's certainly not the only reason...

//Magnus
On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
> There is at least one other bottleneck, probably more than one. Context
> switching between processes is a lot more expensive than on Unix (given
> that win32 is optimized towards context switching between threads). NTFS
> isn't optimized for having 100+ processes reading and writing to the
> same file. Probably others..

I'd be interested to know what this info is based on. The only fundamental
difference between a process and a thread context switch is VM mapping
(extra TLB flush, possible pagetable mapping tweaks). And why would NTFS
care about anything other than handles?

I mean, I can understand NT having bottlenecks in various areas compared to
Unix, but this "threads are specially optimized" thing is seeming a bit
overblown. Just how often do you see threads from a single process get
contiguous access to the CPU?
Trevor Talbot wrote:
> On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
>> There is at least one other bottleneck, probably more than one. Context
>> switching between processes is a lot more expensive than on Unix (given
>> that win32 is optimized towards context switching between threads). NTFS
>> isn't optimized for having 100+ processes reading and writing to the
>> same file. Probably others..
>
> I'd be interested to know what this info is based on. The only
> fundamental difference between a process and a thread context switch
> is VM mapping (extra TLB flush, possible pagetable mapping tweaks).

Generally, lots of references I've seen around the net and elsewhere. If I'm
not mistaken, the use of threads over processes was listed as one of the
main reasons why SQL Server got such good performance on Windows compared to
its competitors. But I don't have my Inside SQL Server around to check for
an actual reference.

> And why would NTFS care about anything other than handles?

Not sure; again, it's just something I've picked up from what others have
been saying. I should perhaps have been clearer that I don't have any direct
proof of that one.

> I mean, I can understand NT having bottlenecks in various areas
> compared to Unix, but this "threads are specially optimized" thing is
> seeming a bit overblown. Just how often do you see threads from a
> single process get contiguous access to the CPU?

On a CPU-loaded SQL server, fairly often I'd say. But certainly not always.

//Magnus
On Wed, 28 Nov 2007 09:53:34 -0800
"Trevor Talbot" <quension@gmail.com> wrote:

> I mean, I can understand NT having bottlenecks in various areas
> compared to Unix, but this "threads are specially optimized" thing is
> seeming a bit overblown. Just how often do you see threads from a
> single process get contiguous access to the CPU?

I thought it was more about the cost to fork() a process in win32?

Sincerely,

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
> Trevor Talbot wrote:
> > I'd be interested to know what this info is based on. The only
> > fundamental difference between a process and a thread context switch
> > is VM mapping (extra TLB flush, possible pagetable mapping tweaks).
>
> Generally, lots of references I've seen around the net and elsewhere. If
> I'm not mistaken, the use of threads over processes was listed as one of
> the main reasons why SQL Server got such good performance on Windows
> compared to its competitors. But I don't have my Inside SQL Server
> around to check for an actual reference.

Well, yes, in general using multiple threads instead of multiple processes
is going to be a gain on any common OS for several reasons, but context
switching is a very minor part of that. Threads let you share state much
more efficiently than processes do, and in complex servers of this type
there tends to be a lot to share.

SQL Server is somewhat unique in that it doesn't simply throw threads at the
problem; it has a small pool and uses its own internal task scheduler for
the actual SQL work. There's no OS thread per user or anything. Think
continuations or pure userspace threading. That design also lets it reduce
context switches in general.

> > I mean, I can understand NT having bottlenecks in various areas
> > compared to Unix, but this "threads are specially optimized" thing is
> > seeming a bit overblown. Just how often do you see threads from a
> > single process get contiguous access to the CPU?
>
> On a CPU loaded SQL server, fairly often I'd say. But certainly not
> always.

I meant as a design point for a general-purpose OS. If you consider how
Windows does GUIs, ignoring the expense of process context switching would
be fatal, since it forces so much app involvement in window painting. Having
a system dedicated to a single process with multiple threads running
full-bore is not particularly common in this sense.
On 11/28/07, Joshua D. Drake <jd@commandprompt.com> wrote:
> On Wed, 28 Nov 2007 09:53:34 -0800
> "Trevor Talbot" <quension@gmail.com> wrote:
>
> > I mean, I can understand NT having bottlenecks in various areas
> > compared to Unix, but this "threads are specially optimized" thing is
> > seeming a bit overblown. Just how often do you see threads from a
> > single process get contiguous access to the CPU?
>
> I thought it was more about the cost to fork() a process in win32?

Creating a process is indeed expensive on Windows, but a followup question
was about the performance when using persistent connections, and therefore
not creating processes. That's where the conversation got more
interesting :)
On Wed, Nov 28, 2007 at 10:33:08AM -0800, Trevor Talbot wrote:
> Well, yes, in general using multiple threads instead of multiple
> processes is going to be a gain on any common OS for several reasons,
> but context switching is a very minor part of that. Threads let you
> share state much more efficiently than processes do, and in complex
> servers of this type there tends to be a lot to be shared.
>
> SQL Server is somewhat unique in that it doesn't simply throw threads
> at the problem; it has a small pool and uses its own internal task
> scheduler for actual SQL work. There's no OS thread per user or
> anything. Think continuations or pure userspace threading. That design
> also lets it reduce context switches in general.

There are actually two different ways to run SQL Server. Either it runs with
operating system threadpools (the same way that we deal with backend exits
in 8.3), which is IIRC the default. Or it runs with fibers, which are also
an OS feature, but they're scheduled by the application.

> I meant as a design point for a general-purpose OS. If you consider
> how Windows does GUIs, ignoring the expense of process context
> switching would be fatal, since it forces so much app involvement in
> window painting. Having a system dedicated to a single process with
> multiple threads running full-bore is not particularly common in this
> sense.

Ok, then I understand what you're saying :-)

//Magnus
Regarding the various kernel bottlenecks, have there been any tests with
Google's malloc (libtcmalloc)?

Perhaps PostgreSQL isn't heavily threaded enough to make a difference, but
on one of our heavily threaded applications (unrelated to Postgres), it made
a night-and-day difference. Instead of memory and CPU usage growing and
growing, both stabilized quickly at less than half of what the Linux malloc
produced. The Linux (2.6) RH malloc stinks in heavily threaded applications.

The benchmark that got us looking at this was a MySQL benchmark showing
performance scaling by number of threads on various Linux operating systems.
The difference in our application from simply relinking at run time
(LD_PRELOAD) with libtcmalloc was astounding.

Wes
On Thu, 29 Nov 2007, Wes wrote:

> Perhaps PostgreSQL isn't heavily threaded enough to make a difference

PostgreSQL doesn't use threads at all; it forks processes. See 1.14 in
http://www.postgresql.org/docs/faqs.FAQ_DEV.html

> The benchmark that got us looking at this was a MySQL benchmark showing
> performance scaling by number of threads on various linux operating
> systems.

Presumably you mean this one: http://ozlabs.org/~anton/linux/sysbench/

The threading/malloc issues in MySQL are so awful that similar approaches
have already been suggested for other operating systems. Check out
http://developers.sun.com/solaris/articles/mysql_perf_tune.html for comments
about this under Solaris, for example. The fact that PostgreSQL scalability
doesn't fall off like this suggests it doesn't have this particular issue.
Note that the curve in that sysbench run is awfully similar to the MySQL
results at http://tweakers.net/reviews/649/7 (just shifted to the right
because there are many more cores in that system). Then look at their
PostgreSQL results running the same test. Forgive the error where they state
"PostgreSQL might be called a textbook example of a good implementation of
multithreading"; it's actually a good multi-process implementation.
Interestingly, those results are from a Solaris system.

It's good to know about the Google perftools allocator, as there are plenty
of client applications that could benefit, as yours has, from this technique
(like the multi-threaded C++ apps it appears aimed at). I just wouldn't
expect it to be a big win for the PostgreSQL server itself.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, Nov 29, 2007 at 11:04:38PM -0600, Wes wrote:
> Regarding the various kernel bottlenecks, have there been any tests with
> Google's malloc (libtcmalloc)?

PostgreSQL has its own allocator on top of malloc already. tcmalloc is
optimised for many small allocations, whereas postgres only requests memory
from the OS in large blocks. I doubt tcmalloc would make a useful difference
here.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent
> revolution inevitable.
>   -- John F Kennedy
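[Editor's sketch of the region-style allocation Martijn describes. This is
a toy illustration in Python, not PostgreSQL's actual palloc/memory-context
code; the class and constant names are invented. The idea it shows: request
large blocks from the underlying allocator, carve many small requests out of
them, and release everything in one step, so the behavior of malloc under
many tiny allocations matters far less.]

```python
BLOCK_SIZE = 8192          # hypothetical block size requested from malloc

class Region:
    """Toy region/arena allocator: few big blocks, many small carve-outs."""

    def __init__(self):
        self.blocks = [bytearray(BLOCK_SIZE)]
        self.used = 0                  # bytes carved out of the last block

    def alloc(self, size):
        """Reserve `size` bytes; return (block_index, offset)."""
        if size > BLOCK_SIZE:
            raise ValueError("oversized request")
        if self.used + size > BLOCK_SIZE:
            self.blocks.append(bytearray(BLOCK_SIZE))  # one big "malloc"
            self.used = 0
        offset = self.used
        self.used += size
        return (len(self.blocks) - 1, offset)

    def reset(self):
        """Free every small allocation in a single step."""
        self.blocks = [bytearray(BLOCK_SIZE)]
        self.used = 0

if __name__ == "__main__":
    r = Region()
    for _ in range(1000):
        r.alloc(100)                   # 1000 small allocations...
    print(len(r.blocks))               # ...but only a handful of big blocks
```

A thousand 100-byte requests end up as just a few thousand-fold-larger
block requests to the system allocator, which is why per-small-allocation
optimizations like tcmalloc's have little to bite on.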
>> Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows
>> (2000, XP, Vista etc.) on the same hardware? And Linux to
>> (Free-/Net-/whatever) BSD?
>
> Apple hardware gets so expensive for some types of database
> configurations that such a comparison doesn't even make a lot of
> sense.

So far my experience with the effective price/performance ratio of Apple vs.
other hardware for my applications has been pretty good. E.g. it was
impossible for me to find a similarly priced (Linux-/*BSD/Intel/AMD-)
equivalent to my PowerMac G5 over here at the time when I bought it. Not to
mention the required learning effort for Linux/*BSD compared to MacOS X, if
I count it in (days x day rate)...

> For example, if you have an application that needs high
> database write throughput, to make that work well with PostgreSQL you
> must have a controller with a battery backed cache.

Hmm, what would be the difference compared to plenty of RAM and a UPS (plus
a stand-by backup server)? Looks just like moving the "single point of
failure" to a different hardware item, no...?

> If I have a PC,
> the entry-level solution in that category can be a random sub-$1000
> system that runs Linux

Can't find one over here for that price that does all the other things that
need to be done in a typical small office (fileserver, printserver,
mailserver, calendar server, ...) as well as my old G5 PowerMac. To turn
this one into a part-time DB server, I'd just plug in an eSATA or SAS array
(with PCIe adapter) and maybe another few GB of RAM (currently 4). Plus a
backup tape drive.

My world is environments with at most 10 concurrent database clients at any
given moment. But those won't want to wait, because they need to get actual
work done.

> plus around $400 for a RAID card with BBC, and
> you've got multiple vendors to consider there (3Ware, Areca, LSI
> Logic, etc.)

LSI drivers are not available for MacOS X on PowerMacs? Ouch.
> Also, in previous generations, the Mach kernel core of Mac OS had
> some serious performance issues for database use even in read-heavy
> workloads: http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=5

"With the MySQL performance woes now clearly caused by OS X"

Erm, there's a systematic error here: it could also be that the MySQL
implementation/configuration for the two different OSes was the source of
the performance difference. I wouldn't use MySQL anyway, and I'm mostly
interested in transaction performance (client waiting time until commit).

>> I'm just wondering whether the performance gain is worth the
>> learning effort required for Linux or BSD compared to the Mac.
>
> On both Windows (where you get limitations like not being able to set
> a large value for shared_buffers)

My consistent experience with Windows over the last >15 years has been that
it just won't multitask anymore as soon as one process does significant I/O.
No matter what hardware you put underneath.

> and Mac OS X, PostgreSQL has enough
> performance issues that I feel using those platforms can only be
> justified if platform compatibility is more important than
> performance to you.

The point is that the cost of installation, configuration and administration
must be taken into account. A dedicated individual just for that is simply
out of the question in the world where I live. So someone who's already
available has to do all that in a (as tiny as possible) fraction of his/her
worktime. With MacOS X it's feasible, but Linux/*BSD? I'm not so sure.

Sincerely,

Wolfgang Keller
On 11/30/07, Wolfgang Keller <wolfgang.keller.privat@gmx.de> wrote:
> > For example, if you have an application that needs high
> > database write throughput, to make that work well with PostgreSQL you
> > must have a controller with a battery backed cache.
>
> Hmm, what would be the difference compared to plenty of RAM and a UPS
> (plus stand-by backup server)? Looks just like moving the "single point
> of failure" to a different hardware item, no...?

Well, you want a backup server anyway, for completely different reasons.
It's not relevant to write throughput.

The difference between using a disk controller with a BBC compared to just
turning fsync off and using RAM is that you've introduced an additional
point of failure: the OS itself. You have to trust that the OS is always
going to be able to write the cached data to disk. That tends to be riskier
than relying on a piece of hardware dedicated to the job, simply because an
OS does more, and therefore has more to go wrong (kernel panic / grey
screen / BSOD). You could make similar arguments about the additional
hardware components in the chain, like the internal power supply.

The point is that the database expects that when it asked for data to hit
disk, it actually got there. A BBC allows a disk controller to lie
(reliably), but turning fsync off allows pretty much everything from the OS
down to lie (somewhat less reliably). The controller always exists, so it's
not moving a point of failure; if the controller goes, you've lost the disk
anyway. The tradeoff is how much trust you're willing to put into various
parts of the system being uninterrupted.
At 09:09 PM 11/30/2007, Trevor Talbot wrote:

> The controller always exists, so it's not moving a point of failure;
> if a controller goes you've lost the disk anyway.

Anecdotal: I have found "smart" RAID controllers to fail more often than
dumb SCSI controllers (or even SATA/PATA controllers), and some seem more
failure-prone than semi-decent operating systems.

I'm not recommending people turn fsync off, but the OS "always" exists too;
if it is that flaky, you might lose data anyway, so pick a better OS. What's
more likely in most places is somebody powering down the server abruptly,
and then fsync=off could hurt :).

Regards,
Link.
On 30.11.2007, at 04:48, Wolfgang Keller wrote:

> LSI drivers are not available for MacOS X on PowerMacs? Ouch.

The problem is that they suck, as they can't do channel bundling for higher
throughput to a single disk array.

[not your comment, but referred to there]

>> and Mac OS X, PostgreSQL has enough
>> performance issues that I feel using those platforms can only be
>> justified if platform compatibility is more important than
>> performance to you.

Actually, in our tests with a load similar to pgbench (e.g. typical web
applications), Mac OS X 10.4.7 performed better than Yellow Dog Linux on the
same G5 hardware as soon as more than about 90 concurrent clients were
simulated.

But okay, don't trust statistics you didn't make up yourself ...

cug
On Fri, 30 Nov 2007, Guido Neitzer wrote:

> Actually, in our tests with a load similar to pgbench (e.g. typical web
> applications), Mac OS X 10.4.7 performed better than Yellow Dog Linux
> (I was testing with G5 hardware) on the same hardware as soon as more
> than about 90 concurrent clients were simulated.

At this point, that's just an interesting historical note. Yellow Dog is not
a particularly good Linux compared with the ones that have gotten years'
worth of performance tuning for Intel/AMD processors. And you really can't
extrapolate anything useful today from how it ran on a G5--that's two layers
of obsolete. The comparisons that matter now are Intel+Mac OS vs. Intel+a
popular Linux aimed at servers.

As an unrelated note, I'm curious what you did with pgbench that you
consider a reasonable simulation of a web application. The default pgbench
transaction is very write-heavy, and the read-only option available is way
too simple to be realistic. You'd need to pass in custom scripts to execute
to get something that acted like a web app. pgbench is an unruly tool, and
there are many ways to run it that give results that aren't so useful.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Fri, 30 Nov 2007, Lincoln Yeoh wrote:

> Anecdotal - I have found "smart" raid controllers to fail more often
> than dumb scsi controllers (or even SATA/PATA controllers), and some
> seem more failure prone than semi-decent operating systems.

You'd need to name some names here for this to mean too much. There are
plenty of positively miserable RAID controllers out there. I wouldn't trust
the cards from Adaptec, Promise, and Highpoint to correctly store a database
about what's in my pockets.

> What's more likely in most places is somebody powering down the server
> abruptly, and then fsync=off could hurt :).

Here you're hitting on the real point. If a proposed solution adds the
potential for database corruption when someone trips over the server cord,
it's not really a solution at all.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
At 6:15 PM -0500 11/30/07, Greg Smith wrote:

> As an unrelated note, I'm curious what you did with pgbench that you
> consider a reasonable simulation of a web application. The default
> pgbench transaction is very write-heavy, and the read-only option
> available is way too simple to be realistic. You'd need to pass in
> custom scripts to execute to get something that acted like a web app.
> pgbench is an unruly tool, and there are many ways to run it that give
> results that aren't so useful.

If this is any help to anyone, I'm running PostgreSQL on an Intel Xserve
under Mac OS X. Performance is more than fine for my usage. If anyone would
like me to run some benchmark code to test comparisons, I'd be happy to do
so.

-Owen
On Fri, 30 Nov 2007, Wolfgang Keller wrote:

> it was impossible for me to find a similarly priced
> (Linux-/*BSD/Intel/AMD-)equivalent to my PowerMac G5 over here at the
> time when I bought it.

The problem from my perspective is the common complaint that Apple
doesn't ship an inexpensive desktop product suitable for light-duty
server work. Their cheapest system you can add a PCI-X card to is
$2200 USD (I just priced a system out and realized I can downgrade the
processors from the default), and that has only 4 SATA drive bays,
which doesn't make it much of a serious database server platform. A
similarly configured system from Dell runs around $1900, which gives
the usual (and completely reasonable) Apple tax of around $300.

However, I can just as easily pop over to Dell, buy a $500 system, drop
in a SATA RAID controller with a battery-backed cache for another $400,
and I've got a perfectly reasonable little server--one that on
write-heavy loads will outperform at least double its price in Apple
hardware, simply because that's how much it costs to get the cheapest
system from them that you can put a caching controller in. (Don't
anyone take that as a recommendation for Dell hardware, which I hate,
but simply as a reference point; the only thing I like about them is
that the system-building interface on their web site makes it easy to
do comparisons like this.)

>> For example, if you have an application that needs high database
>> write throughput, to make that work well with PostgreSQL you must
>> have a controller with a battery-backed cache.
>
> Hmm, what would be the difference compared to plenty of RAM and a UPS
> (plus a stand-by backup server)? Looks just like moving the "single
> point of failure" to a different hardware item, no...?

When you write a WAL record to commit a transaction, if you can cache
that write it doesn't slow any client down.
If you can't, the database waits for a physical write to the disk,
which can only happen at a rate limited by your disk's rotation speed.
For a standard 7200 RPM drive, that tops out at a bit less than 120
writes/second for any single client, and somewhere around 500 total for
larger numbers of simultaneous clients.

The only robust way to cache a write involves a battery-backed
controller. Relying on RAM or the write cache in the drives, even if
you have the world's greatest UPS, means that the first person who
accidentally unplugs your system (or the first power supply failure)
could corrupt your database. That's really not acceptable for anyone.
But since the integrity policy of the good caching controllers is far
better than that, you can leave that cache on safely and only expect
corruption if there's a multi-day power outage.

It's still more rambling than I'd like, but I have the pieces of a full
discussion of this topic at
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm

> LSI drivers are not available for MacOS X on PowerMacs? Ouch.

There might be something out there, but I'm not aware of anything from
them or other vendors targeted at the current Intel Macs that looks
robust; there's just Apple's offering.

> Erm, systematic error here: It could also be that the MySQL
> implementation/configuration for the two different OSes was the source
> of the performance difference.

That's possible, but other than the specific fsync write fixes they
applied for OS X, I'm not aware of anything specific to Mac OS that
would cause this. When the low-level benchmarks show awful performance
doing things like creating processes, and performance dives under a
heavy load, it seems sensible to assume the two are linked until proven
otherwise.
(Appropriate disclaimer:
http://en.wikipedia.org/wiki/Correlation_does_not_imply_causation )

It's also true that some of the MySQL threading limitations brought up
in a tangent to this discussion could be contributing as well, in which
case a PostgreSQL test might not show as large a gap. Again,
criticizing the benchmark methods doesn't accomplish anything; you need
an advocate for the platform to perform benchmarks showing otherwise
before the current results are disproven.

> The point is that cost for "installation", "configuration" and
> "administration" must be taken into account.

The question you asked about was how Apple hardware+Mac OS
X+PostgreSQL stacks up on a performance basis against more common
platforms like PC hardware+Linux. All the answers I've seen suggest
not very well, and none of these other things are relevant when
evaluating the platform from a performance perspective. TCO issues are
all relative to the administrator and tasks anyway: an experienced
Linux system administrator may be a little slower on some things than
one running Apple's GUI tools, but once you get to more scriptable
changes they could be far more efficient.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
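The commit-rate ceiling Greg cites earlier in the thread is simple arithmetic worth spelling out (a back-of-the-envelope sketch, assuming each synchronous commit waits for at worst one full platter rotation before its WAL flush completes):

```shell
# With no write cache, a committing client waits on a physical WAL
# flush, and a 7200 RPM platter completes 7200/60 = 120 rotations per
# second -- hence the "a bit less than 120 writes/second" figure.
rpm=7200
per_client_ceiling=$((rpm / 60))
echo "single-client ceiling: ~${per_client_ceiling} commits/sec"
```

A battery-backed cache removes that wait entirely, since the controller can acknowledge the flush from its protected RAM; that's why the per-client throughput gap between cached and uncached setups is so large on write-heavy loads.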