Thread: Linux v.s. Mac OS-X Performance

Linux v.s. Mac OS-X Performance

From
Mark Niedzielski
Date:
Our developers run on MacBook Pros w/ 2G memory and our production
hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5.  The
Macs perform common and complex Postgres operations in about half the
time of our unloaded production hardware.  We've compared configurations
and the production hardware is running a much bigger configuration and
faster disk.

What are we missing?  Is there a trick to making AMDs perform?  Does
Linux suck compared to BSD?


Thanks.


Re: Linux v.s. Mac OS-X Performance

From
"Scott Marlowe"
Date:
On Nov 9, 2007 10:55 PM, Mark Niedzielski <min@epictechnologies.com> wrote:
>
> Our developers run on MacBook Pros w/ 2G memory and our production
> hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5.  The
> Macs perform common and complex Postgres operations in about half the
> time of our unloaded production hardware.  We've compared configurations
> and the production hardware is running a much bigger configuration and
> faster disk.
>
> What are we missing?  Is there a trick to making AMDs perform?  Does
> Linux suck compared to BSD?

It's quite possible that either you've got some issue with poor
hardware / OS integration (think RAID controllers that have bad
drivers, etc) or that you've de-tuned postgresql on your CentOS
machines when you thought you were tuning it.  A common mistake is to
set work_mem or shared_buffers so high that queries run slower than
they would with smaller values.

Also, if your data sets in production are hundreds of millions of
rows, and the test set on your laptop is 100,000 rows, then of course
the laptop is going to be faster: it has less data to wade through.

So, the key question is what, exactly, is different between your dev
laptops and your production machines.
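
For the configuration side of that question, one option is to diff the two
postgresql.conf files. The following is a minimal sketch, assuming simple
`name = value` lines; `parse_conf` and `diff_settings` are hypothetical
helper names (not part of PostgreSQL), and the sample values are made up
purely for illustration.

```python
def parse_conf(text):
    """Parse simple `name = value` lines, ignoring comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue   # skip lines this simple parser does not understand
        settings[name.strip().lower()] = value.strip().strip("'")
    return settings


def diff_settings(dev, prod):
    """Return {name: (dev_value, prod_value)} for every setting that differs."""
    names = sorted(set(dev) | set(prod))
    return {
        n: (dev.get(n), prod.get(n))
        for n in names
        if dev.get(n) != prod.get(n)
    }


# Made-up example values, for illustration only.
DEV_CONF = """
shared_buffers = 128MB   # laptop
work_mem = 4MB
"""
PROD_CONF = """
shared_buffers = 2GB
work_mem = 4MB
effective_cache_size = 6GB
"""

if __name__ == "__main__":
    differences = diff_settings(parse_conf(DEV_CONF), parse_conf(PROD_CONF))
    for name, (dev, prod) in differences.items():
        print(f"{name}: dev={dev} prod={prod}")
```

A setting missing from one file shows up as `None` on that side, which
usually means the server is running with its built-in default there.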

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Fri, 9 Nov 2007, Mark Niedzielski wrote:

> The Macs perform common and complex Postgres operations in about half
> the time of our unloaded production hardware.

Are they write intensive?  If so, it may be possible that the Macs are
buffering disk writes while the production server isn't.  It's often the
case that desktop systems will cheat at writes while servers don't.
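
A rough way to check for this is to time how many synchronous writes per
second a machine completes. The sketch below is an assumption-laden
illustration, not a proper benchmark; `fsync_rate` is a hypothetical helper
name. As a sanity bound, a single 7200 rpm disk turns 120 times per second,
so a result far above that on such a disk suggests a write cache somewhere
in the path.

```python
import os
import tempfile
import time


def fsync_rate(directory, count=200, block=b"x" * 8192):
    """Write `count` blocks, fsync after each one, and return fsyncs per second."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)          # ask the OS to push the data to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Run this in a directory on the same disk as the database.
    print(f"{fsync_rate('.'):.0f} fsyncs/second")
```

Running it on both the laptop and the server, on the volume that holds the
database, gives two numbers that can be compared directly.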

> Is there a trick to making AMDs perform?

One problem you can run into is that the default configuration on some
Linux+AMD systems will include aggressive power management that throttles
the CPU clock down.  Take a look at /proc/cpuinfo on your server and see
what the "cpu MHz" reads; if it's 1000.00 or otherwise doesn't match what
you expect, you may need to turn off or otherwise tune power management to
keep the system running at full speed.  My home AMD dual-core system was
positively sluggish until I fixed that.
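
The check described above can be sketched in a few lines. This assumes the
usual Linux x86 layout of /proc/cpuinfo with one "cpu MHz" line per logical
CPU; `cpu_mhz` and `throttled` are hypothetical helper names, and the 5%
tolerance is an arbitrary choice.

```python
import os


def cpu_mhz(cpuinfo_text):
    """Return the "cpu MHz" value reported for each logical CPU."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    return speeds


def throttled(speeds, expected_mhz, tolerance=0.05):
    """Return the speeds that fall more than `tolerance` below the expected clock."""
    return [s for s in speeds if s < expected_mhz * (1 - tolerance)]


if __name__ == "__main__":
    if os.path.exists("/proc/cpuinfo"):       # Linux only
        with open("/proc/cpuinfo") as f:
            print(cpu_mhz(f.read()))
```

Pass the CPU's rated clock as `expected_mhz`; any value that `throttled`
returns is a core currently running below it.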

> Does Linux suck compared to BSD?

Not the Mac OS BSD.  Last time I looked into this, OS X was still
dramatically slower than Linux on things like process creation.
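
Process creation matters here because PostgreSQL starts a separate backend
process for each connection. A crude way to compare two systems is to time a
fork/exit/wait loop, as in the sketch below. `fork_rate` is a hypothetical
helper name, and the number it reports includes interpreter overhead and the
cost of reaping the child, so treat it as a relative figure only.

```python
import os
import time


def fork_rate(count=200):
    """Fork `count` children that exit immediately; return forks per second."""
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(0)            # child: exit without running cleanup handlers
        os.waitpid(pid, 0)         # parent: reap the child before continuing
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(f"{fork_rate():.0f} forks/second")
```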

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Steve Wampler
Date:
On Fri, 9 Nov 2007, Mark Niedzielski wrote:
> The Macs perform common and complex Postgres operations in about half
> the time of our unloaded production hardware.

Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem
to support the 8GB) or a 64-bit?  And which was PostgreSQL compiled for?
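
As a quick illustration of how to answer the first half of that question
from a script, the sketch below reports the architecture string the kernel
exposes and the pointer width of the running interpreter. `pointer_bits` and
`describe` are hypothetical helper names. For PostgreSQL itself, the output
of `SELECT version()` includes the platform the server binary was built for.

```python
import platform
import struct


def pointer_bits():
    """Pointer width, in bits, of the running Python interpreter."""
    return struct.calcsize("P") * 8


def describe():
    # platform.machine() returns the machine type reported by the kernel,
    # for example "x86_64" on a 64-bit kernel or "i686" on a 32-bit one.
    return {"machine": platform.machine(), "python_pointer_bits": pointer_bits()}


if __name__ == "__main__":
    print(describe())
```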

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Linux v.s. Mac OS-X Performance

From
"Joshua D. Drake"
Date:
On Fri, 09 Nov 2007 23:55:59 -0500
Mark Niedzielski <min@epictechnologies.com> wrote:

> 
> Our developers run on MacBook Pros w/ 2G memory and our production
> hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5.
> The Macs perform common and complex Postgres operations in about half
> the time of our unloaded production hardware.  We've compared
> configurations and the production hardware is running a much bigger
> configuration and faster disk.
> 
> What are we missing?

Likely a lot. Are you performing any maintenance? What are your
postgresql.conf settings? Are you running 64bit on the Linux machine?


>  Is there a trick to making AMDs perform?  Does
> Linux suck compared to BSD?

No. 


Sincerely,

Joshua D. Drake


-- 

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Linux v.s. Mac OS-X Performance

From
Sam Mason
Date:
On Mon, Nov 12, 2007 at 10:14:46AM -0700, Steve Wampler wrote:
> Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem
> to support the 8GB) or a 64-bit?  And which was PostgreSQL compiled for?

You don't need a 32bit kernel to support 8GB of memory should you? As
long as the kernel supports PAE that should be enough to make use of it.
You only need a 64bit address space when each process wants to see more
than ~3GB of RAM.


  Sam

Re: Linux v.s. Mac OS-X Performance

From
"Scott Marlowe"
Date:
On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote:
> On Mon, Nov 12, 2007 at 10:14:46AM -0700, Steve Wampler wrote:
> > Also, what kernel are you using with CentOS 5 - a 32-bit (with hugemem
> > to support the 8GB) or a 64-bit?  And which was PostgreSQL compiled for?
>
> You don't need a 32bit kernel to support 8GB of memory should you? As
> long as the kernel supports PAE that should be enough to make use of it.
> You only need a 64bit address space when each process wants to see more
> than ~3GB of RAM.

There's a performance hit for using PAE.  Not sure what it is, but I
recall it being in the 5 to 10% range.

Re: Linux v.s. Mac OS-X Performance

From
Sam Mason
Date:
On Mon, Nov 12, 2007 at 11:31:59AM -0600, Scott Marlowe wrote:
> On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote:
> > You don't need a 32bit kernel to support 8GB of memory should you? As
> > long as the kernel supports PAE that should be enough to make use of it.
> > You only need a 64bit address space when each process wants to see more
> > than ~3GB of RAM.
>
> There's a performance hit for using PAE.  Not sure what it is, but I
> recall it being in the 5 to 10% range.

And what's the performance hit of using native 64bit code?  I'd guess
similar, moving twice as much data around with each pointer has got to
affect things.
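
A back-of-the-envelope sketch of that concern: only pointers (and other
pointer-sized values) double in width, so the extra memory traffic depends on
how pointer-heavy the data is. The pointer count below is hypothetical and
chosen only for illustration; `pointer_bytes` is a made-up helper name.

```python
def pointer_bytes(n_pointers, pointer_bits):
    """Bytes needed to store `n_pointers` pointers of the given width."""
    return n_pointers * pointer_bits // 8


if __name__ == "__main__":
    n = 1_000_000                       # hypothetical pointer count
    small = pointer_bytes(n, 32)
    large = pointer_bytes(n, 64)
    print(small, large, large / small)  # 4000000 8000000 2.0
```

Data that is not pointer-sized, such as 32-bit integers or text, takes the
same space in either mode.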


  Sam

Re: Linux v.s. Mac OS-X Performance

From
Steve Wampler
Date:
Scott Marlowe wrote:
> On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote:
>> You don't need a 32bit kernel to support 8GB of memory should you? As
>> long as the kernel supports PAE that should be enough to make use of it.
>> You only need a 64bit address space when each process wants to see more
>> than ~3GB of RAM.
>
> There's a performance hit for using PAE.  Not sure what it is, but I
> recall it being in the 5 to 10% range.

Also, using PAE *used* to require the (OS-internal) use of 'bounce-buffers'
to copy data from processes high-up in memory down to i/o devices low-down
in memory.  I don't know if that's still an issue or not with 2.6 kernels,
but I could see it still being the case and, if so, seems like it would have
a significant impact on I/O bound tasks (like most DB processing...)


--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Linux v.s. Mac OS-X Performance

From
"Scott Marlowe"
Date:
On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote:
> On Mon, Nov 12, 2007 at 11:31:59AM -0600, Scott Marlowe wrote:
> > On Nov 12, 2007 11:29 AM, Sam Mason <sam@samason.me.uk> wrote:
> > > You don't need a 32bit kernel to support 8GB of memory should you? As
> > > long as the kernel supports PAE that should be enough to make use of it.
> > > You only need a 64bit address space when each process wants to see more
> > > than ~3GB of RAM.
> >
> > There's a performance hit for using PAE.  Not sure what it is, but I
> > recall it being in the 5 to 10% range.
>
> And what's the performance hit of using native 64bit code?  I'd guess
> similar, moving twice as much data around with each pointer has got to
> affect things.

That's not been my experience.  It's not like everything you do
requires 64 bits to be moved where in 32 bit code only 32 were moved.
The performance gain of the 64 bit machine doing 64 bit operations
over the 32 bit machine doing them (e.g. floating point) is so
much more that it more than makes up for the overhead of running in 64
bit mode.

Re: Linux v.s. Mac OS-X Performance

From
Steve Wampler
Date:
Sam Mason wrote:
> And what's the performance hit of using native 64bit code?  I'd guess
> similar, moving twice as much data around with each pointer has got to
> affect things.

That's probably difficult to predict.  Since the architecture is 64-bit,
it shouldn't cost any more to move a 64-bit pointer around than a 32-bit
one.  (Plus, I *think* you get more registers in 64-bit mode.)

However, a good optimizer might figure out it can move two 32-bit pointers
with one 64-bit transfer.

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Linux v.s. Mac OS-X Performance

From
Douglas McNaught
Date:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:

> On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote:

>> And what's the performance hit of using native 64bit code?  I'd guess
>> similar, moving twice as much data around with each pointer has got to
>> affect things.
>
> That's not been my experience.  It's not like everything you do
> requires 64 bits to be moved where in 32 bit code only 32 were moved.
> The performance gain of the 64 bit machine doing 64 bit operations
> over the 32 bit machine doing them (i.e. floating point etc...) is so
> much more that it more than makes up for the overhead of running in 64
> bit mode.

Plus, 64-bit mode gives you twice as many CPU registers, which is a
huge win for some algorithms, though in many cases it doesn't make
much of a difference.

-Doug

Re: Linux v.s. Mac OS-X Performance

From
Sam Mason
Date:
On Mon, Nov 12, 2007 at 11:46:12AM -0600, Scott Marlowe wrote:
> On Nov 12, 2007 11:37 AM, Sam Mason <sam@samason.me.uk> wrote:
> > And what's the performance hit of using native 64bit code?  I'd guess
> > similar, moving twice as much data around with each pointer has got to
> > affect things.
>
> That's not been my experience.  It's not like everything you do
> requires 64 bits to be moved where in 32 bit code only 32 were moved.
> The performance gain of the 64 bit machine doing 64 bit operations
> over the 32 bit machine doing them (i.e. floating point etc...) is so
> much more that it more than makes up for the overhead of running in 64
> bit mode.

OK, I'm willing to believe you.  It used to be a big misunderstanding
that moving to 64 bits automatically sped things up; things like this
change, though.


  Sam

Re: Linux v.s. Mac OS-X Performance

From
"Joshua D. Drake"
Date:
On Mon, 12 Nov 2007 10:47:29 -0700
Steve Wampler <swampler@noao.edu> wrote:

> Sam Mason wrote:
> > And what's the performance hit of using native 64bit code?  I'd
> > guess similar, moving twice as much data around with each pointer
> > has got to affect things.
> 
> That's probably difficult to predict.  Since the architecture is
> 64-bit, it shouldn't cost any more to move a 64-bit pointer around
> than a 32-bit one.  (Plus, I *think* you get more registers in 64-bit
> mode.)

It's all about the registers man... all extra 8 of them. Unless of
course you are running with >8GB of ram, then it is all about the
ability to use more than 2GB of shared memory.

Joshua D. Drake





Re: Linux v.s. Mac OS-X Performance

From
Vivek Khera
Date:
On Nov 12, 2007, at 12:29 PM, Sam Mason wrote:

> You only need a 64bit address space when each process wants to see
> more
> than ~3GB of RAM.

And how exactly do you get that on a 32-bit CPU?  Even with PAE
(shudders from memories of expanded/extended RAM in the DOS days), you
still have a 32-bit address space per-process.
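
The arithmetic behind this, as a small sketch: a 32-bit virtual address
covers 4 GiB per process, while PAE in its original form widens physical
addresses to 36 bits. `addressable_gib` is a hypothetical helper name. The
~3GB figure quoted above matches the common 32-bit Linux layout that
reserves about 1 GiB of each process's address space for the kernel.

```python
GIB = 2 ** 30


def addressable_gib(address_bits):
    """GiB of memory addressable with the given number of address bits."""
    return 2 ** address_bits // GIB


if __name__ == "__main__":
    print(addressable_gib(32))   # 4  (per-process virtual address space)
    print(addressable_gib(36))   # 64 (physical memory reachable with PAE)
```

So PAE lets the machine as a whole use more than 4 GiB, but any single
process still sees at most a 32-bit address space.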


Re: Linux v.s. Mac OS-X Performance

From
Vivek Khera
Date:
On Nov 12, 2007, at 12:01 PM, Greg Smith wrote:

> Not the Mac OS BSD.  Last time I looked into this OS X was still
> dramatically slower than Linux on things like process creation.

On MacOS X, that's the Mach kernel doing process creation, not
anything BSD-ish at all.  The BSD flavor of MacOS X is mostly just the
userland experience.


Re: Linux v.s. Mac OS-X Performance

From
Sam Mason
Date:
On Mon, Nov 12, 2007 at 05:02:52PM -0500, Vivek Khera wrote:
> On Nov 12, 2007, at 12:29 PM, Sam Mason wrote:
> >You only need a 64bit address space when each process wants to see
> >more than ~3GB of RAM.
>
> And how exactly do you get that on a 32-bit CPU?

I didn't mean to suggest you could.  You can actually hack around it by
performing various kernel specific tricks (mmap()ing different parts of
a large file works under some Unixes) but it's a lot of work and tends
to be difficult and brittle.

> Even with PAE
> (shudders from memories of expanded/extended RAM in the DOS days), you
> still have a 32-bit address space per-process.

Yes, if you've got several clients connected they can each have their
3GB address space in RAM and not swapped out, or you can have lots of
disk cache.  Other people can probably comment on what life is actually
like on a box like this; I've not had much experience.


  Sam

Re: Linux v.s. Mac OS-X Performance

From
Craig White
Date:
On Fri, 2007-11-09 at 23:55 -0500, Mark Niedzielski wrote:
> Our developers run on MacBook Pros w/ 2G memory and our production
> hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5.  The
> Macs perform common and complex Postgres operations in about half the
> time of our unloaded production hardware.  We've compared configurations
> and the production hardware is running a much bigger configuration and
> faster disk.
>
> What are we missing?  Is there a trick to making AMDs perform?  Does
> Linux suck compared to BSD?
----
that was an awful lot of discussion without any empirical evidence to
support the original claim.

my understanding was that the lack of threading on OSX made it
especially poor for a DB server (but if I recall correctly, that
information was on MySQL).

Do I smell a plant?

Craig


Re: Linux v.s. Mac OS-X Performance

From
Scott Ribe
Date:
> my understanding was that the lack of threading on OSX made it
> especially poor for a DB server

What you're referring to must be that the kernel was essentially
single-threaded, with a single "kernel-funnel" lock. (Because the OS
certainly supported threads, and it was certainly possible to write
highly-threaded applications, and I don't know of any performance problems
with threaded applications.)

This has been getting progressively better, with each release adding more
in-kernel concurrency. Which means that 10.5 probably obsoletes all prior
postgres benchmarks on OS X.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Linux v.s. Mac OS-X Performance

From
Mark Niedzielski
Date:
Thanks to all for the help - and the sanity check.  The problem was in
the test and not in the configuration.

We were using a particularly difficult query as a reference (and fully
understanding that it is a two-dimensional alternative to a proper
benchmark).  On our test system each run was with empty caches.  The
test on the Mac was with caches loaded.  Once we started running the
tests with loaded caches, the tuning parameters started behaving as
expected.  In the end we took an 880-second query to 3.4 seconds
(compared to 95 seconds on the Mac).
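
A minimal sketch of a timing harness that keeps the first run apart from the
later, cache-warm runs, so the two are never compared with each other by
accident. `time_runs` and `summarize` are hypothetical helper names, and
`run_query` stands for whatever callable executes the reference query. Note
that the first run is only truly "cold" if the caches really were empty
beforehand.

```python
import time


def time_runs(run_query, repeats=5):
    """Call `run_query` repeatedly and return the elapsed seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Separate the first run from the later runs, which benefit from warm caches."""
    first, rest = timings[0], timings[1:]
    # Median of the later runs (upper middle value for an even count).
    warm = sorted(rest)[len(rest) // 2] if rest else None
    return {"first": first, "warm_median": warm}


if __name__ == "__main__":
    # Stand-in workload; replace with a function that runs the real query.
    print(summarize(time_runs(lambda: sum(range(100_000)))))
```

Comparing `first` against `first`, and `warm_median` against `warm_median`,
across machines avoids the mismatch described above.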

The key was the fact that large configuration changes drew no measurable
change in performance.  And that is when you know you are turning the
wrong knobs!


Scott Marlowe wrote:
> On Nov 9, 2007 10:55 PM, Mark Niedzielski <min@epictechnologies.com> wrote:
>
>> Our developers run on MacBook Pros w/ 2G memory and our production
>> hardware is dual dual-Core Opterons w/ 8G memory running CentOS 5.  The
>> Macs perform common and complex Postgres operations in about half the
>> time of our unloaded production hardware.  We've compared configurations
>> and the production hardware is running a much bigger configuration and
>> faster disk.
>>
>> What are we missing?  Is there a trick to making AMDs perform?  Does
>> Linux suck compared to BSD?
>>
>
> It's quite possible that either you've got some issue with poor
> hardware / OS integration (think RAID controllers that have bad
> drivers, etc) or that you've de-tuned postgresql on your CentOS
> machines when you thought you were tuning it.  A common mistake is to
> set work_mem or shared_buffers so high that they are slower than they
> would be if they were smaller.
>
> Also, if your data sets in production are hundreds of millions of
> rows, and the test set on your lap top is 100,000 rows, then of course
> the laptop is going to be faster, it has less data to wade through.
>
> So, the key question is what, exactly, is different between your dev
> laptops and your production machines.
>


Re: Linux v.s. Mac OS-X Performance

From
Wes
Date:
On 11/13/07 10:02 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote:

> What you're referring to must be that the kernel was essentially
> single-threaded, with a single "kernel-funnel" lock. (Because the OS
> certainly supported threads, and it was certainly possible to write
> highly-threaded applications, and I don't know of any performance problems
> with threaded applications.)
>
> This has been getting progressively better, with each release adding more
> in-kernel concurrency. Which means that 10.5 probably obsoletes all prior
> postgres benchmarks on OS X.

While I've never seen this documented anywhere, it empirically looks like
10.5 also (finally) adds CPU affinity to better utilize instruction caching.
On a dual CPU system under 10.4, one CPU bound process would use two CPUs
at 50%. Under 10.5 it uses one CPU at 100%.

I never saw any resolution to this thread - were the original tests on the
Opteron and OS X identical, or were they two different workloads?

Wes



Re: Linux v.s. Mac OS-X Performance

From
Craig White
Date:
On Mon, 2007-11-26 at 17:37 -0600, Wes wrote:
> On 11/13/07 10:02 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote:
>
> > What you're referring to must be that the kernel was essentially
> > single-threaded, with a single "kernel-funnel" lock. (Because the OS
> > certainly supported threads, and it was certainly possible to write
> > highly-threaded applications, and I don't know of any performance problems
> > with threaded applications.)
> >
> > This has been getting progressively better, with each release adding more
> > in-kernel concurrency. Which means that 10.5 probably obsoletes all prior
> > postgres benchmarks on OS X.
>
> While I've never seen this documented anywhere, it empirically looks like
> 10.5 also (finally) adds CPU affinity to better utilize instruction caching.
> On a dual CPU system under 10.4, one CPU bound process would use two CPUs
> at 50%. Under 10.5 it uses one CPU at 100%.
>
> I never saw any resolution to this thread - were the original tests on the
> Opteron and OS X identical, or were they two different workloads?
----
resolution?

http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php

conclusion?

Mac was still pretty slow in comparison

Craig


Re: Linux v.s. Mac OS-X Performance

From
Wolfgang Keller
Date:
Hello,

sorry for "butting in", but I'm just curious...

> resolution?
>
> http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php
>
> conclusion?
>
> Mac was still pretty slow in comparison

Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000,
XP, Vista etc.) on the same hardware?

And Linux to (Free-/Net-/whatever) BSD?

No flamebait, I'm just wondering whether the performance gain is worth
the learning effort required for Linux or BSD compared to the Mac.

Sincerely,

Wolfgang Keller

Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
On Tue, 2007-11-27 at 11:11 +0100, Wolfgang Keller wrote:
> Hello,
>
> sorry for "butting in", but I'm just curious...
>
> > resolution?
> >
> > http://archives.postgresql.org/pgsql-general/2007-11/msg00946.php
> >
> > conclusion?
> >
> > Mac was still pretty slow in comparison
>
> Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000,
> XP, Vista etc.) on the same hardware?

In general, you can expect any Unix based OS, which includes MacOS X, to
perform noticeably better than Windows for PostgreSQL.

//Magnus


Re: Linux v.s. Mac OS-X Performance

From
Scott Ribe
Date:
> In general, you can expect any Unix based OS, which includes MacOS X, to
> perform noticeably better than Windows for PostgreSQL.

Is that really true of BSD UNIXen??? I've certainly heard it's true of
Linux. But with BSD you have the "kernel funnel" which can severely limit
multitasking, regardless of whether threads or processes were used. Apple
has been working toward finer-grained locking precisely because that was a
serious bottleneck which limited OS X server performance.

Or have I misunderstood and this was only the design of one particular
flavor of BSD, not BSDen in general?

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Linux v.s. Mac OS-X Performance

From
"Joshua D. Drake"
Date:
On Tue, 27 Nov 2007 17:01:06 -0700
Scott Ribe <scott_ribe@killerbytes.com> wrote:

> > In general, you can expect any Unix based OS, which includes MacOS
> > X, to perform noticeably better than Windows for PostgreSQL.
> 
> Is that really true of BSD UNIXen??? I've certainly heard it's true of
> Linux. But with BSD you have the "kernel funnel" which can severely
> limit multitasking, regardless of whether threads or processes were
> used. Apple has been working toward finer-grained locking precisely
> because that was a serious bottleneck which limited OS X server
> performance.
> 
> Or have I misunderstood and this was only the design of one particular
> flavor of BSD, not BSDen in general?

Not much of a kernel guy here but my understanding is that MacOSX is
basically NeXT version 10, which means... Mach... which is entirely
different than say FreeBSD at the kernel level.

Joshua D. Drake


Re: Linux v.s. Mac OS-X Performance

From
Ron Johnson
Date:
On 11/27/07 18:01, Scott Ribe wrote:
>> In general, you can expect any Unix based OS, which includes MacOS X, to
>> perform noticeably better than Windows for PostgreSQL.
>
> Is that really true of BSD UNIXen??? I've certainly heard it's true of
> Linux. But with BSD you have the "kernel funnel" which can severely limit
> multitasking, regardless of whether threads or processes were used. Apple
> has been working toward finer-grained locking precisely because that was a
> serious bottleneck which limited OS X server performance.
>
> Or have I misunderstood and this was only the design of one particular
> flavor of BSD, not BSDen in general?

IIRC, FreeBSD got rid of the Giant Lock back in v5.x.

There was a benchmark in Feb 2007 which demonstrated that FBSD 7.0
scaled *better* than Linux 2.6 after 4 CPUs.
http://jeffr-tech.livejournal.com/5705.html

Turns out that there was/is a bug in glibc's malloc().  Don't know
if it's been fixed yet.

--
Ron Johnson, Jr.
Jefferson LA  USA

%SYSTEM-F-FISH, my hovercraft is full of eels

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Tue, 27 Nov 2007, Wolfgang Keller wrote:

> Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows (2000,
> XP, Vista etc.) on the same hardware? And Linux to (Free-/Net-/whatever)
> BSD?

Apple hardware gets so expensive for some types of database configurations
that such a comparison doesn't even make a lot of sense.  For example, if
you have an application that needs high database write throughput, to make
that work well with PostgreSQL you must have a controller with a battery
backed cache.  If I have a PC, the entry-level solution in that category
can be a random sub-$1000 system that runs Linux plus around $400 for a
RAID card with BBC, and you've got multiple vendors to consider there
(3Ware, Areca, LSI Logic, etc.)

To do something similar with Apple hardware, you can get a Mac Pro and add
their RAID card, at $3500 (early reports suggest even that may have
serious problems, see http://forums.macrumors.com/showthread.php?t=384459
). Or you can pick up an XServe RAID, but now you're talking $6350 because
the smallest configuration is 1TB.  The amount of server you can buy for
$3500+ running Linux is going to be much more powerful than its Apple
equivalent.  Sure, you can run a trivial workload that features minimal
writes even on a Mac Mini, but I don't see a lot of value to considering a
platform where the jump to the cheapest serious server configuration is so
big.
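
Putting the quoted prices side by side, as a small worked example. It reads
"sub-$1000" as an upper bound of $1000 and the $3500 as the combined price of
the Mac Pro and its RAID card; both readings are assumptions, and the XServe
RAID figure covers the storage unit only.

```python
# Prices in US dollars, taken from the figures quoted above.
PC_SYSTEM = 1000            # upper bound of "sub-$1000"
RAID_CARD_WITH_BBC = 400
MAC_PRO_WITH_RAID = 3500
XSERVE_RAID = 6350


def price_ratio(price, baseline):
    """How many times more expensive `price` is than `baseline`, to 2 decimals."""
    return round(price / baseline, 2)


PC_TOTAL = PC_SYSTEM + RAID_CARD_WITH_BBC

if __name__ == "__main__":
    print(PC_TOTAL)                                   # 1400
    print(price_ratio(MAC_PRO_WITH_RAID, PC_TOTAL))   # 2.5
    print(price_ratio(XSERVE_RAID, PC_TOTAL))         # 4.54
```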

Also, in previous generations, the Mach kernel core of Mac OS had some
serious performance issues for database use even in read-heavy workloads:
http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=5 There are claims this
is improved in current systems (Leopard + Intel), but the margin was so
big before I would need some pretty hard proof to believe they've even
achieved parity with Linux/FreeBSD on the same hardware, and even then the
performance/dollar is unlikely to be competitive.

> I'm just wondering whether the performance gain is worth the learning
> effort required for Linux or BSD compared to the Mac.

On both Windows (where you get limitations like not being able to set a
large value for shared_buffers) and Mac OS X, PostgreSQL has enough
performance issues that I feel using those platforms can only be justified
if platform compatibility is more important than performance to you.  The
minute performance becomes a serious concern, you'd be much better off
with Linux, one of the BSDs that's not hobbled by using the Mach kernel,
or one of the more serious UNIXes like Solaris.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Gregory Stark
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:

> On Tue, 27 Nov 2007 17:01:06 -0700
> Scott Ribe <scott_ribe@killerbytes.com> wrote:
>
>> > In general, you can expect any Unix based OS, which includes MacOS
>> > X, to perform noticeably better than Windows for PostgreSQL.
>>
>> Is that really true of BSD UNIXen??? I've certainly heard it's true of
>> Linux. But with BSD you have the "kernel funnel" which can severely
>> limit multitasking, regardless of whether threads or processes were
>> used. Apple has been working toward finer-grained locking precisely
>> because that was a serious bottleneck which limited OS X server
>> performance.
>>
>> Or have I misunderstood and this was only the design of one particular
>> flavor of BSD, not BSDen in general?

That was true of the traditional BSD 4.3 and 4.4 design. However when people
refer to "BSD" these days they're referring to one of the major derivatives
which have all undergone extensive further development. FreeBSD has crowed a
lot about their finer-grained kernel locks too for example. Other variants of
BSD tend to focus on other areas (like portability for example) so they may
not be as far ahead but they've still undoubtedly made significant progress
compared to 1993.

> Not much of a kernel guy here but my understanding is that MacOSX is
> basically NeXT version 10, which means... Mach... which is entirely
> different than say FreeBSD at the kernel level.

I think (but I'm not sure) that the kernel in OSX comes from BSD. What they
took from NeXT was the GUI design and object oriented application framework
stuff. Basically all the stuff that Unix programmers still haven't quite
figured out what it's good for.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

Re: Linux v.s. Mac OS-X Performance

From
Doug McNaught
Date:
On Nov 27, 2007, at 8:36 PM, Gregory Stark wrote:

> I think (but I'm not sure) that the kernel in OSX comes from BSD.

Kind of.  Mach is still running underneath (and a lot of the app APIs
use it directly) but there is a BSD 'personality' above it which
(AIUI) is big parts of FreeBSD ported to run on Mach.  So when you use
the Unix APIs you're going through that.

-Doug

Re: Linux v.s. Mac OS-X Performance

From
Ron Johnson
Date:
On 11/27/07 19:36, Gregory Stark wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
[snip]
>
> That was true of the traditional BSD 4.3 and 4.4 design. However when people
> refer to "BSD" these days they're referring to one of the major derivatives
> which have all undergone extensive further development. FreeBSD has crowed a
> lot about their finer-grained kernel locks too for example. Other variants of
> BSD tend to focus on other areas (like portability for example) so they may
> not be as far ahead but they've still undoubtedly made significant progress
> compared to 1993.

NetBSD and OpenBSD are still pretty not-good at scaling up.

But they're darned good at running on 68K Macs (NBSD) and
semi-embedded stuff like low-end firewalling routers (OBSD).

>> Not much of a kernel guy here but my understanding is that MacOSX is
>> basically NeXT version 10, which means... Mach... which is entirely
>> different than say FreeBSD at the kernel level.
>
> I think (but I'm not sure) that the kernel in OSX comes from BSD. What they
> took from NeXT was the GUI design and object oriented application framework
> stuff. Basically all the stuff that Unix programmers still haven't quite
> figured out what it's good for.

Even AfterStep is written in plain C...

- --
Ron Johnson, Jr.
Jefferson LA  USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTMqjS9HxQb37XmcRAmS+AKCyzxZ9b1jmcye8gEwlun7VrszhfgCfVC6B
LEaSaGlorSQ5lX5eIIgx7dM=
=NvJi
-----END PGP SIGNATURE-----

Re: Linux v.s. Mac OS-X Performance

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/27/07 19:35, Greg Smith wrote:
[snip]
> to you.  The minute performance becomes a serious concern, you'd be much
> better off with Linux, one of the BSDs that's not hobbled by using the
> Mach kernel, or one of the more serious UNIXes like Solaris.

Wasn't there a time (2 years ago?) when PG ran pretty dog-like on SPARC?

- --
Ron Johnson, Jr.
Jefferson LA  USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTMzQS9HxQb37XmcRAo91AJ0d1l1LW0REaUEyVwrkhAF7u6+EYgCaA1aG
/qrqS5JebnStbMbO/QD+YA0=
=U6ta
-----END PGP SIGNATURE-----

Re: Linux v.s. Mac OS-X Performance

From
"Scott Marlowe"
Date:
On Nov 27, 2007 8:05 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11/27/07 19:35, Greg Smith wrote:
> [snip]
> > to you.  The minute performance becomes a serious concern, you'd be much
> > better off with Linux, one of the BSDs that's not hobbled by using the
> > Mach kernel, or one of the more serious UNIXes like Solaris.
>
> Wasn't there a time (2 years ago?) when PG ran pretty dog-like on SPARC?

Only under Solaris.  With Linux or BSD on it it ran pretty well.  I
had a Sparc 20 running RH 7.2 back in the day (or whatever the last
version of RH that ran on sparc was) that spanked an Ultra-2 running
slowalrus with twice the memory and hard drives handily.

Solaris has gotten much better since then, I'm sure.

Re: Linux v.s. Mac OS-X Performance

From
Aly Dharshi
Date:
> Only under Solaris.  With Linux or BSD on it it ran pretty well.  I
> had a Sparc 20 running RH 7.2 back in the day (or whatever the last
> version of RH that ran on sparc was) that spanked an Ultra-2 running
> slowalrus with twice the memory and hard drives handily.
>
> Solaris has gotten much better since then, I'm sure.

    Ubuntu is supposed to be able to spin on a T1000/T2000. Sun has also
come out with a magical beast called Solaris 10, and in its infinite
wisdom has decided to abandon the /etc/init.d/ and friends way of
startup for some complex XML way of doing things. But otherwise it's
quite good (ZFS and CoolThreads servers being among the other good
things out of Sun's shop).

    Cheers,

    Aly.

--
Aly Dharshi
aly.dharshi@telus.net
Got TELUS TV ? 310-MYTV or http://www.telus.com/tv

          "A good speech is like a good dress
           that's short enough to be interesting
           and long enough to cover the subject"


Re: Linux v.s. Mac OS-X Performance

From
Scott Ribe
Date:
> Kind of.  Mach is still running underneath (and a lot of the app APIs
> use it directly) but there is a BSD 'personality' above it which
> (AIUI) is big parts of FreeBSD ported to run on Mach.

Right. Also, to be clear, OS X is not a true microkernel architecture. They
took the "division of responsibilities" from the Mach microkernel design,
but Mach is compiled into the kernel and is not a separate process from the
kernel.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Linux v.s. Mac OS-X Performance

From
Scott Ribe
Date:
> There are claims this
> is improved in current systems (Leopard + Intel), but the margin was so
> big before...

IIRC, it was later established that during those tests they had fsync
enabled on OS X and disabled on Linux.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Linux v.s. Mac OS-X Performance

From
Tom Lane
Date:
Doug McNaught <doug@mcnaught.org> writes:
> On Nov 27, 2007, at 8:36 PM, Gregory Stark wrote:
>> I think (but I'm not sure) that the kernel in OSX comes from BSD.

> Kind of.  Mach is still running underneath (and a lot of the app APIs
> use it directly) but there is a BSD 'personality' above it which
> (AIUI) is big parts of FreeBSD ported to run on Mach.  So when you use
> the Unix APIs you're going through that.

The one bit of the OSX userland code that I've really had my nose rubbed
in is libedit, and they definitely took that from NetBSD not FreeBSD.
You sure you got your BSDen straight?

Some random poking around at
http://www.opensource.apple.com/darwinsource/10.5/
finds a whole lot of different-looking license headers.  But it seems
pretty clear that their userland is BSD-derived, whereas I've always
heard that their kernel is Mach-based.  I've not gone looking at the
kernel though.

            regards, tom lane

Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
On Tue, Nov 27, 2007 at 05:01:06PM -0700, Scott Ribe wrote:
> > In general, you can expect any Unix based OS, which includes MacOS X, to
> > perform noticeably better than Windows for PostgreSQL.
>
> Is that really true of BSD UNIXen??? I've certainly heard it's true of
> Linux. But with BSD you have the "kernel funnel" which can severely limit
> multitasking, regardless of whether threads or processes were used.

Yes, very much so. Windows lacks the fork() concept, which is what makes
PostgreSQL much slower there.

//Magnus

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Tue, 27 Nov 2007, Scott Ribe wrote:

> IIRC, it was later established that during those tests they had fsync
> enabled on OS X and disabled on Linux.

You recall correctly but I'm guessing you didn't keep up with the
investigation there; I was tempted to bring this up in that last message
but was already running too long.

Presumably you're talking about http://ridiculousfish.com/blog/?p=17 . The
fsync theory was suggested by them and possibly others after Anandtech's
first benchmarking test of this type.

The second test that I linked to rebutted that at
http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=6 .  This specific
issue is also addressed by a comment from Johan of Anandtech on the
ridiculousfish site.  The short version is that the MySQL they were using
had a MyISAM configuration that doesn't do fsyncs, period, so it's
impossible fsyncs were to blame.  You only get fsync if you're running
InnoDB.  I think the reason for this confusion is that at the time of the
initial review, working MySQL fsync under OS X was pretty new (January
2005 I think, http://dev.mysql.com/doc/refman/4.1/en/news-4-1-9.html )
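
To see how much an fsync per write can matter on a given box, a rough
Python sketch like the one below times small writes with and without
os.fsync.  The record size, the write count, and the time_writes name are
arbitrary choices for illustration, not anything taken from the Anandtech
or MySQL setups.

```python
import os
import tempfile
import time


def time_writes(count, sync):
    """Write `count` small records, optionally calling fsync after each one."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, b"x" * 128)
            if sync:
                # Block until the OS reports the data as flushed.
                os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(f"with fsync:    {time_writes(200, True):.4f} s")
    print(f"without fsync: {time_writes(200, False):.4f} s")
```

The gap between the two figures depends heavily on the filesystem, the
drive's write cache, and the controller, so treat it as a feel for the
effect rather than a benchmark.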

Ultimately, the exact cause doesn't change how to clear the air.  As I
suggested, the only way to refute benchmarks showing awful performance
is not to theorize as to the cause, but to show new ones demonstrating
that the first ones no longer hold.  If you can point me to one of
those, I'd love to see it--this is actually one of the items on the
relatively short list of why I'm typing this on a Thinkpad running Linux
instead of a Macbook (the other big one is Apple's string of Eclipse
issues, which I already ranted about recently at
http://slashdot.org/comments.pl?sid=342667&cid=21154137 )

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Tue, 27 Nov 2007, Ron Johnson wrote:

> There was a benchmark in Feb 2007 which demonstrated that FBSD 7.0
> scaled *better* than Linux 2.6 after 4 CPUs.
> http://jeffr-tech.livejournal.com/5705.html
> Turns out that there was/is a bug in glibc's malloc().  Don't know
> if it's been fixed yet.

Last I heard it was actually glibc combined with a kernel problem, and
changes to both would be required to resolve:

http://kerneltrap.org/mailarchive/linux-kernel/2007/4/3/73000

I'm not aware of any resolution there but I haven't been following this
one closely.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
"Trevor Talbot"
Date:
On 11/27/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Doug McNaught <doug@mcnaught.org> writes:

> > Kind of.  Mach is still running underneath (and a lot of the app APIs
> > use it directly) but there is a BSD 'personality' above it which
> > (AIUI) is big parts of FreeBSD ported to run on Mach.  So when you use
> > the Unix APIs you're going through that.

> The one bit of the OSX userland code that I've really had my nose rubbed
> in is libedit, and they definitely took that from NetBSD not FreeBSD.
> You sure you got your BSDen straight?
>
> Some random poking around at
> http://www.opensource.apple.com/darwinsource/10.5/
> finds a whole lot of different-looking license headers.  But it seems
> pretty clear that their userland is BSD-derived, whereas I've always
> heard that their kernel is Mach-based.  I've not gone looking at the
> kernel though.

The majority of the BSDness in the kernel is from FreeBSD, but it is
very much a hybrid, Mach being the other parent.  Userland is a mixed
bag; FreeBSD, NetBSD, OpenBSD are all visible in different places.  In
older versions I've also seen 4.4BSD credited directly (as in not even
caught up with FreeBSD), but I believe most of that has been updated
in newer versions of the OS.  Apple also has employees who are major
developers for both FreeBSD and NetBSD at least, though I haven't kept
up with who is doing what.

http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/Architecture/chapter_3_section_3.html

Re: Linux v.s. Mac OS-X Performance

From
Scott Ribe
Date:
> Yes, very much so. Windows lacks the fork() concept, which is what makes
> PostgreSQL much slower there.

So grossly slower process creation would kill postgres connection times. But
what about the cases where persistent connections are used? Is it the case
also that Windows has a performance bottleneck for interprocess
communication?
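
For a rough feel of what process creation costs on a Unix box, the sketch
below times fork, immediate exit, and wait in a loop. It assumes a
Unix-like system (Python has no os.fork on Windows, which is rather the
point), and average_fork_seconds is a name made up for this illustration.
It measures a bare fork, not PostgreSQL backend startup, which does far
more work.

```python
import os
import time


def average_fork_seconds(iterations=200):
    """Return the mean wall-clock time to fork a child and reap it."""
    start = time.perf_counter()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately without running cleanup handlers.
            os._exit(0)
        # Parent: wait for the child so no zombies pile up.
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    per_fork = average_fork_seconds()
    print(f"fork + exit + wait: {per_fork * 1e6:.0f} microseconds on average")
```

With persistent connections this cost is paid once per connection rather
than once per query, which is why the question above turns to other
bottlenecks.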

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
> > Yes, very much so. Windows lacks the fork() concept, which is what makes
> > PostgreSQL much slower there.
>
> So grossly slower process creation would kill postgres connection times. But
> what about the cases where persistent connections are used? Is it the case
> also that Windows has a performance bottleneck for interprocess
> communication?

There is at least one other bottleneck, probably more than one. Context
switching between processes is a lot more expensive than on Unix (given
that win32 is optimized towards context switching between threads). NTFS
isn't optimized for having 100+ processes reading and writing to the
same file. Probably others..

//Magnus

Re: Linux v.s. Mac OS-X Performance

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/28/07 11:13, Magnus Hagander wrote:
> On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
>>> Yes, very much so. Windows lacks the fork() concept, which is what makes
>>> PostgreSQL much slower there.
>> So grossly slower process creation would kill postgres connection times. But
>> what about the cases where persistent connections are used? Is it the case
>> also that Windows has a performance bottleneck for interprocess
>> communication?
>
> There is at least one other bottleneck, probably more than one. Context
> switching between processes is a lot more expensive than on Unix (given
> that win32 is optimized towards context switching between threads). NTFS

Isn't that why Apache2 has separate "thread mode" and 1.x-style
pre-forked mode?

> isn't optimized for having 100+ processes reading and writing to the
> same file. Probably others..

- --
Ron Johnson, Jr.
Jefferson LA  USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTaP3S9HxQb37XmcRAoFfAJ4gQJIzI95FWyukNy0+7mt2NT+MFgCbBpt/
pdIzLmq1Rndnt3busADFHP8=
=NgLQ
-----END PGP SIGNATURE-----

Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
Ron Johnson wrote:
> On 11/28/07 11:13, Magnus Hagander wrote:
>> On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
>>>> Yes, very much so. Windows lacks the fork() concept, which is what makes
>>>> PostgreSQL much slower there.
>>> So grossly slower process creation would kill postgres connection times. But
>>> what about the cases where persistent connections are used? Is it the case
>>> also that Windows has a performance bottleneck for interprocess
>>> communication?
>> There is at least one other bottleneck, probably more than one. Context
>> switching between processes is a lot more expensive than on Unix (given
>> that win32 is optimized towards context switching between threads). NTFS
>
> Isn't that why Apache2 has separate "thread mode" and 1.x-style
> pre-forked mode?

I think it was a contributing reason for getting it in the first place,
but it's certainly not the only reason...

//Magnus

Re: Linux v.s. Mac OS-X Performance

From
"Trevor Talbot"
Date:
On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:

> On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
> > > Yes, very much so. Windows lacks the fork() concept, which is what makes
> > > PostgreSQL much slower there.
> >
> > So grossly slower process creation would kill postgres connection times. But
> > what about the cases where persistent connections are used? Is it the case
> > also that Windows has a performance bottleneck for interprocess
> > communication?
>
> There is at least one other bottleneck, probably more than one. Context
> switching between processes is a lot more expensive than on Unix (given
> that win32 is optimized towards context switching between threads). NTFS
> isn't optimized for having 100+ processes reading and writing to the
> same file. Probably others..

I'd be interested to know what this info is based on.  The only
fundamental difference between a process and a thread context switch
is VM mapping (extra TLB flush, possible pagetable mapping tweaks).
And why would NTFS care about anything other than handles?

I mean, I can understand NT having bottlenecks in various areas
compared to Unix, but this "threads are specially optimized" thing is
seeming a bit overblown.  Just how often do you see threads from a
single process get contiguous access to the CPU?

Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
Trevor Talbot wrote:
> On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
>
>> On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
>>>> Yes, very much so. Windows lacks the fork() concept, which is what makes
>>>> PostgreSQL much slower there.
>>> So grossly slower process creation would kill postgres connection times. But
>>> what about the cases where persistent connections are used? Is it the case
>>> also that Windows has a performance bottleneck for interprocess
>>> communication?
>> There is at least one other bottleneck, probably more than one. Context
>> switching between processes is a lot more expensive than on Unix (given
>> that win32 is optimized towards context switching between threads). NTFS
>> isn't optimized for having 100+ processes reading and writing to the
>> same file. Probably others..
>
> I'd be interested to know what this info is based on.  The only
> fundamental difference between a process and a thread context switch
> is VM mapping (extra TLB flush, possible pagetable mapping tweaks).

Generally, lots of references I've seen around the net and elsewhere. If
I'm not mistaken, the use of threads over processes was listed as one of
the main reasons why SQL Server got such good performance on Windows
compared to its competitors. But I don't have my Inside SQL Server
around to check for an actual reference.


> And why would NTFS care about anything other than handles?

Not sure, again it's just something I've picked up from what others have
been saying. I should perhaps have been clearer that I don't have any
direct proof of that one.


> I mean, I can understand NT having bottlenecks in various areas
> compared to Unix, but this "threads are specially optimized" thing is
> seeming a bit overblown.  Just how often do you see threads from a
> single process get contiguous access to the CPU?

On a CPU loaded SQL server, fairly often I'd say. But certainly not always.

//Magnus

Re: Linux v.s. Mac OS-X Performance

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 28 Nov 2007 09:53:34 -0800
"Trevor Talbot" <quension@gmail.com> wrote:

> On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
> 
> > On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
> > > > Yes, very much so. Windows lacks the fork() concept, which is
> > > > what makes PostgreSQL much slower there.

> I mean, I can understand NT having bottlenecks in various areas
> compared to Unix, but this "threads are specially optimized" thing is
> seeming a bit overblown.  Just how often do you see threads from a
> single process get contiguous access to the CPU?

I thought it was more about the cost to fork() a process in win32? 

Sincerely,

Joshua D. Drake



- -- 

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTazMATb/zqfZUUQRAtpgAJwNXh9tyO0J/KSYnlzB5HoTiru/3wCfQeDy
5cZ+OIZmAUMPmuflVfRP11Q=
=4j6q
-----END PGP SIGNATURE-----

Re: Linux v.s. Mac OS-X Performance

From
"Trevor Talbot"
Date:
On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
> Trevor Talbot wrote:
> > On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:

> >> There is at least one other bottleneck, probably more than one. Context
> >> switching between processes is a lot more expensive than on Unix (given
> >> that win32 is optimized towards context switching between threads). NTFS
> >> isn't optimized for having 100+ processes reading and writing to the
> >> same file. Probably others..

> > I'd be interested to know what this info is based on.  The only
> > fundamental difference between a process and a thread context switch
> > is VM mapping (extra TLB flush, possible pagetable mapping tweaks).

> Generally, lots of references I've seen around the net and elsewhere. If
> I'm not mistaken, the use of threads over processes was listed as one of
> the main reasons why SQL Server got such good performance on Windows
> compared to its competitors. But I don't have my Inside SQL Server
> around to check for an actual reference.

Well, yes, in general using multiple threads instead of multiple
processes is going to be a gain on any common OS for several reasons,
but context switching is a very minor part of that. Threads let you
share state much more efficiently than processes do, and in complex
servers of this type there tends to be a lot to be shared.
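
A small Python sketch of the state-sharing difference, assuming a
Unix-like system for os.fork: a thread writes to the same memory as its
caller, while a forked child only changes its own private copy. The
function names are made up for this illustration; a multi-process server
has to set up shared memory explicitly to get the effect a thread gets
for free.

```python
import os
import threading


def bump(state):
    """Increment the counter held in `state`."""
    state["value"] += 1


def value_after_child_process():
    """Run bump() in a forked child and return what the parent sees."""
    state = {"value": 0}
    pid = os.fork()
    if pid == 0:
        bump(state)      # changes the child's private copy only
        os._exit(0)
    os.waitpid(pid, 0)
    return state["value"]


def value_after_thread():
    """Run bump() in a thread and return what the caller sees."""
    state = {"value": 0}
    worker = threading.Thread(target=bump, args=(state,))
    worker.start()
    worker.join()
    return state["value"]


if __name__ == "__main__":
    print("after child process:", value_after_child_process())
    print("after thread:       ", value_after_thread())
```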

SQL Server is somewhat unique in that it doesn't simply throw threads
at the problem; it has a small pool and uses its own internal task
scheduler for actual SQL work. There's no OS thread per user or
anything. Think continuations or pure userspace threading. That design
also lets it reduce context switches in general.
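
The "own internal task scheduler" idea can be sketched in Python with
generators: many units of work are interleaved in user space on one OS
thread, and each gives up the CPU only at points it chooses. This is an
illustration of the general technique under made-up names (query,
run_round_robin), not a description of how SQL Server actually
implements it.

```python
from collections import deque


def query(name, steps):
    """A hypothetical unit of SQL work that yields at each natural wait point."""
    for step in range(1, steps + 1):
        yield f"{name}:{step}"


def run_round_robin(tasks):
    """Run generator tasks one step at a time, all inside a single OS thread."""
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))
        except StopIteration:
            continue            # task finished; drop it
        ready.append(task)      # not finished; back of the queue
    return trace


if __name__ == "__main__":
    print(run_round_robin([query("a", 2), query("b", 1)]))
```

Switching between tasks here is an ordinary function return, with no
kernel involvement, which is where the reduction in context switches
comes from.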

> > I mean, I can understand NT having bottlenecks in various areas
> > compared to Unix, but this "threads are specially optimized" thing is
> > seeming a bit overblown.  Just how often do you see threads from a
> > single process get contiguous access to the CPU?
>
> On a CPU loaded SQL server, fairly often I'd say. But certainly not always.

I meant as a design point for a general-purpose OS. If you consider
how Windows does GUIs, ignoring the expense of process context
switching would be fatal, since it forces so much app involvement in
window painting. Having a system dedicated to a single process with
multiple threads running full-bore is not particularly common in this
sense.

Re: Linux v.s. Mac OS-X Performance

From
"Trevor Talbot"
Date:
On 11/28/07, Joshua D. Drake <jd@commandprompt.com> wrote:

> On Wed, 28 Nov 2007 09:53:34 -0800
> "Trevor Talbot" <quension@gmail.com> wrote:

> > > On Wed, 2007-11-28 at 07:29 -0700, Scott Ribe wrote:
> > > > > Yes, very much so. Windows lacks the fork() concept, which is
> > > > > what makes PostgreSQL much slower there.

> > I mean, I can understand NT having bottlenecks in various areas
> > compared to Unix, but this "threads are specially optimized" thing is
> > seeming a bit overblown.  Just how often do you see threads from a
> > single process get contiguous access to the CPU?

> I thought it was more about the cost to fork() a process in win32?

Creating a process is indeed expensive on Windows, but a followup
question was about the performance when using persistent connections,
and therefore not creating processes. That's where the conversation
got more interesting :)

Re: Linux v.s. Mac OS-X Performance

From
Magnus Hagander
Date:
On Wed, Nov 28, 2007 at 10:33:08AM -0800, Trevor Talbot wrote:
> On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
> > Trevor Talbot wrote:
> > > On 11/28/07, Magnus Hagander <magnus@hagander.net> wrote:
>
> > >> There is at least one other bottleneck, probably more than one. Context
> > >> switching between processes is a lot more expensive than on Unix (given
> > >> that win32 is optimized towards context switching between threads). NTFS
> > >> isn't optimized for having 100+ processes reading and writing to the
> > >> same file. Probably others..
>
> > > I'd be interested to know what this info is based on.  The only
> > > fundamental difference between a process and a thread context switch
> > > is VM mapping (extra TLB flush, possible pagetable mapping tweaks).
>
> > Generally, lots of references I've seen around the net and elsewhere. If
> > I'm not mistaken, the use of threads over processes was listed as one of
> > the main reasons why SQL Server got such good performance on Windows
> > compared to its competitors. But I don't have my Inside SQL Server
> > around to check for an actual reference.
>
> Well, yes, in general using multiple threads instead of multiple
> processes is going to be a gain on any common OS for several reasons,
> but context switching is a very minor part of that. Threads let you
> share state much more efficiently than processes do, and in complex
> servers of this type there tends to be a lot to be shared.
>
> SQL Server is somewhat unique in that it doesn't simply throw threads
> at the problem; it has a small pool and uses its own internal task
> scheduler for actual SQL work. There's no OS thread per user or
> anything. Think continuations or pure userspace threading. That design
> also lets it reduce context switches in general.

There are actually two different ways to run SQL Server. Either it runs
with operating system threadpools (the same way that we deal with backend
exits in 8.3), which is IIRC the default. Or it runs with Fibers which are
also an OS feature, but they're scheduled by the application.


> > > I mean, I can understand NT having bottlenecks in various areas
> > > compared to Unix, but this "threads are specially optimized" thing is
> > > seeming a bit overblown.  Just how often do you see threads from a
> > > single process get contiguous access to the CPU?
> >
> > On a CPU loaded SQL server, fairly often I'd say. But certainly not always.
>
> I meant as a design point for a general-purpose OS. If you consider
> how Windows does GUIs, ignoring the expense of process context
> switching would be fatal, since it forces so much app involvement in
> window painting. Having a system dedicated to a single process with
> multiple threads running full-bore is not particularly common in this
> sense.

Ok, then I understand what you're saying :-)

//Magnus

Re: Linux v.s. Mac OS-X Performance

From
Wes
Date:
Regarding the various kernel bottlenecks, have there been any tests with
Google's malloc (libtcmalloc)?  Perhaps PostgreSQL isn't heavily threaded
enough to make a difference, but on one of our heavily threaded applications
(unrelated to Postgres), it made a night and day difference.  Instead of
memory and CPU usage growing and growing, both stabilized quickly at less
than half of what the linux malloc produced.  Linux (2.6) RH malloc stinks
in heavily threaded applications.  The benchmark that got us looking at this
was a MySQL benchmark showing performance scaling by number of threads on
various linux operating systems.  The difference in our application by
simply relinking at run time (LD_PRELOAD) with libtcmalloc was astounding.
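
For anyone wanting to try the same experiment from a script, here is a
minimal Python sketch that builds an environment with a library placed
first in LD_PRELOAD. The helper name env_with_preload is made up, and the
library path and program name in the usage note are placeholders; where
libtcmalloc lives depends on how it was installed.

```python
import os


def env_with_preload(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic loader accepts a colon-separated list of libraries.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env
```

The result can be passed as the env argument of subprocess.run, for
example subprocess.run(["./myapp"], env=env_with_preload("/path/to/libtcmalloc.so")),
where both the program and the path are hypothetical.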

Wes



Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Thu, 29 Nov 2007, Wes wrote:

> Perhaps PostgreSQL isn't heavily threaded enough to make a difference

PostgreSQL doesn't use threads at all; it forks processes.  See 1.14 in
http://www.postgresql.org/docs/faqs.FAQ_DEV.html

> The benchmark that got us looking at this was a MySQL benchmark showing
> performance scaling by number of threads on various linux operating
> systems.

Presumably you mean this one:  http://ozlabs.org/~anton/linux/sysbench/

The threading/malloc issues in MySQL are so awful that similar approaches
have already been suggested for other operating systems.  Check out
http://developers.sun.com/solaris/articles/mysql_perf_tune.html for
comments about this under Solaris for example.

The fact that PostgreSQL scalability doesn't fall off like this suggests
it doesn't have this particular issue.  Note that the curve in that
sysbench run is awfully similar to the MySQL results at
http://tweakers.net/reviews/649/7 (just shifted to the right because there
are many more cores in that system).  Then look at their PostgreSQL
results running the same test.  Forgive the error where they state
"PostgreSQL might be called a textbook example of a good implementation of
multithreading"; it's actually a good multi-process implementation.
Interestingly, those results are from a Solaris system.

It's good to know about the Google perftools allocator, as there are
plenty of client applications that could benefit as yours has from this
technique (like the multi-threaded C++ apps it appears aimed at).  I just
wouldn't expect it to be a big win for the PostgreSQL server itself.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Martijn van Oosterhout
Date:
On Thu, Nov 29, 2007 at 11:04:38PM -0600, Wes wrote:
> Regarding the various kernel bottlenecks, have there been any tests with
> Google's malloc (libtcmalloc)?

PostgreSQL has its own allocator on top of malloc already. tcmalloc is
optimised for many small allocations, whereas postgres only requests
blocks from the OS in large blocks. I doubt tcmalloc would make a
useful difference here.
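
The general pattern described here (ask the system for a few large
blocks, then carve many small allocations out of them) can be sketched
as a toy region allocator in Python. This is an illustration of the
technique only, not PostgreSQL's actual allocator; the Arena class, the
8192-byte block size, and the rule that a chunk must fit in one block
are all assumptions made for the example.

```python
class Arena:
    """Toy region allocator: few large block requests, many small allocations."""

    def __init__(self, block_size=8192):
        self.block_size = block_size
        self.blocks = []              # each entry stands in for one malloc() call
        self.offset = block_size      # forces a block request on the first alloc

    def alloc(self, size):
        """Return (block_index, offset) for a chunk of `size` bytes."""
        if size > self.block_size:
            raise ValueError("chunk larger than block size")
        if self.offset + size > self.block_size:
            # Current block is full: request one more large block.
            self.blocks.append(bytearray(self.block_size))
            self.offset = 0
        position = (len(self.blocks) - 1, self.offset)
        self.offset += size
        return position

    def reset(self):
        """Release every block at once instead of freeing chunk by chunk."""
        self.blocks.clear()
        self.offset = self.block_size


if __name__ == "__main__":
    arena = Arena()
    for _ in range(1000):
        arena.alloc(64)
    print("small allocations: 1000, large block requests:", len(arena.blocks))
```

With a layer like this in front of malloc, the underlying allocator sees
only the occasional large request, so an allocator tuned for many small
requests has little to improve.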

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Re: Linux v.s. Mac OS-X Performance

From
Wolfgang Keller
Date:
>> Anyway, how does MacOS X (both 10.4 and 10.5) compare to Windows
>> (2000,  XP, Vista etc.) on the same hardware? And Linux to
>> (Free-/Net-/whatever)  BSD?
>
> Apple hardware gets so expensive for some types of database
> configurations that such a comparison doesn't even make a lot of
> sense.

So far my experience with the effective price/performance ratio of
Apple vs. other hardware for my applications has been pretty good. E.g.
it was impossible for me to find a similarly priced
(Linux-/*BSD/Intel/AMD-)equivalent to my PowerMac G5 over here at the
time when I bought it.

Not to mention the required learning effort for Linux/*BSD compared to
MacOS X, if I count it in (days x day rate)...

> For example, if you have an application that needs high
> database write throughput, to make that work well with PostgreSQL you
> must have a controller with a battery backed cache.

Hmm, what would be the difference compared to plenty of RAM and a UPS
(plus stand-by backup server)? Looks just like moving the "single point
of failure" to a different hardware item, no...?

>  If I have a PC,
> the entry-level solution in that category can be a random sub-$1000
> system that runs Linux

Can't find one over here for that price that does all the other things
that need to be done in a typical small office (fileserver,
printserver, mailserver, calendar server,...) as well as my old
G5 PowerMac. To turn this one into a part-time DB server, I'd just plug
in an eSATA or SAS array (with PCIe adapter) and maybe another few GB
of RAM (currently 4). Plus a backup tape drive.

My world is environments with at most 10 concurrent
database clients at any given moment. But those won't want to wait,
because they need to get actual work done.

> plus around $400 for a RAID card with BBC, and
> you've got multiple vendors to consider there (3Ware, Areca, LSI
> Logic, etc.)

LSI drivers are not available for MacOS X on PowerMacs? Ouch.

> Also, in previous generations, the Mach kernel core of Mac OS had
> some serious performance issues for database use even in read-heavy
> workloads: http://www.anandtech.com/mac/showdoc.aspx?i=2520&p=5

"With the MySQL performance woes now clearly caused by OS X"

Erm, systematic error here: It could also be that the MySQL
implementation/configuration for the two different OSes was the source
for the performance difference.

I wouldn't use MySQL anyway, and I'm mostly interested in transaction
performance (client waiting time until commit).

>> I'm just wondering whether the performance gain is worth the
>> learning  effort required for Linux or BSD compared to the Mac.
>
> On both Windows (where you get limitations like not being able to set
> a large value for shared_buffers)

My consistent experience with Windows over the last >15 years has been
that it just won't multitask anymore as soon as one process does
significant I/O. No matter what hardware you put underneath.

> and Mac OS X, PostgreSQL has enough
> performance issues that I feel using those platforms can only be
> justified if platform compatibility is more important than
> performance to you.

The point is that cost for "installation", "configuration" and
"administration" must be taken into account. A dedicated individual
just for that is simply out of the question in this world where I live. So
someone who's already available has to do all that in a (as tiny as
possible) fraction of his/her worktime. With MacOS X it's feasible, but
Linux/*BSD? I'm not so sure.

Sincerely,

Wolfgang Keller

Re: Linux v.s. Mac OS-X Performance

From
"Trevor Talbot"
Date:
On 11/30/07, Wolfgang Keller <wolfgang.keller.privat@gmx.de> wrote:

> > For example, if you have an application that needs high
> > database write throughput, to make that work well with PostgreSQL you
> > must have a controller with a battery backed cache.

> Hmm, what would be the difference compared to plenty of RAM and a UPS
> (plus stand-by backup server)? Looks just like moving the "single point
> of failure" to a different hardware item, no...?

Well, you want a backup server anyway, for completely different
reasons. It's not relevant to write throughput.

The difference between using a disk controller with a BBC compared to
just turning fsync off and using RAM is that you've introduced an
additional point of failure: the OS itself. You have to trust that the
OS is always going to be able to write the cached data to disk. That
tends to be riskier than relying on a piece of hardware dedicated to
the job, simply because an OS does more, and therefore has more to go
wrong (kernel panic / grey screen / BSOD).

You could make similar arguments about the additional hardware
components in the chain, like the internal power supply. The point is
that the database expects that when it asked for data to hit disk, it
actually got there. A BBC allows a disk controller to lie (reliably),
but turning fsync off allows pretty much everything from the OS down
to lie (somewhat less reliably).

The controller always exists, so it's not moving a point of failure;
if a controller goes you've lost the disk anyway.

The tradeoff is how much trust you're willing to put into various
parts of the system being uninterrupted.

Re: Linux v.s. Mac OS-X Performance

From
Lincoln Yeoh
Date:
At 09:09 PM 11/30/2007, Trevor Talbot wrote:

>The controller always exists, so it's not moving a point of failure;
>if a controller goes you've lost the disk anyway.

Anecdotal - I have found "smart" raid controllers to fail more often
than dumb scsi controllers (or even SATA/PATA controllers), and some
seem more failure prone than semi-decent operating systems.

Not recommending people turn fsync off, but the O/S "always" exists,
if it is that flaky, you might lose data anyway, so pick a better O/S.

What's more likely in most places is somebody powering down the
server abruptly, and then fsync=off could hurt :).

Regards,
Link.


Re: Linux v.s. Mac OS-X Performance

From
Guido Neitzer
Date:
On 30.11.2007, at 04:48, Wolfgang Keller wrote:

> LSI drivers are not available for MacOS X on PowerMacs? Ouch.

The problem is that they suck, as they can't do channel bundling for
higher throughput to a single disk array.

[not your comment, but referred there]
>> and Mac OS X, PostgreSQL has enough
>> performance issues that I feel using those platforms can only be
>> justified if platform compatibility is more important than
>> performance to you.

Actually - in our tests, when used with a load similar to pgbench
(e.g. typical web applications), Mac OS X 10.4.7 performed better than
Yellow Dog Linux (I was testing with G5 hardware) on the same hardware
as soon as more than about 90 concurrent clients were simulated.

But okay, don't trust statistics you didn't make up yourself ...

cug

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Fri, 30 Nov 2007, Guido Neitzer wrote:

> Actually - In our test if just used with a similar load as pgbench (e.g.
> typical web applications) Mac OS X  10.4.7 performed better than Yellow Dog
> Linux (I was testing with G5 hardware) on the same hardware as soon as more
> than about 90 concurrent clients were simulated.

At this point, that's just an interesting historical note.  Yellow Dog is
not a particularly good Linux compared with the ones that have gotten
years worth of performance tuning for Intel/AMD processors.  And you
really can't extrapolate anything useful today from how it ran on a
G5--that's two layers of obsolete.  The comparisons that matter now are
Intel+Mac OS vs. Intel+a popular Linux aimed at servers.

As an unrelated note, I'm curious what you did with pgbench that you
consider it a reasonable simulation of a web application.  The default
pgbench transaction is very write-heavy, and the read-only option
available is way too simple to be realistic.  You'd need to pass in custom
scripts to execute to get something that acted like a web app.  pgbench is
an unruly tool, and there are many ways to run it that give results that
aren't so useful.
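
To make the custom-script suggestion a little more concrete, here is a minimal sketch in Python that generates a read-mostly pgbench script and the command line to run it. Everything specific in it is an assumption for illustration: the 9-to-1 read/write mix, the file name webmix.sql, the database name bench, and the row count. The script syntax (\set ... random(...)) and the pgbench_accounts table are those of newer pgbench releases; the version current at the time of this thread used \setrandom and a table named accounts, so adjust accordingly.

```python
def build_script(reads, writes, accounts=100_000):
    """Build the text of a pgbench custom script with the given mix.

    `accounts` is the assumed number of rows in pgbench_accounts
    (100,000 per unit of scale factor in a standard pgbench setup).
    """
    lines = []
    for _ in range(reads):
        # Pick a fresh random account before every statement.
        lines.append(f"\\set aid random(1, {accounts})")
        lines.append("SELECT abalance FROM pgbench_accounts WHERE aid = :aid;")
    for _ in range(writes):
        lines.append(f"\\set aid random(1, {accounts})")
        lines.append("\\set delta random(-5000, 5000)")
        lines.append(
            "UPDATE pgbench_accounts SET abalance = abalance + :delta "
            "WHERE aid = :aid;"
        )
    return "\n".join(lines) + "\n"


def pgbench_command(script_path, clients, transactions, dbname):
    """Return the pgbench argument list: -n skips the initial vacuum,
    -f names the custom script, -c sets the number of clients and
    -t the number of transactions per client."""
    return ["pgbench", "-n", "-f", script_path,
            "-c", str(clients), "-t", str(transactions), dbname]


if __name__ == "__main__":
    # Hypothetical mix: nine lookups for every balance update.
    print(build_script(reads=9, writes=1), end="")
    print(" ".join(pgbench_command("webmix.sql", 90, 1000, "bench")))
```

Whether nine reads per write resembles any real web application is exactly the kind of thing that has to be measured rather than assumed; the sketch only shows where such a ratio would be plugged in.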

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Fri, 30 Nov 2007, Lincoln Yeoh wrote:

> Anecdotal - I have found "smart" raid controllers to fail more often than
> dumb scsi controllers (or even SATA/PATA controllers), and some seem more
> failure prone than semi-decent operating systems.

You'd need to name some names here for this to mean too much.  There are
plenty of positively miserable RAID controllers out there.  I wouldn't
trust the cards from Adaptec, Promise, and Highpoint to correctly store a
database about what's in my pockets.

> What's more likely in most places is somebody powering down the server
> abruptly, and then fsync=off could hurt :).

Here you're hitting on the real point.  If a proposed solution adds
potential for database corruption if someone trips over the server cord,
it's not really a solution at all.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Linux v.s. Mac OS-X Performance

From
Owen Hartnett
Date:
At 6:15 PM -0500 11/30/07, Greg Smith wrote:
>On Fri, 30 Nov 2007, Guido Neitzer wrote:
>
>>Actually - In our test if just used with a similar load as pgbench
>>(e.g. typical web applications) Mac OS X  10.4.7 performed better
>>than Yellow Dog Linux (I was testing with G5 hardware) on the same
>>hardware as soon as more than about 90 concurrent clients were
>>simulated.
>
>At this point, that's just an interesting historical note.  Yellow
>Dog is not a particularly good Linux compared with the ones that
>have gotten years worth of performance tuning for Intel/AMD
>processors.  And you really can't extrapolate anything useful today
>from how it ran on a G5--that's two layers of obsolete.  The
>comparisons that matter now are Intel+Mac OS vs. Intel+a popular
>Linux aimed at servers.
>
>As an unrelated note, I'm curious what you did with pgbench that you
>consider it a reasonable simulation of a web application.  The
>default pgbench transaction is very write-heavy, and the read-only
>option available is way too simple to be realistic.  You'd need to
>pass in custom scripts to execute to get something that acted like a
>web app.  pgbench is an unruly tool, and there are many ways to run it
>that give results that aren't so useful.

If this is any help to anyone, I'm running Postgresql on an Intel
Xserve Mac OS X.  Performance is more than fine for my usage.  If
anyone would like me to run some benchmark code to test comparisons,
I'd be happy to do so.

-Owen

Re: Linux v.s. Mac OS-X Performance

From
Greg Smith
Date:
On Fri, 30 Nov 2007, Wolfgang Keller wrote:

> it was impossible for me to find a similarly priced
> (Linux-/*BSD/Intel/AMD-)equivalent to my PowerMac G5 over here at the
> time when I bought it.

The problem from my perspective is the common complaint that Apple doesn't
ship an inexpensive desktop product that would be suitable for light-duty
server work.  Their cheapest system you can add a PCI-X card to is $2200
USD (I just priced a system out and realized I can downgrade the
processors from the default), and that only has 4 SATA drive bays
which doesn't make it much of a serious database server platform.  A
similarly configured system from Dell runs around $1900, which gives the
usual (and completely reasonable) Apple tax of around $300.  However, I
can just as easily pop over to Dell, buy a $500 system, drop an SATA
RAID+BBC controller in for another $400, and I've got a perfectly
reasonable little server--one that on write-heavy loads will outperform at
least double its price in Apple hardware, simply because that's how much
it costs to get the cheapest system you can put a caching controller in
from them.

(Don't anyone take that as a recommendation for Dell hardware, which I
hate, but simply as a reference point; the only thing I like about them is
that the system building interface on their web site makes it easy to do
comparisons like this)

>> For example, if you have an application that needs high
>> database write throughput, to make that work well with PostgreSQL you
>> must have a controller with a battery backed cache.
>
> Hmm, what would be the difference compared to plenty of RAM and a UPS (plus
> stand-by backup server)? Looks just like moving the "single point of failure"
> to a different hardware item, no...?

When you write a WAL record to commit a transaction, if you can cache that
write it doesn't slow any client down.  If you can't, the database waits
for a physical write to the disk, which can only happen at a rate that
depends on your disk's rotation speed.  For a standard 7200RPM drive, that
tops out at a bit less than 120 writes/second for any single client, and
somewhere around 500 total for larger numbers of simultaneous clients.
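
The 120 figure follows from simple arithmetic, sketched below in Python. The assumption (mine, as a simplification) is that a single client issuing one synchronous commit at a time has to wait for the platter to come around once per commit, so the rotation rate is the ceiling. The function name is made up for the example.

```python
def max_commits_per_second(rpm):
    """Ceiling on uncached synchronous commits/second for one client,
    assuming each commit waits for one full platter rotation."""
    return rpm / 60.0  # revolutions per minute -> revolutions per second


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000, 15000):
        rate = max_commits_per_second(rpm)
        print(f"{rpm:>6} RPM: at most {rate:.0f} commits/s per client")
```

For 7200 RPM this gives 7200 / 60 = 120, and real drives land a bit under that. The higher total for many simultaneous clients comes from several commits sharing one physical write, which this single-client bound does not model.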

The only robust way to cache a write involves a battery-backed controller.
Relying on RAM or the write cache in the drives, even if you have the
world's greatest UPS, means that the first person who accidentally unplugs
your system (or the first power supply failure) could corrupt your
database.  That's really not acceptable for anyone.  But since the
integrity policy of the good caching controllers is far better than that,
you can leave that cache on safely, and only expect corruption if there's
a multi-day power outage.
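
One way to see whether something in the chain is caching writes is to time a loop of small synchronous writes and compare the result with the rotation-speed limit. The following Python sketch does that under some assumptions: the 8192-byte record size and the count of 200 writes are arbitrary choices, and the helper names are made up. On Mac OS X, fsync() alone does not ask the drive to flush its own cache, so the sketch tries fcntl's F_FULLFSYNC where the platform offers it and falls back to fsync() otherwise.

```python
import os
import tempfile
import time

try:
    import fcntl  # not available on Windows
except ImportError:
    fcntl = None


def flush_to_disk(fd):
    """Ask the OS (and, where supported, the drive) to persist fd's data."""
    if fcntl is not None and hasattr(fcntl, "F_FULLFSYNC"):
        try:
            fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
            return
        except OSError:
            pass  # filesystem does not support it; fall back to fsync
    os.fsync(fd)


def measure_sync_write_rate(directory, writes=200, record_size=8192):
    """Return synchronous writes per second achieved in `directory`."""
    record = b"x" * record_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, record)
            flush_to_disk(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    rate = measure_sync_write_rate(tempfile.gettempdir())
    print(f"{rate:.0f} synchronous writes/second")
```

If a plain 7200RPM drive without a battery-backed controller reports thousands of synchronous writes per second, the likely explanation is that a write cache somewhere is acknowledging writes before they reach the platter, which is the unsafe situation described above. Point the function at a directory on the volume you actually care about; the temporary directory may sit on a different device.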

It's still more rambling than I'd like, but I have the pieces to a full
discussion of this topic at
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm

> LSI drivers are not available for MacOS X on PowerMacs? Ouch.

There might be something out there, but I'm not aware of anything from
them or other vendors targeted at the current Intel Power Macs that looks
robust; there's just Apple's offering.

> Erm, systematic error here: It could also be that the MySQL
> implementation/configuration for the two different OSes was the source
> for the performance difference.

That's possible, but other than the specific fsync write fixes they
applied for OS X I'm not aware of anything specific to Mac OS that would
cause this.  When the low-level benchmarks show awful performance doing
things like creating processes, and performance dives under a heavy load,
it seems sensible to assume the two are linked until proven otherwise.
(Appropriate disclaimer:
http://en.wikipedia.org/wiki/Correlation_does_not_imply_causation )

It's also true that some of the MySQL threading limitations that were
brought up in a tangent to this discussion could be contributing as well,
in which case a PostgreSQL test might not show as large of a gap.  Again,
criticizing the benchmark methods doesn't accomplish anything; you need an
advocate for the platform to perform ones showing otherwise before the
current results are disproven.

> The point is that cost for "installation", "configuration" and
> "administration" must be taken into account.

The question you asked about was how Apple Hardware+Mac OS X+PostgreSQL
stacks up on a performance basis with more common platforms like PC
hardware+Linux.  All the answers I've seen suggest not very well, and none
of these other things are relevant when evaluating the platform from a
performance perspective.  TCO issues are all relative to the administrator
and tasks anyway--an experienced Linux system administrator may be a
little slower on some things than one running Apple's GUI tools, but once
you get to more scriptable changes they could be far more efficient.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD