Maybe OT, not sure Re: Best suiting OS - Mailing list pgsql-performance

From Mark Mielke
Subject Maybe OT, not sure Re: Best suiting OS
Msg-id 4AC9585A.2070707@mark.mielke.cc
In response to Re: Best suiting OS  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re: Maybe OT, not sure Re: Best suiting OS
List pgsql-performance
This is kind of OT, unless somebody really is concerned with
understanding the + and - of distributions, and is willing to believe
the content of this thread as being accurate and objective... :-)

On 10/04/2009 08:42 PM, Scott Marlowe wrote:
> On Sun, Oct 4, 2009 at 8:05 AM, Mark Mielke<mark@mark.mielke.cc>  wrote:
>
>> So any comparisons between operating system *distributions* should be fair.
>> Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL /
>> CentOS are basically out of the running right now, because they are so old.
>> However, RHEL 6.0 should be out in January or February, 2010, at which point
>> it will be relevant again.
>>
> It's completely fair.  I have a Centos 5.2 machine with 430 or so days
> of uptime.  I put it online, tested it and had it ready 430 days ago
> and it's crashed / hung exactly zero times since.  You're right.  It's
> completely unfair to compare some brand new kernel with unknown
> bugginess and stability issues to my 5.2 machine.  Oh wait, you're
> saying Centos is out of the running because it's old?  That's 110%
> backwards from the way a DBA should be thinking.  First make it
> stable, THEN look for ways to make it perform.  A DB server with
> stability issues is completely useless in a production environment.
>

Maybe - if the only thing the server is running is PostgreSQL. Show of
hands - how many users install ONLY PostgreSQL on a bare-minimum OS
install, and choose not to run any other software? Now, how many people
ALSO run things like PHP, and require software that is less than 3
years old?

>> Personally, I use Fedora, and my servers have been quite stable. One of our
>> main web servers running Fedora:
>>
> It's not that there can't be stable releases of FC, it's that it's not
> the focus of that project.  So, if you get lucky, great!  I can't
> imagine running a production DB on FC, with its short supported life
> span and focus on development and not stability.
>

Depends on requirements. If the application is frozen in time and
doesn't change - sure. If the application keeps evolving and benefits
from new base software, then requiring an upgrade every 12 months or
more out of *application* requirements (not even counting OS support
requirements) may not be unusual. In any case - I'm not telling you to
use Fedora. I'm telling you that I use Fedora, and that RHEL 5 is too
old from an application software perspective for anybody with a
requirement on more than a handful of base OS packages.

>> [mark@bambi]~% uptime
>>   09:45:41 up 236 days, 10:07,  1 user,  load average: 0.02, 0.04, 0.08
>>
>> It was last rebooted as a scheduled reboot, not a crash. This isn't to say
>> Fedora will be stable for you - however, having used both RHEL and Fedora,
>> in many different configurations, I find RHEL being 2+ years behind in terms
>> of being patch-current means that it is NOT as stable on modern hardware.
>>
> It is NOT 2+ years behind on patches.  Any security issues or bugs
> that show up get patched.  Performance enhancing architectural changes
> get to wait for the next version (RHEL6).
>

Not true. They only backport things considered "important". BIND might
be updated. Firefox might be updated. Most packages see no updates. Some
packages see many updates. As I said - the kernel has something like
3000 patches applied against it (although that's a small subset of all
of the changes made to the upstream kernel). It is not true that "any
security issues or bugs that show up get patched." *Some* security
issues or bugs that show up get patched. If they patched everything
back, they wouldn't have a stable release. Also, they cross this line -
performance enhancing architectural changes *are* made, but only if
there is sufficient demand from the customer base. XFS, EXT4, and FUSE
made it into RHEL 5.4. Even so, plenty of open source software is
difficult or impossible to compile for RHEL 5 without re-compiling base
packages or bringing them in from another source. Try compiling
Subversion 1.6.5 with GNOME keyring support on RHEL 5.3 - that was the
last one that busted us. In fact, this one is still open for us.


>> Most recently, I installed RHEL on a 10 machine cluster of HP nodes, and
>> RHEL would not detect the SATA controller out of the box and reverted to
>> some base PIO mode yielding 2 Mbyte/s disk speed.
>>
> Yes, again,  RHEL is focused on making stable, already production
> capable hardware stay up, and stay supported for 5 to 7 years.  It's a
> different focus.
>

Yes, exactly. Which is why a new deployment should align with the
RHEL release cycle. Deploying RHEL 5 today, when RHEL 5 is 3 years old, and
RHEL 6 is coming out in 4 months, means that your RHEL 5 install is not
going to have 5 to 7 years of life starting today. Half the support life
has almost elapsed at this point.
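
To put rough numbers on that, here is a minimal Python sketch. It only
uses the figures quoted above (a release about 3 years old, a support
window of 5 to 7 years); the function name is made up for illustration
and nothing here reflects an official support schedule.

```python
def support_elapsed(age_years, support_years):
    """Return (fraction of the support window already used, years remaining)."""
    fraction = min(1.0, age_years / support_years)
    remaining = max(0.0, support_years - age_years)
    return fraction, remaining


# Figures taken from the paragraph above: a release that is about 3 years
# old, with a support window quoted as 5 to 7 years.
for window in (5, 7):
    fraction, remaining = support_elapsed(3, window)
    print(f"{window}-year window: {fraction:.0%} used, {remaining:g} years left")
```

Depending on which end of the 5 to 7 year range applies, somewhere
between roughly 40% and 60% of the window is already gone on day one.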

>> Fedora was able to achieve
>> 112 Mbyte/s on the same disks. Some twiddling of grub.conf allowed RHEL to
>> achieve the same speed, but the point is that there are two ways to
>> de-stabilize a kernel. One is to use the leading edge, the other is to use
>> the trailing edge.
>>
> Sorry, but your argument does not support this point.  RHEL took
> twiddling to work with the SATA ports.
>

You don't even know what the twiddling is - so not sure what you mean by
"support". Do you think it is a well tested configuration from an RHEL
perspective? If an OS can be considered to "support" hardware whenever
twiddling in grub.conf will get it to work - then we may as well
conclude that all Linux distributions "support" most or all hardware,
and that they are all "stable". I think it "works", I don't think it is
stable. Every time I upgrade the kernel, I have to watch and make sure
that it still works on next boot, and that grubby has propagated my
grub.conf hack forwards to the next install. Great. Makes me feel REAL
comfortable that my configuration is "supported". (Sarcastic in case it
wasn't obvious).
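
The "watch it after every kernel upgrade" chore can at least be
scripted. Below is a minimal Python sketch that scans grub.conf text
and reports any kernel line missing a given boot argument. The actual
tweak is not stated in this thread, so "example.option=1" and the
sample entries are purely hypothetical placeholders.

```python
def kernel_lines_missing_arg(grub_conf_text, arg):
    """Return the kernel lines in grub.conf text that lack the boot argument."""
    missing = []
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comments
        fields = stripped.split()
        # A boot entry's "kernel" line carries the image path and its arguments.
        if fields and fields[0] == "kernel" and arg not in fields[1:]:
            missing.append(stripped)
    return missing


# Made-up sample: the newer entry has lost the (hypothetical) argument.
SAMPLE_GRUB_CONF = """\
default=0
timeout=5
title Newer kernel (made-up entry)
        root (hd0,0)
        kernel /vmlinuz-new ro root=/dev/sda1
        initrd /initrd-new.img
title Older kernel (made-up entry)
        root (hd0,0)
        kernel /vmlinuz-old ro root=/dev/sda1 example.option=1
        initrd /initrd-old.img
"""

for line in kernel_lines_missing_arg(SAMPLE_GRUB_CONF, "example.option=1"):
    print("missing argument:", line)
```

On a real machine the text would come from the grub.conf file itself,
and the check would run after each kernel update rather than at next
boot.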


>> Using an operating system designed for 2007 on hardware
>> designed in 2009 is a bad idea.
>>
> Depends on whether or not it's using the latest and greatest or if
> RHEL has back patched support for newer hardware.  Using RHEL on
> hardware that isn't officially supported is a bad idea.  I agree about
> halfway here, but not really.  I have brand new machines running
> Centos 5.2 with no problem.
>

We also had some HP machines with fancy video cards (dual-headed HDMI
something or other) which I can't even get X to work on with RHEL or
with Fedora, and the machines are probably from ~2006. It depends on if
you stick to standard commodity hardware or fancy obscure stuff. In my
case, I just switched to run level 3 as these were only going to be used
to test some install processes, and were not going to be used as
graphical desktops anyway.

Back in 2006 when I decided on Fedora 8 for one of my servers, it was
because RHEL/CentOS 4.x and Ubuntu would not detect my RAID controller
properly no matter what I tried, and Fedora worked out of the box.

Different people have different experiences. Obviously, some research
and/or lucky choices can improve the odds of a success - but it's sort
of difficult for an operating system to predict what sort of hardware
enhancement will come along two years from now, and prepare for it. Some
enhancements like DDR3 come for free. Other enhancements, such as
hyper-threading, turn out to result in disaster. (For those that didn't
follow the HT problems - many operating systems treated each HT thread
as an independent core, and it could result in two CPU-heavy processes
being assigned to the same single core, leaving the other core idle.)
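
A toy model of that HT problem, as a Python sketch. It is not how any
real scheduler works; the CPU numbering and the core_of mapping (two
physical cores, two hyper-threads each, siblings numbered adjacently)
are assumptions chosen only to show the effect.

```python
def naive_placement(logical_cpus, n_tasks):
    """Treat every logical CPU as an independent core: fill them in order."""
    return logical_cpus[:n_tasks]


def core_aware_placement(logical_cpus, core_of, n_tasks):
    """Prefer logical CPUs whose physical core is not already busy."""
    chosen, used_cores = [], set()
    # First pass: at most one task per physical core.
    for cpu in logical_cpus:
        if len(chosen) == n_tasks:
            break
        if core_of[cpu] not in used_cores:
            chosen.append(cpu)
            used_cores.add(core_of[cpu])
    # Second pass: double up on a core only once every core has a task.
    for cpu in logical_cpus:
        if len(chosen) == n_tasks:
            break
        if cpu not in chosen:
            chosen.append(cpu)
    return chosen


# Hypothetical topology: logical CPUs 0 and 1 share core 0, 2 and 3 share core 1.
core_of = {0: 0, 1: 0, 2: 1, 3: 1}
cpus = [0, 1, 2, 3]

naive = naive_placement(cpus, 2)
aware = core_aware_placement(cpus, core_of, 2)
print("naive:", naive, "-> cores", sorted({core_of[c] for c in naive}))
print("aware:", aware, "-> cores", sorted({core_of[c] for c in aware}))
```

With two CPU-heavy tasks, the naive version puts both on core 0 and
leaves core 1 idle, while the core-aware version spreads them across
both cores.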

SATA/SAS is one that affected lots of platforms, and affected me as I
described. The BIOS usually has some sort of "legacy IDE" mode where it
lets the SATA pretend to be IDE - but this loses many of the benefits
of SATA. For example, my Fedora installs are benefiting from NCQ,
whereas the RHEL 5.3 installs on the same hardware do not know how to
use NCQ.
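
One rough way to see whether command queueing is in play is the queue
depth the kernel exposes in sysfs. The Python sketch below reads
/sys/block/<device>/device/queue_depth; the device name "sda" is an
assumption, the attribute may be absent on some kernels or drivers, and
"depth greater than 1" is only a heuristic, not proof of NCQ.

```python
import os


def queue_depth(device, sys_block="/sys/block"):
    """Return the queue depth reported for a block device, or None if unknown."""
    path = os.path.join(sys_block, device, "device", "queue_depth")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Attribute missing or unreadable: no information available.
        return None


def ncq_probably_active(device, sys_block="/sys/block"):
    """Heuristic: a queue depth above 1 suggests command queueing is enabled."""
    depth = queue_depth(device, sys_block)
    return depth is not None and depth > 1


# "sda" is an assumed device name; adjust for the machine being checked.
print("sda queue depth:", queue_depth("sda"))
print("queueing likely active:", ncq_probably_active("sda"))
```

A disk running in the BIOS "legacy IDE" mode described above would be
expected to report a depth of 1, or no attribute at all.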

You might be fine with CentOS 5.2 on your modern hardware - but I
suspect that your CentOS 5.2 is not making the absolute best use of your
modern hardware. For busy servers, I find "relatime" to be essential.
With CentOS 5.2 - you don't even have that option available to you, and
it has nothing to do with hardware capabilities. Your server is busy
scattering writes across your platters for no benefit to you.
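
To check which access-time policy a filesystem is actually mounted
with, the option list in /proc/mounts can be inspected. A minimal
Python sketch follows; the sample mount table and its paths are made
up, and treating "neither option listed" as plain atime is an
assumption that fits older kernels better than newer ones.

```python
def atime_policy(mounts_text, mount_point):
    """Return 'noatime', 'relatime' or 'atime' for mount_point, None if absent."""
    policy = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Format: device mount_point fstype options dump pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        if "noatime" in options:
            policy = "noatime"
        elif "relatime" in options:
            policy = "relatime"
        else:
            # Neither option listed: assume every read updates atime.
            policy = "atime"
    # If a mount point appears more than once, the last entry wins.
    return policy


# Made-up mount table in /proc/mounts format.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,relatime 0 0
/dev/sda2 /var/lib/pgsql ext3 rw 0 0
/dev/sda3 /tmp ext3 rw,noatime 0 0
"""

for mount_point in ("/", "/var/lib/pgsql", "/tmp"):
    print(mount_point, "->", atime_policy(SAMPLE_MOUNTS, mount_point))
```

On a live system, pass the contents of /proc/mounts instead of the
sample string.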

>> Using a modern operating system
>> on modern hardware that the operating system was designed for will give you
>> the best performance potential.
>>
> True.  But you have to test it hard and prove it's reliable first,
> cause it really doesn't matter how fast it crashes.
>

I think it's prudent to "test it hard" no matter what configuration you
have selected - whether latest available hardware / software, or whether
hardware and software that is 3 years old. It's your (and my)
reputation on the line. I certainly wouldn't say "eh, the hardware and
software is two years old - of course it will work!", and I bet you
wouldn't either.

>> In this case, Fedora 11 with Linux 2.6.30.8
>> is almost guaranteed to out-perform RHEL 5.3 with Linux 2.6.18. Where Fedora
>> is within 1 to 6 months of the leading edge, Ubuntu is within 3 to 9 months
>> of the leading edge, so Ubuntu will perform more similar to Fedora than
>> RHEL.
>>
> And on more exotic hardware (think 8 sockets of 6 core CPUs 128G RAM
> and $1400 RAID controllers) it's usually much less well tested yet,
> and more likely to bite you in the butt.  Nothing like waking up to a
> production issue where 46 of your 48 cores are stuck spinning at 100%
> with some wonderful new kernel bug that's gonna take 6 months to get a
> fix to.  I'll stick to 15% slower but never crashing.  I'll test the
> new OS for sure, but I won't trust it enough to put it into production
> until it's had a month or more of testing to prove it works.  And with
> FC, that's a large chunk of its support lifespan.
>

Yeah - I wouldn't use a "desktop OS" on server hardware without making
sure it worked. I doubt Fedora or Ubuntu are heavily tested on exotic
hardware such as you describe. For anybody not willing to do some
testing, you are definitely right.

>> I've given up on the OS war. People use what they are comfortable with.
>>
> No, I'm not a huge fan of RHEL.  I'd prefer to run debian or ubuntu on
> my servers, but both had some strange bug with multicore systems last
> year, and I was forced to run Centos to get a stable machine.  I'll
> take a look at Solaris / Open Solaris when I get a chance.  And
> FreeBSD now too.  What I'm comfortable with will only matter if it's
> stable and reliable and supportable.
>

Unfortunately, my ventures into Debian/Ubuntu had the same results. The
last time I tried, it wouldn't grok my desired RAID configuration. I
gave it a real try too. Oh well.

Solaris is good - but I think it doesn't have the ability to sustain
market share. I'm not investing any of my efforts into it. Of course, I
put FreeBSD into that category as well. Maybe I'm prejudiced. :-)

>> Comfort lasts until the operating systems screws a person over, at which
>> point they "hate" it, and switch to something else. It's about passion - not
>> about actual metrics, capability, reliability, or any of these other
>> supposed criteria.
>>
> Sorry, but you obviously don't know me, and I'm guessing a lot of
> people on this list, well enough to make that judgement.  Maybe for
> some folks comfort is what matters.  For professionals it's not that
> at all.  It's about stability, reliability, and performance.  Comfort
> is something we just hope we can get in the deal too.
>

It might be venturing on insult, although it isn't meant to be - but I
think if any of us is honest about it - you and I have no idea
whether our installations are truly reliable. They work until they
don't. We cannot see the future. We draw up our own conclusions on what
criteria make a "reliable" system. We read up on statistics and the
reviews done by others and come to a conclusion. We do our best - but
ultimately, we're guessing. We're using our judgement - we're choosing
what conclusion we are comfortable with. We're comfortable until our
hardware / software betrays us, at which point we feel hurt, and we
decide between forgiving the betrayal or boycotting the configuration. :-)

The best advice of all is your advice earlier above. "Test it hard". Try
and make it fail. Do a good enough job here, and our conclusions rest on
more data and less faith. :-)

The configuration can still fail, though. I would even expect it. I
prefer to assume failure and focus on a contingency plan. I know my
disks and power supplies are going to fail. I know my database will be
corrupted. How do I minimize the impact when such a thing does occur, as
it eventually will?

Cheers,
mark

--
Mark Mielke <mark@mielke.cc>

