Thread: Machine available for community use

Machine available for community use

From
"Gavin M. Roy"
Date:
Recently I've been involved in or overheard discussions about SMP
scalability at both the PA PgSQL get together and in some list
traffic.

myYearbook.com would like to make one of our previous production
machines available to established PgSQL Hackers who don't have access
to this level of hardware for testing, benchmarking and development to
work at improving SMP scalability and related projects.

The machine is a HP 585 G1, 8 Core AMD, 32GB RAM with one 400GB 14
Spindle DAS Array dedicated to community use.  I've attached a text
file with dmesg and /proc/cpuinfo output.

I'm working on how this will be set up and am open to suggestions on
how to structure access.

I'm currently in the process of having Gentoo Linux reinstalled on the
box since that is what I am most comfortable administering from a
security perspective.  If this will be a blocker for developers who
would actually work on it, please let me know.

If you're interested in access, my only requirement is that you're a
current PgSQL Hacker with a proven track-record of committing patches
to the community.  This is a resource we could be using for something
else, and I'd like to see the community get direct benefit from it as
opposed to it being a play sandbox for people who want to tinker.

Please let me know thoughts, concerns or suggestions.

Gavin M. Roy
CTO
myYearbook.com
gmr@myyearbook.com

Re: Machine available for community use

From
Tom Lane
Date:
"Gavin M. Roy" <gavinmroy@gmail.com> writes:
> I'm currently in the process of having Gentoo linux reinstalled on the
> box since that is what I am most comfortable administering from a
> security perspective.  If this will be a blocker for developers who
> would actually work on it, please let me know.

Personally I'd prefer almost any of the other Linux distros.
Gentoo always leaves me wondering exactly what I'm running today,
and I think reproducibility is an important attribute for a benchmarking
machine.
        regards, tom lane


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
If you're interested in using the box, name what you want installed.

On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > I'm currently in the process of having Gentoo linux reinstalled on the
> > box since that is what I am most comfortable administering from a
> > security perspective.  If this will be a blocker for developers who
> > would actually work on it, please let me know.
>
> Personally I'd prefer almost any of the other Linux distros.
> Gentoo always leaves me wondering exactly what I'm running today,
> and I think reproducibility is an important attribute for a benchmarking
> machine.
>
>                         regards, tom lane
>


Re: Machine available for community use

From
"Mark Wong"
Date:
On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > I'm currently in the process of having Gentoo linux reinstalled on the
> > box since that is what I am most comfortable administering from a
> > security perspective.  If this will be a blocker for developers who
> > would actually work on it, please let me know.
>
> Personally I'd prefer almost any of the other Linux distros.
> Gentoo always leaves me wondering exactly what I'm running today,
> and I think reproducibility is an important attribute for a benchmarking
> machine.

Tom, have any specific ideas in mind for using the system?  While I'm
used to having more disks it could be useful nonetheless for the tests
I used to run if there are no other ideas.

Rats, I've always liked Gentoo. ;)

Regards,
Mark


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
Note it's a 28 disk system, and I can allocate more if needed, but I
was going to use one MSA for internal use.

On 7/25/07, Mark Wong <markwkm@gmail.com> wrote:
> On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > > I'm currently in the process of having Gentoo linux reinstalled on the
> > > box since that is what I am most comfortable administering from a
> > > security perspective.  If this will be a blocker for developers who
> > > would actually work on it, please let me know.
> >
> > Personally I'd prefer almost any of the other Linux distros.
> > Gentoo always leaves me wondering exactly what I'm running today,
> > and I think reproducibility is an important attribute for a benchmarking
> > machine.
>
> Tom, have any specific ideas in mind for using the system?  While I'm
> used to having more disks it could be useful nonetheless for the tests
> I used to run if there are no other ideas.
>
> Rats, I've always liked Gentoo. ;)
>
> Regards,
> Mark
>


Re: Machine available for community use

From
Tom Lane
Date:
"Gavin M. Roy" <gavinmroy@gmail.com> writes:
> If you're interested in using the box, name what you want installed.

Personally I use Fedora, but that's because of where I work ;-).
I have no objection to some other distro so long as it's one where
other people can duplicate your environment easily (no locally
compiled stuff).  A disadvantage of Fedora is its relatively short
support lifetime --- if you don't want to have to reinstall at least
once a year, something else would be better.
        regards, tom lane


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
Ubuntu server?  Slackware?  Not a fan of Centos, RHEL or Fedora...
What about on the BSD side of things?

On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > If you're interested in using the box, name what you want installed.
>
> Personally I use Fedora, but that's because of where I work ;-).
> I have no objection to some other distro so long as it's one where
> other people can duplicate your environment easily (no locally
> compiled stuff).  A disadvantage of Fedora is its relatively short
> support lifetime --- if you don't want to have to reinstall at least
> once a year, something else would be better.
>
>                         regards, tom lane
>


Re: Machine available for community use

From
Greg Smith
Date:
On Wed, 25 Jul 2007, Gavin M. Roy wrote:

> Ubuntu server?  Slackware?  Not a fan of Centos, RHEL or Fedora...

Unless you did a custom install, using Ubuntu server would expose the
people using your server to the quirks of how the Debian packages for 
PostgreSQL differ from other Linux distributions.  I'm not sure whether 
that would be a good (shine some light on that underdocumented area) or 
bad (get in people's way) thing.  The way they make it easier to manage 
multiple clusters might actually be ideal for what you're trying to do, 
let people have their own cluster and stay out of each other's data space 
at least.

I think Slackware has all the non-mainstream issues of Gentoo, but without 
the advantages Portage brings.

> What about on the BSD side of things?

Since your goal is to improve scalability on Linux, I think you'd be best
focusing on that.  There are just enough low-level differences between the
two that I'd hate to see you put resources into improving scaling, only to 
discover it doesn't actually help what you put into production because the 
platform is too different.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
"Simon Riggs"
Date:
On Wed, 2007-07-25 at 08:50 -0700, Mark Wong wrote:
> On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > > I'm currently in the process of having Gentoo linux reinstalled on the
> > > box since that is what I am most comfortable administering from a
> > > security perspective.  If this will be a blocker for developers who
> > > would actually work on it, please let me know.

Gavin, I'd like access please. This sounds very cool. We'll be able to
show each other directly what's going on, even log on together to
inspect various aspects of runs.

Will you run a booking system?

Could you give us some details about myYearbook.com's application? I
feel we should prioritise work slightly so that the contributor can see
some benefit coming their way in the longer term.

> > Personally I'd prefer almost any of the other Linux distros.
> > Gentoo always leaves me wondering exactly what I'm running today,
> > and I think reproducibility is an important attribute for a benchmarking
> > machine.
> 
> Tom, have any specific ideas in mind for using the system?  While I'm
> used to having more disks it could be useful nonetheless for the tests
> I used to run if there are no other ideas.

Mark, If you're thinking TPC-E, so am I. Where are we with the TPC-E
toolkit you guys were working on?

Initially though, I'd like to do some tests on CVS HEAD with large
shared_buffers settings, so the 32GB RAM will come in handy for that and
no worries about disks.

> Rats, I've always liked Gentoo. ;)

I'd agree with Tom on that: we need a system that remains the same over
longer periods, not simply a very fast one. I'm OK with Fedora.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: Machine available for community use

From
"Mark Wong"
Date:
On 7/25/07, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2007-07-25 at 08:50 -0700, Mark Wong wrote:
> > On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > > > I'm currently in the process of having Gentoo linux reinstalled on the
> > > > box since that is what I am most comfortable administering from a
> > > > security perspective.  If this will be a blocker for developers who
> > > > would actually work on it, please let me know.
>
> Gavin, I'd like access please. This sounds very cool. We'll be able to
> show each other directly what's going on, even log on together to
> inspect various aspects of runs.
>
> Will you run a booking system?
>
> Could you give us some details about myYearbook.com's application? I
> feel we should prioritise work slightly so that the contributor can see
> some benefit coming their way in the longer term.
>
> > > Personally I'd prefer almost any of the other Linux distros.
> > > Gentoo always leaves me wondering exactly what I'm running today,
> > > and I think reproducibility is an important attribute for a benchmarking
> > > machine.
> >
> > Tom, have any specific ideas in mind for using the system?  While I'm
> > used to having more disks it could be useful nonetheless for the tests
> > I used to run if there are no other ideas.
>
> Mark, If you're thinking TPC-E, so am I. Where are we with the TPC-E
> toolkit you guys were working on?
>
> Initially though, I'd like to do some tests on CVS HEAD with large
> shared_buffers settings, so the 32GB RAM will come in handy for that and
> no worries about disks.

Yeah, the C stored functions are half done but there is a finished
implementation for the pl/pgsql stored functions.  It's in decent
shape otherwise, although it's mostly based on the 0.32 version.

> > Rats, I've always liked Gentoo. ;)
>
> I'd agree with Tom on that: we need a system that remains the same over
> longer periods, not simply a very fast one. I'm OK with Fedora.

True, I'll settle for whatever everyone agrees with.

Regards,
Mark


Re: Machine available for community use

From
Greg Smith
Date:
On Wed, 25 Jul 2007, Tom Lane wrote:

> Gentoo always leaves me wondering exactly what I'm running today,
> and I think reproducibility is an important attribute for a benchmarking
> machine.

At this point, there are enough performance variations even between
individual Linux kernel releases that I'm not sure how much 
reproducibility you're ever going to get here.  Are the differences 
between Gentoo and RHEL any bigger than those, say, between RHEL and SuSE?

The idea of setting this up with a long-term stable distribution runs 
counter to one of the things that I think is important to explore here, 
which is testing how more recent Linux kernels have improved their 
scalability.  Do you really want to put a lot of time into identifying and 
working around the source of a problem with the typically older kernels 
that ship with the more stable releases if one answer is "that goes away 
if you use 2.6.21 or later because they fixed the bug that caused it"? 
I've watched that sort of thing happen with PG+Linux, and when involved in 
one of the recent roving talks Gavin mentioned I recall him mentioning a 
bit of that experience himself.  You'd be hard pressed to find a better 
platform for that kind of experimentation than Gentoo.

The best you can hope for, I think, is that you can walk away with some 
general benchmark expectations and "on Gavin's machine, this worked 
better"; then try to replicate that improvement elsewhere.  If you want to 
push bleeding edge performance, I'd expect it's impractical to do that and 
target long-term results stability at the same time.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
Stefan Kaltenbrunner
Date:
Simon Riggs wrote:
> On Wed, 2007-07-25 at 08:50 -0700, Mark Wong wrote:
>> On 7/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> "Gavin M. Roy" <gavinmroy@gmail.com> writes:
>>>> I'm currently in the process of having Gentoo linux reinstalled on the
>>>> box since that is what I am most comfortable administering from a
>>>> security perspective.  If this will be a blocker for developers who
>>>> would actually work on it, please let me know.
> 
> Gavin, I'd like access please. This sounds very cool. We'll be able to
> show each other directly what's going on, even log on together to
> inspect various aspects of runs.
> 
> Will you run a booking system?
> 
> Could you give us some details about myYearbook.com's application? I
> feel we should prioritise work slightly so that the contributor can see
> some benefit coming their way in the longer term.
> 
>>> Personally I'd prefer almost any of the other Linux distros.
>>> Gentoo always leaves me wondering exactly what I'm running today,
>>> and I think reproducibility is an important attribute for a benchmarking
>>> machine.
>> Tom, have any specific ideas in mind for using the system?  While I'm
>> used to having more disks it could be useful nonetheless for the tests
>> I used to run if there are no other ideas.
> 
> Mark, If you're thinking TPC-E, so am I. Where are we with the TPC-E
> toolkit you guys were working on?
> 
> Initially though, I'd like to do some tests on CVS HEAD with large
> shared_buffers settings, so the 32GB RAM will come in handy for that and
> no worries about disks.
> 
>> Rats, I've always liked Gentoo. ;)
> 
> I'd agree with Tom on that: we need a system that remains the same over
> longer periods, not simply a very fast one. I'm OK with Fedora.

Fedora is probably not a prime example of "stays the same over a long period"
(which I think is important) since it has pretty short release cycles.
Maybe something like Ubuntu LTS, Debian Etch or even CentOS would be
more appropriate (we have Debian on a number of very similar HP boxes
and HP is doing Debian support now too).


Stefan


Re: Machine available for community use

From
Stefan Kaltenbrunner
Date:
Greg Smith wrote:
> On Wed, 25 Jul 2007, Gavin M. Roy wrote:
> 
>> Ubuntu server?  Slackware?  Not a fan of Centos, RHEL or Fedora...
> 
> Unless you did a custom install, using Ubuntu server would expose the
> people using your server to the quirks of how the Debian packages for
> PostgreSQL differ from other Linux distributions.  I'm not sure whether
> that would be a good (shine some light on that underdocumented area) or
> bad (get in people's way) thing.  The way they make it easier to manage
> multiple clusters might actually be ideal for what you're trying to do,
> let people have their own cluster and stay out of each other's data
> space at least.

For a server like this I don't think anybody cares at all about the
prepackaged PostgreSQL. People are likely to use such a box for
development and testing of new patches, so what they
need is a proper toolchain and solid packages. Debian-derived
distributions are usually quite good at that (Debian Etch ships with gcc
3.3, gcc 3.4 and gcc 4.1, for example) and I expect people to simply get
their accounts and do all their work in their home directories
anyway (which sounds like the normal way to develop on Unix-like OSes to me).


Stefan


Re: Machine available for community use

From
Tom Lane
Date:
Greg Smith <gsmith@gregsmith.com> writes:
> Unless you did a custom install, using Ubuntu server would expose the
> people using your server to the quirks of how the Debian packages for 
> PostgreSQL differ from other Linux distributions.

I doubt we'd be doing much work with the distro-installed version of
Postgres anyway, so this doesn't seem like a big concern.  In fact,
to avoid confusion it might be best if the machine has no
distro-installed Postgres at all.  That would help avoid "oops, that
test was run against the wrong server" syndrome.

I do essentially all my development work with installations that are
--prefix'd to user directories and started/stopped by hand; it's just
a lot easier to manage a pile of different versions that way.  Plus
I never need to become root.  Not sure how other developers work,
though.
        regards, tom lane
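
The workflow described above (a build configured with --prefix into a user directory, then started and stopped by hand without root) can be sketched as follows. This is only an illustration, not anyone's actual script: the function name, the directory layout under the prefix and the port number are assumptions. The command-line options used (configure --prefix, initdb -D, pg_ctl -D/-l/-o) are standard PostgreSQL ones.

```python
from pathlib import Path


def lifecycle_commands(prefix: Path, port: int) -> dict:
    """Return the commands for one private PostgreSQL install under `prefix`.

    Assumed layout (hypothetical): binaries in <prefix>/bin, the cluster in
    <prefix>/data and the server log in <prefix>/server.log.
    """
    bindir = prefix / "bin"
    datadir = prefix / "data"
    logfile = prefix / "server.log"
    return {
        # Run from the top of a source tree; installs everything under prefix.
        "configure": ["./configure", f"--prefix={prefix}"],
        "build": ["make"],
        "install": ["make", "install"],
        # Create a cluster owned by the current user, so root is never needed.
        "initdb": [str(bindir / "initdb"), "-D", str(datadir)],
        # -o passes options to the server; here it selects a private port.
        "start": [str(bindir / "pg_ctl"), "-D", str(datadir),
                  "-l", str(logfile), "-o", f"-p {port}", "start"],
        "stop": [str(bindir / "pg_ctl"), "-D", str(datadir), "stop"],
    }


# Print the command sequence for a hypothetical install in ~/pg/head.
for step, argv in lifecycle_commands(Path.home() / "pg" / "head", 5440).items():
    print(f"{step:10s} {' '.join(argv)}")
```

Because each install lives entirely under its own prefix, removing a version is just a matter of stopping it and deleting that directory.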


Re: Machine available for community use

From
Tom Lane
Date:
Greg Smith <gsmith@gregsmith.com> writes:
> On Wed, 25 Jul 2007, Tom Lane wrote:
>> Gentoo always leaves me wondering exactly what I'm running today,
>> and I think reproducibility is an important attribute for a benchmarking
>> machine.

> At this point, there's enough performance variations even between 
> individual Linux kernel releases that I'm not sure how much 
> reproducibility you're ever going to get here.  Are the differences 
> between Gentoo and RHEL any bigger than those, say, between RHEL and SuSE?

The problem I've got with Gentoo is that it encourages homegrown builds
with randomly-chosen options and compiler switches.  That may help
squeeze out a bit more speed but it does nothing for stability, nor
reproducibility of results on other platforms which is what we really
care about here.

Another fairly big issue is that we need to know whether measurements we
take in August are comparable to measurements we take in October, so a
fairly stable platform is important.  As you say, a fast-changing kernel
would make it difficult to have any confidence about comparability over
time.  That would tend to make me vote for RHEL/Centos, where long-term
stability is an explicit development goal.  Debian stable might do too,
though I'm not as clear about their update criteria as I am about Red Hat's.

> The idea of setting this up with a long-term stable distribution runs 
> counter to one of the things that I think is important to explore here, 
> which is testing how more recent Linux kernels have improved their 
> scalability.

Dunno if Gavin wants to manage multiple systems, but for most of what
I'd like to do a bleeding-edge kernel is exactly what I do not want.
        regards, tom lane


Re: Machine available for community use

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> Greg Smith <gsmith@gregsmith.com> writes:
>> On Wed, 25 Jul 2007, Tom Lane wrote:
>>> Gentoo always leaves me wondering exactly what I'm running today,
>>> and I think reproducibility is an important attribute for a benchmarking
>>> machine.
> 
>> At this point, there's enough performance variations even between 
>> individual Linux kernel releases that I'm not sure how much 
>> reproducibility you're ever going to get here.  Are the differences 
>> between Gentoo and RHEL any bigger than those, say, between RHEL and SuSE?
> 
> The problem I've got with Gentoo is that it encourages homegrown builds
> with randomly-chosen options and compiler switches.  That may help
> squeeze out a bit more speed but it does nothing for stability, nor
> reproducibility of results on other platforms which is what we really
> care about here.
> 
> Another fairly big issue is that we need to know whether measurements we
> take in August are comparable to measurements we take in October, so a
> fairly stable platform is important.  As you say, a fast-changing kernel
> would make it difficult to have any confidence about comparability over
> time.  That would tend to make me vote for RHEL/Centos, where long-term
> stability is an explicit development goal.  Debian stable might do too,
> though I'm not as clear about their update criteria as I am about Red Hat's.

Fully agreed (on the RH/CentOS and long-term stability stuff). Debian is
usually even stricter/more conservative than RH - they only ship security
fixes and, on very rare occasions, bugfixes for major issues (RH sometimes
adds new features and stuff in their point releases).
Debian Etch seems to be (very) slightly relaxing that - and in fact a
number of people were very surprised to see PostgreSQL updated from
8.1.8 (as shipped in Etch) to 8.1.9 with the latest security release :-)
I would agree, however, that Gentoo and also Slackware are not "that"
attractive for this kind of work.


Stefan


Re: Machine available for community use

From
"Dave Page"
Date:

> ------- Original Message -------
> From: Tom Lane <tgl@sss.pgh.pa.us>
> To: Greg Smith <gsmith@gregsmith.com>
> Sent: 25/07/07, 18:54:50
> Subject: Re: [HACKERS] Machine available for community use
> 
> Another fairly big issue is that we need to know whether measurements we
> take in August are comparable to measurements we take in October, so a
> fairly stable platform is important.  As you say, a fast-changing kernel
> would make it difficult to have any confidence about comparability over
> time.  That would tend to make me vote for RHEL/Centos, where long-term
> stability is an explicit development goal.  Debian stable might do too,
> though I'm not as clear about their update criteria as I am about Red Hat's.

Perhaps RH could donate us a RHEL/RHN licence for this?

/D


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
One thing to take into account is I don't have physical access to the
box (It is in TX, I am in PA).  All installs but Gentoo will be
performed by a well trained NOC monkey. *cough*

On 7/25/07, Dave Page <dpage@postgresql.org> wrote:
>
>
> > ------- Original Message -------
> > From: Tom Lane <tgl@sss.pgh.pa.us>
> > To: Greg Smith <gsmith@gregsmith.com>
> > Sent: 25/07/07, 18:54:50
> > Subject: Re: [HACKERS] Machine available for community use
> >
> > Another fairly big issue is that we need to know whether measurements we
> > take in August are comparable to measurements we take in October, so a
> > fairly stable platform is important.  As you say, a fast-changing kernel
> > would make it difficult to have any confidence about comparability over
> > time.  That would tend to make me vote for RHEL/Centos, where long-term
> > stability is an explicit development goal.  Debian stable might do too,
> > though I'm not as clear about their update criteria as I am about Red Hat's.
>
> Perhaps RH could donate us a RHEL/RHN licence for this?
>
> /D
>


Re: Machine available for community use

From
Tom Lane
Date:
"Dave Page" <dpage@postgresql.org> writes:
> Perhaps RH could donate us a RHEL/RHN licence for this?

I could ask, if there's consensus we want it.  It sounded like more
people like Debian, though.
        regards, tom lane


Re: Machine available for community use

From
Stefan Kaltenbrunner
Date:
Gavin M. Roy wrote:
> One thing to take into account is I don't have physical access to the
> box (It is in TX, I am in PA).  All installs but Gentoo will be
> performed by a well trained NOC monkey. *cough*

iLO ?

Stefan


Re: Machine available for community use

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> "Dave Page" <dpage@postgresql.org> writes:
>> Perhaps RH could donate us a RHEL/RHN licence for this?
> 
> I could ask, if there's consensus we want it.  It sounded like more
> people like Debian, though.

well a RHEL/RHN licence would not be a bad thing either (and I guess
it's also a fairly common database-on-linux platform).


Stefan


Re: Machine available for community use

From
"Simon Riggs"
Date:
On Wed, 2007-07-25 at 19:35 +0200, Stefan Kaltenbrunner wrote:
> >> Rats, I've always liked Gentoo. ;)
> > 
> > I'd agree with Tom on that: we need a system that remains the same over
> > longer periods, not simply a very fast one. I'm OK with Fedora.
> 
> fedora is probably not a prime example for "stays same over long period"
> (which I think is important) since it has pretty short release cycles.
> Maybe something like ubuntu LTS, Debian Etch or even CentOS would be
> more appropriate (we have debian on a number of very similar HP boxes
> and HP is doing Debian Support now too).

OK... Gavin please arbitrate: 'tis your box. I'm a DB tech, don't really
care about OS.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: Machine available for community use

From
Greg Smith
Date:
On Wed, 25 Jul 2007, Tom Lane wrote:

> The problem I've got with Gentoo is that it encourages homegrown builds
> with randomly-chosen options and compiler switches.

It encourages it, but it certainly doesn't require it.  Knowing that this 
is a NOC machine, I don't think there's going to be a lot of fiddling with 
custom builds.

> That would tend to make me vote for RHEL/Centos, where long-term 
> stability is an explicit development goal.  Debian stable might do too, 
> though I'm not as clear about their update criteria as I am about Red 
> Hat's.

RHEL is certainly on the stable-at-the-expense-of-slow-to-change side of
things, and Debian stable is even slower.  However, at this very moment, 
there have been very recent refreshes from just about everybody such that 
the options available are very similar.  Here's the current state of 
things:

RHEL 5.0:  March 2007, kernel 2.6.18, glibc 2.5
Debian Stable 4.0:  April 2007, kernel 2.6.18, glibc 2.3.6
Ubuntu 7.04:  April 2007, kernel 2.6.20, glibc 2.5
Gentoo 2007.0:  May 2007, kernel 2.6.19, glibc 2.5

(http://distrowatch.com is the best site to drill through details like 
this if anyone else wants to dig further/double-check me here)

I would hate to see this system installed with any kernel <2.6.18 or with 
glibc<2.5 because that's clearly where the line of current generation 
releases starts.  I'd consider Debian Stable a poor choice accordingly. 
I don't think you're going to see a lot of difference right now between
RHEL 5/Gentoo 2007.0/Ubuntu 7.04; all the major packages and kernels are
really similar.  A year from now, there would be much more divergence if
a fresh install were done with the then-current versions of each, but
there's nothing that says the system has to be upgraded then.
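
The cutoff stated above (no kernel older than 2.6.18, no glibc older than 2.5) can be checked mechanically against the four releases listed. The sketch below is only an illustration: the version numbers are copied from that list, while the function and constant names are made up for the example.

```python
# Minimum versions from the message above.
MIN_KERNEL = (2, 6, 18)
MIN_GLIBC = (2, 5)


def parse_version(text):
    """Turn a dotted version such as '2.6.18' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_cutoff(kernel, glibc):
    """True if both the kernel and glibc are at or above the minimums."""
    return (parse_version(kernel) >= MIN_KERNEL
            and parse_version(glibc) >= MIN_GLIBC)


# (kernel, glibc) pairs as given in the list above.
DISTROS = {
    "RHEL 5.0": ("2.6.18", "2.5"),
    "Debian Stable 4.0": ("2.6.18", "2.3.6"),
    "Ubuntu 7.04": ("2.6.20", "2.5"),
    "Gentoo 2007.0": ("2.6.19", "2.5"),
}

for name, (kernel, glibc) in DISTROS.items():
    verdict = "ok" if meets_cutoff(kernel, glibc) else "below cutoff"
    print(f"{name:18s} kernel {kernel:7s} glibc {glibc:6s} {verdict}")
```

Comparing tuples of integers rather than strings matters here: as plain strings, "2.6.9" would sort after "2.6.18".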

I think the main argument for either Gentoo or Ubuntu over RHEL/Centos
comes down to ease of installing additional packages to support building 
the kinds of random software that you end up needing on a development 
system.  Not the core code, but the add-on packages needed to run the 
various benchmark/monitoring packages people may want.  To pick a random 
example, the last time I was using an older SuSE system it was a pain to 
get DBT2 running on it, and I ended up having to build the documentation 
on another system altogether because it was easier than sorting out a 
weird RPM issue I ran into.

Pulling packages from the Ubuntu universe with apt-get is usually trivial 
and the available package base is very broad.  Running emerge to get new 
things into Gentoo is normally straightforward.  RPM-based installs on 
RHEL are still sometimes tricky, and my take on the breadth of the 
official repositories is that they're not as wide.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
"Simon Riggs"
Date:
On Wed, 2007-07-25 at 14:32 -0400, Tom Lane wrote:
> "Dave Page" <dpage@postgresql.org> writes:
> > Perhaps RH could donate us a RHEL/RHN licence for this?
> 
> I could ask, if there's consensus we want it. 

Please.

>  It sounded like more
> people like Debian, though.

Well, if you don't we probably will go Debian. 

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: Machine available for community use

From
"Gavin M. Roy"
Date:
If RH can sponsor a license of RHEL I'm inclined to go there.  Not
that it was offered, but I think Dave's suggestion was Tom could field
that for the box if inclined.  If I'm wrong, let me know.  If that
can't happen, would people prefer CentOS or Ubuntu Server?  The people
I'm most concerned with are the people who will actually use it.  If
you consider yourself one of those people, pipe in now, I will tally
votes and go from there.  From a Gentoo side, I would have kept things
pretty stable, but I'd rather developers be comfortable with the
environment which will encourage you to use it.  I'm not interested in
running Debian, which I'm happy to talk about off topic, in private,
if anyone cares enough to want to discuss it.

What I'm most interested in to touch on Simon's request is SMP
scaling.  From another Hackers thread this month, which I can dig up,
I've walked away with the impression that after 4 cores, we don't see
the same level of per-processor performance improvement that we see <=
4 cores.  What you actually do is up to you, we want to provide this
to the hacker community to use as they see fit to continue to improve
PostgreSQL which is integral to our operation.  Any performance,
scalability or even advocacy efforts (read benchmarking) will benefit
myYearbook.

Gavin
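
One way to put a number on the "per-processor improvement" mentioned above is to divide throughput per core at each core count by throughput per core at the smallest count measured. The sketch below assumes benchmark results arrive as a mapping from core count to transactions per second; that input format and the function name are illustrative assumptions, and no measured figures are implied.

```python
def per_core_efficiency(throughput):
    """Return {cores: efficiency} for a {cores: transactions_per_second} map.

    Efficiency is throughput per core relative to the smallest core count
    measured, so 1.0 means perfectly linear scaling and lower values mean
    each added core contributes less.
    """
    base_cores = min(throughput)
    base_rate = throughput[base_cores] / base_cores
    return {
        cores: (tps / cores) / base_rate
        for cores, tps in sorted(throughput.items())
    }
```

An efficiency that stays near 1.0 up to 4 cores and then drops at 8 would match the impression described above; the actual figures would have to come from runs on the machine.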


On 7/25/07, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2007-07-25 at 14:32 -0400, Tom Lane wrote:
> > "Dave Page" <dpage@postgresql.org> writes:
> > > Perhaps RH could donate us a RHEL/RHN licence for this?
> >
> > I could ask, if there's consensus we want it.
>
> Please.
>
> >  It sounded like more
> > people like Debian, though.
>
> Well, if you don't we probably will go Debian.
>
> --
>   Simon Riggs
>   EnterpriseDB  http://www.enterprisedb.com
>


Re: Machine available for community use

From
Gregory Stark
Date:
"Greg Smith" <gsmith@gregsmith.com> writes:

> On Wed, 25 Jul 2007, Tom Lane wrote:
>
>> The problem I've got with Gentoo is that it encourages homegrown builds
>> with randomly-chosen options and compiler switches.
>
> It encourages it, but it certainly doesn't require it.  Knowing that this is a
> NOC machine, I don't think there's going to be a lot of fiddling with custom
> builds.

Does Gentoo these days have binary packages? Source packages do implicitly
require custom builds because even if you don't fiddle with compiler switches
or other options you end up with a different build than someone who had a
different set of libraries installed when they installed it.

>> That would tend to make me vote for RHEL/Centos, where long-term stability is
>> an explicit development goal.  Debian stable might do too, though I'm not as
>> clear about their update criteria as I am about Red Hat's.

Personally I'm a huge fan of Debian but even with that I think for this
situation I would actually agree that Red Hat is a better fit in that it's
"canonical". You can tell someone else to install Red Hat vFoo and know they'll
have precisely the same set of packages with the same set of services running.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Machine available for community use

From
Andrew Dunstan
Date:

Tom Lane wrote:
> I do essentially all my development work with installations that are
> --prefix'd to user directories and started/stopped by hand; it's just
> a lot easier to manage a pile of different versions that way.  Plus
> I never need to become root.  Not sure how other developers work,
> though.
>

That's exactly how I work - I have a set of source trees and a script 
that invokes configure with port and prefix arguments to make sure they 
don't collide.
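
A minimal sketch of the kind of script described above, written in Python and assuming each source tree gets a prefix named after its directory and a port offset from a base port. The function names, the base port 5500 and the example paths are hypothetical; only `--prefix` and `--with-pgport` are real configure options.

```python
from pathlib import PurePosixPath


def configure_command(tree, base_prefix, port):
    """Return the configure argv for one source tree."""
    # The install prefix is named after the tree's directory (assumption).
    prefix = PurePosixPath(base_prefix) / PurePosixPath(tree).name
    return ["./configure", f"--prefix={prefix}", f"--with-pgport={port}"]


def plan_builds(trees, base_prefix, base_port=5500):
    """Map each source tree to a configure command with its own prefix and port."""
    plan = {}
    # Sorting makes the port assignment stable from run to run.
    for offset, tree in enumerate(sorted(trees)):
        plan[tree] = configure_command(tree, base_prefix, base_port + offset)
    return plan


if __name__ == "__main__":
    for tree, argv in plan_builds(["src/pg82", "src/head"], "/home/dev/pg").items():
        print(tree, " ".join(argv))
```

Prefixes stay distinct only if the tree directory names are distinct, and ports only if nothing else on the box uses the same range.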

Like you I do almost all my work on some edition of Fedora - not always 
the latest by any means (e.g. currently it's FC6).

My vote would be for RHEL5/CentOS5 (they are basically the same thing - 
CentOS is RHEL with the RH badging removed, for the most part, and you 
don't need a RHN subscription). I think that would be a good combination 
of stability and currency.

cheers

andrew


Re: Machine available for community use

From
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> If RH can sponsor a license of RHEL I'm inclined to go there.

I'm checking into this, but it may take a few days to get an answer
(particularly since I'm planning to take Friday through Monday off).
        regards, tom lane


Re: Machine available for community use

From
Greg Smith
Date:
On Wed, 25 Jul 2007, Gregory Stark wrote:

> Does gentoo these days have binary packages? source packages do implicitly
> require custom builds...

You can install with binaries now so it doesn't take forever to get 
started, but the minute you're adding/updating you're going to be 
building.  The main point I was trying to make is that if you don't do 
anything special to customize the standard Gentoo compilation setup, the 
amount of variation between Gentoo builds on different machines isn't 
significantly greater than that which exists between the various Linux 
distributions.  One could make a case that the big glibc differences 
between Debian Stable and everybody else right now provides a similar 
scale of variation in results that would impact reproducibility.

> for this situation I would actually agree that Redhat is a better fit in 
> that it's "canonical".

I threw out some criticism suggesting where RedHat is at a slight 
disadvantage for completeness' sake, and so Gavin wasn't completely alone 
in expressing some distaste for the issues it introduces compared to 
Gentoo (potentially harder package installation and less flexibility for 
running bleeding-edge kernels with RHEL).  His preference for Gentoo is 
completely defensible if you understand his priorities, and I'd hate to 
see a knee-jerk reaction against that distribution based just on how 
Gentoo can be abused and how it differs from other Linux variants.

But I run RHEL&Centos on several machines so I certainly wouldn't go so 
far as to argue against it being appropriate here.  The nice thing about 
RedHat and its clones is that even when you run into a situation where 
packages might be harder to install than you'd like them to be, the 
userbase is so big and skilled that the problems are usually visible (odds 
are good other people are running into the issue as well), reproducible on 
other builds, and you can get plenty of help resolving them.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
"Joshua D. Drake"
Date:
Tom Lane wrote:
> "Gavin M. Roy" <gmr@myyearbook.com> writes:
>> If RH can sponsor a license of RHEL I'm inclined to go there.
> 
> I'm checking into this, but it may take a few days to get an answer
> (particularly since I'm planning to take Friday through Monday off).
> 

Well if we go RHEL why not CentOS5 and just call it good?

Joshua D. Drake


>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 


-- 
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/



Re: Machine available for community use

From
Gregory Stark
Date:
"Greg Smith" <gsmith@gregsmith.com> writes:

> On Wed, 25 Jul 2007, Gregory Stark wrote:
>
>> Does gentoo these days have binary packages? source packages do implicitly
>> require custom builds...
>
> You can install with binaries now so it doesn't take forever to get started,
> but the minute you're adding/updating you're going to be building.  The main
> point I was trying to make is that if you don't do anything special to
> customize the standard Gentoo compilation setup, the amount of variation
> between Gentoo builds on different machines isn't significantly greater than
> that which exists between the various Linux distributions.  One could make a
> case that the big glibc differences between Debian Stable and everybody else
> right now provides a similar scale of variation in results that would impact
> reproducibility.

Well even so another Debian system with the same set of packages (at the same
version) will be equivalent to mine.

Whereas a gentoo system will depend on the order that the packages were
installed. If you installed kerberos while you had an older version of the
compiler or crypto libraries installed and then upgraded the crypto library or
compiler then your kerberos library will differ from mine which was compiled
by a different compiler or against a different set of crypto headers.

So for me to reproduce your environment you would have to send me the complete
history of what packages you installed. I would have to reproduce the entire
history including installing and building intermediate versions.
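
The build-order argument can be illustrated with a toy model. This is only a sketch under stated assumptions: a source-built package is "fingerprinted" from its own version plus the versions of its build-time dependencies that are installed at the moment it is built. The package names, version numbers and function names are made up for illustration and do not describe how any real package manager works.

```python
import hashlib


def build_fingerprint(name, version, deps, installed):
    """Fingerprint a build from its version and the dependency versions present now."""
    parts = [f"{name}={version}"]
    # Each dependency must already be installed when the package is built.
    parts += [f"{dep}={installed[dep][0]}" for dep in sorted(deps)]
    return hashlib.sha256(";".join(parts).encode()).hexdigest()[:12]


def replay(history, deps_of):
    """Apply install/upgrade steps in order; return {name: (version, fingerprint)}."""
    installed = {}
    for name, version in history:
        fingerprint = build_fingerprint(name, version, deps_of.get(name, ()), installed)
        installed[name] = (version, fingerprint)
    return installed


if __name__ == "__main__":
    deps_of = {"kerberos": ("gcc", "openssl")}
    # Machine A built kerberos early, then upgraded the compiler and crypto library.
    a = replay([("gcc", "3.4"), ("openssl", "0.9.7"), ("kerberos", "1.5"),
                ("gcc", "4.1"), ("openssl", "0.9.8")], deps_of)
    # Machine B installed the newest versions directly.
    b = replay([("gcc", "4.1"), ("openssl", "0.9.8"), ("kerberos", "1.5")], deps_of)
    print({n: v for n, (v, _) in a.items()} == {n: v for n, (v, _) in b.items()})
    print(a["kerberos"][1] == b["kerberos"][1])
```

In this model both machines report the same installed versions, yet the kerberos build differs, which is why the install history would be needed to reproduce the environment.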

> I threw out some criticism suggesting where RedHat is at a slight disadvantage
> for completeness sake, and so Gavin wasn't completely alone at expressing some
> distaste for the issues it introduces compared to Gentoo (potentially harder
> package installation and less flexibility for running bleeding-edge kernels with
> RHEL).  

Sure, that's why I run Debian and get really annoyed whenever I use a Redhat
system. On Redhat I'm forever saying "where's this utility" or "why is this
program 6 months out of date?". But that's a personal desktop machine. This is
a shared resource that shouldn't be constantly changing or having new versions
of stuff installed.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Machine available for community use

From
Greg Smith
Date:
On Thu, 26 Jul 2007, Gregory Stark wrote:

> So for me to reproduce your [Gentoo] environment you would have to send 
> me the complete history of what packages you installed. I would have to 
> reproduce the entire history including installing and building 
> intermediate versions.

If one's goal is to be able to make several copies of a server run 
completely identical builds of all software down to the build order level, 
then Gentoo obviously makes that more difficult than other distributions. 
It's easier if you build each replicant at the same time and then keep 
them synchronized, but perfectly cloning a machine that's already out there 
and has been through a series of updates is as challenging as you 
describe.  If the primary goal here was reproducible benchmarks where you 
needed SPEC-submission level version control, Gentoo would be a completely 
inappropriate choice.

But this is pushing forward PostgreSQL development you're doing here.  If 
you've got a problem such that something works differently based on the 
order in which you built the packages, which is going to be unique to 
every Linux distribution already, that is itself noteworthy and deserves 
engineering out.  You might think of this high-end machine being a little 
different as usefully adding diversity robustness in a similar way to how 
the buildfarm helps improve the core right now.

I think I have to exit this discussion before I start sounding like a 
Gentoo fanboi and make my Linux consulting clients nervous.  Go RedHat!

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
Tom Lane
Date:
Greg Smith <gsmith@gregsmith.com> writes:
> But this is pushing forward PostgreSQL development you're doing here.  If 
> you've got a problem such that something works differently based on the 
> order in which you built the packages, which is going to be unique to 
> every Linux distribution already, that is itself noteworthy and deserves 
> engineering out.  You might think of this high-end machine being a little 
> different as usefully adding diversity robustness in a similar way to how 
> the buildfarm helps improve the core right now.

Actually, the thing that's concerning me is *exactly* lack of diversity.
If we have just one of these things then there's a significant risk of
unconsciously tuning PG towards that specific platform.  I'd rather we
take that risk with a well-standardized, widely used platform than with
something no one else can reproduce.

Really there's a pretty good argument for having several different OS'es
available on the box --- I wonder whether Gavin is up to managing some
sort of VM or multiboot setup.
        regards, tom lane


Re: Machine available for community use

From
"Joshua D. Drake"
Date:
Tom Lane wrote:
> Greg Smith <gsmith@gregsmith.com> writes:

> Really there's a pretty good argument for having several different OS'es
> available on the box --- I wonder whether Gavin is up to managing some
> sort of VM or multiboot setup.

IMO, a multiboot is o.k. but a vm isn't worth it. This box is big enough 
to actually start looking at SMP and I/O issues for PostgreSQL that 
we normally can't because we don't have access to the hardware in the 
community.

Sincerely,

Joshua D. Drake



> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq
> 





Re: Machine available for community use

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Tom Lane wrote:
>> Really there's a pretty good argument for having several different OS'es
>> available on the box --- I wonder whether Gavin is up to managing some
>> sort of VM or multiboot setup.

> IMO, a multiboot is o.k. but a vm isn't worth it.

Yeah, multiboot would be better --- otherwise you have to wonder if the
vm is affecting performance at all.  But I suppose multiboot would be
harder to manage.
        regards, tom lane


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
Let me look at what makes sense there, I am open to it.

On 7/26/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Smith <gsmith@gregsmith.com> writes:
> > But this is pushing forward PostgreSQL development you're doing here.  If
> > you've got a problem such that something works differently based on the
> > order in which you built the packages, which is going to be unique to
> > every Linux distribution already, that is itself noteworthy and deserves
> > engineering out.  You might think of this high-end machine being a little
> > different as usefully adding diversity robustness in a similar way to how
> > the buildfarm helps improve the core right now.
>
> Actually, the thing that's concerning me is *exactly* lack of diversity.
> If we have just one of these things then there's a significant risk of
> unconsciously tuning PG towards that specific platform.  I'd rather we
> take that risk with a well-standardized, widely used platform than with
> something no one else can reproduce.
>
> Really there's a pretty good argument for having several different OS'es
> available on the box --- I wonder whether Gavin is up to managing some
> sort of VM or multiboot setup.
>
>                         regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>


Re: Machine available for community use

From
"Joshua D. Drake"
Date:
Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> Tom Lane wrote:
>>> Really there's a pretty good argument for having several different OS'es
>>> available on the box --- I wonder whether Gavin is up to managing some
>>> sort of VM or multiboot setup.
> 
>> IMO, a multiboot is o.k. but a vm isn't worth it.
> 
> Yeah, multiboot would be better --- otherwise you have to wonder if the
> vm is affecting performance at all.  But I suppose multiboot would be
> harder to manage.

Personally, I think CentOS 5 is probably the most reasonable choice. It 
is what (or RHEL 5 which is the same) a good portion of our community is 
going to be running. It is also easy to work with.

Another alternative would be Debian or Ubuntu Dapper but they are all 
really the same thing :). The nice thing is any of these three are 
fairly static installs that are going to be reasonably predictable.

Joshua D. Drake

> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
> 





Re: Machine available for community use

From
Stephen Frost
Date:
* Joshua D. Drake (jd@commandprompt.com) wrote:
> Personally, I think CentOS 5 is probably the most reasonable choice. It is
> what (or RHEL 5 which is the same) a good portion of our community is going
> to be running. It is also easy to work with.
>
> Another alternative would be Debian or Ubuntu Dapper but they are all
> really the same thing :). The nice thing is any of these three are fairly
> static installs that are going to be reasonably predictable.

If we can generally agree on "Linux" then it might be reasonable to
consider using either VServers or just regular chroots with different
OSes loaded (when/if we want to look at a particular OS).  There'd be
little to no performance impact from such a solution while we'd still
have different OSes to play with.

Of course, the kernel would be the same for all of them, so if that's
what we're interested mostly in testing/stressing then it's no good.  I
got the impression from some that various gcc builds, glibc versions,
etc, would be good to test though and a VServer or chroot setup could
work well for that.

As a Debian Developer, I have to also say that Debian would be my
choice. :)  Though I've got a number of big toys to play w/ at work
already so it's unlikely I'd have need of this system (not to mention
that most of the stuff I work on in PG is usability rather than things
like large-scale performance, currently anyway).
Thanks,
    Stephen

Re: Machine available for community use

From
Greg Smith
Date:
On Thu, 26 Jul 2007, Joshua D. Drake wrote:

> IMO, a multiboot is o.k. but a vm isn't worth it. This box is big enough to 
> actually start looking at SMP and I/O issues for PostgreSQL that we 
> normally can't because we don't have access to the hardware in the community.

Certainly agree with that; VM overhead is much lower than it used to be, 
but it's still going to fuzz exactly the kind of performance results that 
this box would be most useful for exploring.

What I normally do in this situation is create a second primary partition 
on the boot drive with around 10GB of space on it that doesn't get touched 
by the initial OS install.  Then it's straightforward to install a second 
Linux into there; the only time that gets tricky is if you're doing two 
RedHat style installs because of how they mount partitions by label.  A 
little bit of GRUB merging after the second install, and now you've got a 
dual-boot system.  Even in a NOC setup where you don't see the boot menu, 
you'd just have to change the grub.conf default and reboot in order to 
switch between the two.
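
The grub.conf change mentioned above can be sketched in Python. This assumes a GRUB legacy style file with a top-level `default=N` (or `default N`) line, where N is the zero-based index of the `title` entry to boot. The function name and the sample file contents are hypothetical, and the sketch only transforms text; writing the file back and rebooting are left out.

```python
import re

# Matches a top-level "default=N" or "default N" line, without consuming the newline.
DEFAULT_LINE = re.compile(r"^default[ \t]*=?[ \t]*\d+[ \t]*$", re.MULTILINE)


def set_grub_default(conf_text, entry):
    """Return conf_text with its default boot entry set to the given index."""
    if entry < 0:
        raise ValueError("entry index must be non-negative")
    new_text, count = DEFAULT_LINE.subn(f"default={entry}", conf_text, count=1)
    if count == 0:
        # Refuse to guess if the file has no numeric default line.
        raise ValueError("no numeric default line found")
    return new_text


if __name__ == "__main__":
    sample = (
        "default=0\n"
        "timeout=5\n"
        "title First Linux install\n"
        "    root (hd0,0)\n"
        "title Second Linux install\n"
        "    root (hd0,1)\n"
    )
    print(set_grub_default(sample, 1))
```

Raising an error when no numeric default line exists is a deliberate choice: on a remote box with no view of the boot menu, a silent no-op would be harder to notice than a failed edit.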

As long as a bootable partition of reasonable size is set aside like this, 
there's all kinds of flexibility for being able to confirm results apply 
to multiple Linux distributions in the future.  You might even put a BSD 
or Solaris in that space one day.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
Jim Nasby
Date:
Where in Texas? I might be able to assist on-site if needed (though I  
don't know much about linuxes).

On Jul 25, 2007, at 11:31 AM, Gavin M. Roy wrote:

> One thing to take into account is I don't have physical access to the
> box (It is in TX, I am in PA).  All installs but Gentoo will be
> performed by a well trained NOC monkey. *cough*
>
> On 7/25/07, Dave Page <dpage@postgresql.org> wrote:
>>
>>
>> > ------- Original Message -------
>> > From: Tom Lane <tgl@sss.pgh.pa.us>
>> > To: Greg Smith <gsmith@gregsmith.com>
>> > Sent: 25/07/07, 18:54:50
>> > Subject: Re: [HACKERS] Machine available for community use
>> >
>> > Another fairly big issue is that we need to know whether measurements we
>> > take in August are comparable to measurements we take in October, so a
>> > fairly stable platform is important.  As you say, a fast-changing kernel
>> > would make it difficult to have any confidence about comparability over
>> > time.  That would tend to make me vote for RHEL/Centos, where long-term
>> > stability is an explicit development goal.  Debian stable might do too,
>> > though I'm not as clear about their update criteria as I am about Red Hat's.
>>
>> Perhaps RH could donate us a RHEL/RHN licence for this?
>>
>> /D
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 7: You can help support the PostgreSQL project by donating at
>>
>>                 http://www.postgresql.org/about/donate
>>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Re: Machine available for community use

From
Devrim GÜNDÜZ
Date:
Hi,

On Wed, 2007-07-25 at 20:22 -0700, Joshua D. Drake wrote:
> > I'm checking into this, but it may take a few days to get an answer
> > (particularly since I'm planning to take Friday through Monday off).
>
> Well if we go RHEL why not CentOS5 and just call it good?

...because RHEL and CentOS are not really that identical. They are just
binary-compatible.

RHEL has better performance than CentOS -- I guess it is the compiler
options that Red Hat is using while compiling their RPMs.

I have performed a test using OSDL test suite a few months ago on a
system that has:

* 8 x86_64 CPUs @ 3200.263
* 16 Gigabytes of RAM
* PostgreSQL 8.1.5 (PGDG packages)

and RHEL performed much better than CentOS.

Regards,
--
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/



Re: Machine available for community use

From
"Joshua D. Drake"
Date:
Devrim GÜNDÜZ wrote:
> Hi,

> RHEL has better performance than CentOS -- I guess it is the compiler
> options that Red Hat is using while compiling their RPMs.
>
> I have performed a test using OSDL test suite a few months ago on a
> system that has:
>
> * 8 x86_64 CPUs @ 3200.263
> * 16 Gigabytes of RAM
> * PostgreSQL 8.1.5 (PGDG packages)
>
> and RHEL performed much better than CentOS.

Not to be unkind, but I doubt that on an identical configuration.

Joshua D. Drake

>
> Regards,






Re: Machine available for community use

From
Devrim GÜNDÜZ
Date:
Hi,

On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote:
> > and RHEL performed much better than CentOS.
>
> Not to be unkind, but I doubt that on an identical configuration.

Since I don't have the permission to distribute the benchmark results, I
will be happy to spend time for re-running these tests if someone
provides me an identical machine.

Each test took 1-2 days -- I will insist that CentOS performs poorer
than RHEL.

BTW, I will ask for permission to distribute the graphs that I produced
using gnuplot -- Maybe those graphs will give us some light.

Regards,
--
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/



Re: Machine available for community use

From
Tom Lane
Date:
Devrim GÜNDÜZ <devrim@CommandPrompt.com> writes:
> On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote:
>>> and RHEL performed much better than CentOS.

>> Not to be unkind, but I doubt that on an identical configuration.

> Each test took 1-2 days -- I will insist that CentOS performs poorer
> than RHEL.

I'm finding that hard to believe too.  There isn't any "secret sauce"
in the RHEL build process --- the CentOS guys should have been able to
duplicate the RHEL RPMs exactly.  Now it's possible that CentOS had
lagged in updating some performance-relevant package; did you compare
package versions across both OSes?
        regards, tom lane


Re: Machine available for community use

From
Devrim GÜNDÜZ
Date:
Hi,

On Mon, 2007-07-30 at 23:36 -0400, Tom Lane wrote:
> > Each test took 1-2 days -- I will insist that CentOS performs poorer
> > than RHEL.
>
> I'm finding that hard to believe too.

I have felt the same, that's why I repeated the test twice.

> There isn't any "secret sauce" in the RHEL build process

Really? Are the compiler options, etc, public?

> --- the CentOS guys should have been able to duplicate the RHEL RPMs
> exactly.  Now it's possible that CentOS had lagged in updating some
> performance-relevant package; did you compare package versions across
> both OSes?

Actually I did not compare -- But both of them were 4.3 (RHEL 4.3 and
CentOS 4.3). I'm assuming that they have the same package versions,
right?

BTW, they were stock 4.3 -- no updates, etc.

I hope I will be able to publish at least the graphs, so that the community
can take a look at what is going on.

Regards,

--
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/



Re: Machine available for community use

From
Tom Lane
Date:
Devrim GÜNDÜZ <devrim@CommandPrompt.com> writes:
> On Mon, 2007-07-30 at 23:36 -0400, Tom Lane wrote:
>> There isn't any "secret sauce" in the RHEL build process

> Really? Are the compiler options, etc, public?

Certainly.  If you doubt it, try comparing pg_config output for the RHEL
and CentOS packages.  (And if the CFLAGS entries are different, you
should be mentioning it to the CentOS package maintainer, not me.)
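
A rough sketch of that comparison in Python, assuming the text printed by `pg_config` on each system has been saved and that it uses the usual `NAME = value` lines. The function names and the sample flag values in the usage section are hypothetical.

```python
def parse_pg_config(output):
    """Parse `NAME = value` lines from pg_config output into a dict."""
    settings = {}
    for line in output.splitlines():
        # Split on the first "=" only, since values may contain "=" themselves.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(output_a, output_b, keys=("CC", "CFLAGS", "CONFIGURE", "LDFLAGS")):
    """Return {key: (value_a, value_b)} for the build settings that differ."""
    a = parse_pg_config(output_a)
    b = parse_pg_config(output_b)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Made-up sample values, not real RHEL or CentOS output.
    first = "CC = gcc\nCFLAGS = -O2 -g -Wall\n"
    second = "CC = gcc\nCFLAGS = -O2 -Wall\n"
    for key, (left, right) in diff_settings(first, second).items():
        print(f"{key}: {left!r} vs {right!r}")
```

An empty result means the compared settings match; it says nothing about the kernel or other packages, which would need their own comparison.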

> Actually I did not compare -- But both of them were 4.3 (RHEL 4.3 and
> CentOS 4.3). I'm assuming that they have the same package versions,
> right?

> BTW, they were stock 4.3 -- no updates, etc.

RHEL 4.3 was obsoleted more than a year ago, so I'd like to think that
nobody finds "no update" comparisons to be very relevant today ...
        regards, tom lane


Re: Machine available for community use

From
Devrim GÜNDÜZ
Date:
Hi,

On Tue, 2007-07-31 at 01:54 -0400, Tom Lane wrote:
> > Really? Are the compiler options, etc, public?
>
> Certainly.  If you doubt it, try comparing pg_config output for the
> RHEL and CentOS packages.

As I wrote before, I used PGDG packages for both -- What I'm suspecting
is the other packages like kernel, etc.

> > BTW, they were stock 4.3 -- no updates, etc.
>
> RHEL 4.3 was obsoleted more than a year ago, so I'd like to think that
> nobody finds "no update" comparisons to be very relevant today ...

I was referring to 4.3 isos of both distros, with no updates by that
time.

Regards,
--
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/



Re: Machine available for community use

From
"Dawid Kuroczko"
Date:
On 7/31/07, Devrim GÜNDÜZ <devrim@commandprompt.com> wrote:
> Hi,
>
> On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote:
> > > and RHEL performed much better than CentOS.
> >
> > Not to be unkind, but I doubt that on an identical configuration.
>
> Since I don't have the permission to distribute the benchmark results, I
> will be happy to spend time for re-running these tests if someone
> provides me an identical machine.
>
> Each test took 1-2 days -- I will insist that CentOS performs poorer
> than RHEL.

Would it be possible to include Unbreakable Linux in such a test?
Out of curiosity of course. :-)
  Regards,     Dawid


Re: Machine available for community use

From
Greg Smith
Date:
On Mon, 30 Jul 2007, Devrim GÜNDÜZ wrote:

> I have performed a test using OSDL test suite a few months ago on a
> system that has:
> * 8 x86_64 CPUs @ 3200.263...
> and RHEL [4.3] performed much better than CentOS [4.3]

RHEL 4 update 3 included some reworking of the x86_64 kernel, like adding
the kernel-largesmp for many CPU systems.  I would not be surprised to
find that the first CentOS release based on that may not have achieved a
perfect rebuild because of all that, and since you didn't do any updates
from the initial ISO images you were basically running the CentOS beta for
that feature set.

I think it's accurate to say "sometimes CentOS releases have bugs that
make them perform worse than the RHEL they're derived from", and would not
dispute your results accordingly.  I've seen fuzzy periods where CentOS
had a release out to match a new RHEL version, but it wasn't quite right
until after CentOS released an update or two.  There can be some lag
there, particularly in the period after a new major release.  Right now,
for example, I still don't completely trust the CentOS build based on the
recent RHEL 5, and have been following the developer mailing lists to get
a feel for when things have settled down.  It is one of the risks that
goes along with using CentOS, and removing it by using a genuine RHEL
certainly has value.

At the same time, I've done a fair amount of benchmarking work on machines
that switched from RHEL<->CentOS where performance was completely
identical.  I'd need to see a lot more than one test result suggesting
otherwise before I'd believe that CentOS is slower in general than the
RHEL it's derived from.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Machine available for community use

From
Josh Berkus
Date:
Gavin,

I'm actually in the middle of assembling a general performance test lab for 
the PostgreSQL hackers, using equipment donated by Sun, Hi5, and (hopefully) 
Unisys and Intel.  While your machine would obviously stay in Pennsylvania, 
it would be cool if we could somehow arrange a unified authentication & 
booking system.

I'm pretty sure I can even raise money to get one created. 

How long will this system remain available to us?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
It's actually in Texas, and we have no intention to put a time limit
on its availability. I think the availability will be there as long as
there is use and we're in the Texas data center, which I don't see
ending any time soon.

On 7/31/07, Josh Berkus <josh@agliodbs.com> wrote:
> Gavin,
>
> I'm actually in the middle of assembling a general performance test lab for
> the PostgreSQL hackers, using equipment donated by Sun, Hi5, and (hopefully)
> Unisys and Intel.  While your machine would obviously stay in Pennsylvania,
> it would be cool if we could somehow arrange a unified authentication &
> booking system.
>
> I'm pretty sure I can even raise money to get one created.
>
> How long will this system remain available to us?
>
> --
> Josh Berkus
> PostgreSQL @ Sun
> San Francisco
>


Re: Machine available for community use

From
Josh Berkus
Date:
Folks,

Hey, this is looking like a serious case of "Bike Shedding".  That is, a dozen 
people are arguing about what color to paint the bike shed instead of getting 
it built.[1]

Given that there are much more substantial issues: what performance software 
to install and how to install it, how to set up authentication and 
time-sharing for running tests, whether we can set up automated perf testing,  
getting money so some of our unfunded performance developers can work on it, 
etc., is the "which Linux distro" question worth spending our time on?

[1] http://www.bikeshed.com/

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Machine available for community use

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Hey, this is looking like a serious case of "Bike Shedding".  That is, a dozen 
> people are arguing about what color to paint the bike shed instead of getting 
> it built.[1]

FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
we want one, though I don't have final approval quite yet.
        regards, tom lane


Re: Machine available for community use

From
Greg Smith
Date:
On Tue, 31 Jul 2007, Josh Berkus wrote:

> That is, a dozen people are arguing about what color to paint the bike 
> shed instead of getting it built.

Until there's an OS installed on it and it's on a network, the machine 
essentially doesn't exist--so there was no way to work on the 
building--and there was clearly a gap between what Gavin was planning to 
do and what the aggregate hacker community wanted.  If there's a 
bike-shedding analogy here, the argument has been about what type of 
foundation to build the shed on.  The design of the shed itself may be 
much more complicated than that part, but if you put it someplace that's 
not level you may not ever get what you wanted no matter how much work you 
put into it later.  That's why I thought it was important to at least talk 
through the Linux distribution topic, so everyone was aware of the 
trade-offs involved.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: Machine available for community use

From
Josh Berkus
Date:
Tom,

> FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
> we want one, though I don't have final approval quite yet.

Great.  Any chance of a machine?  Can RH exert some leverage with Dell?

We could use up to 8 servers for performance testing, so I'm asking 
everyone.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Machine available for community use

From
Mark Kirkwood
Date:
Tom Lane wrote:
>
> FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
> we want one, though I don't have final approval quite yet.
>
>     
One possible point favoring the use of CentOS over RHEL: it's a little 
easier for community members to reproduce or test any findings, i.e. 
you don't have to get a RHEL sub!

Cheers

Mark


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
Let us know when/if and we'll pay Command Prompt to install the base OS on the system.  All that we're waiting on at this point is the final word on the OS.

Gavin

On 7/31/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Josh Berkus <josh@agliodbs.com> writes:
> Hey, this is looking like a serious case of "Bike Shedding".  That is, a dozen
> people are arguing about what color to paint the bike shed instead of getting
> it built.[1]

FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
we want one, though I don't have final approval quite yet.

                        regards, tom lane


Re: Machine available for community use

From
"Gavin M. Roy"
Date:
Just a follow-up to note that Red Hat has graciously donated a 1-year
RHEL subscription and myYearbook is paying Command Prompt to set up the
RHEL box for community use.

We've not worked out a scheduling methodology, or how to best organize
the use of said hardware, but I know that Tom and others are
interested.

Does anyone have a scheduling solution for things like this to make
sure people aren't stepping on each other's toes processor/RAM/disk-wise?

Also, what should the policies be for making sure that people can use
the box for what they need to use the box for?

Should people clean up after themselves, data-usage-wise, after their
scheduled time?

Should people only be able to run PostgreSQL in the context of their
own user?  Do we have experience with such setups in the past?  What
has worked well and what hasn't?

Gavin

On 7/25/07, Gavin M. Roy <gavinmroy@gmail.com> wrote:
> Recently I've been involved in or overheard discussions about SMP
> scalability at both the PA PgSQL get together and in some list
> traffic.
>
> myYearbook.com would like to make one of our previous production
> machines available to established PgSQL Hackers who don't have access
> to this level of hardware for testing, benchmarking and development to
> work at improving SMP scalability and related projects.
>
> The machine is a HP 585 G1, 8 Core AMD, 32GB RAM with one 400GB 14
> Spindle DAS Array dedicated to community use.  I've attached a text
> file with dmesg and /proc/cpuinfo output.
>
> I'm working on how this will be setup and am open to suggestions on
> how to structure access.
>
> I'm currently in the process of having Gentoo linux reinstalled on the
> box since that is what I am most comfortable administering from a
> security perspective.  If this will be a blocker for developers who
> would actually work on it, please let me know.
>
> If you're interested in access, my only requirement is that you're a
> current PgSQL Hacker with a proven track-record of committing patches
> to the community.  This is a resource we could be using for something
> else, and I'd like to see the community get direct benefit from it as
> opposed to it being a play sandbox for people who want to tinker.
>
> Please let me know thoughts, concerns or suggestions.
>
> Gavin M. Roy
> CTO
> myYearbook.com
> gmr@myyearbook.com
>
>


Re: Machine available for community use

From
Tom Lane
Date:
"Gavin M. Roy" <gavinmroy@gmail.com> writes:
> Just a follow-up to note that Red Hat has graciously donated a 1 year
> RHEL subscription and myYearbook is paying Command Prompt to setup the
> RHEL box for community use.

Sorry that Red Hat was so slow about that :-(

> [ various interesting questions snipped ]

> Should people only be able to run PostgreSQL in the context of their
> own user?  Do we have experience with such setups in the past?  What
> has worked well and what hasn't?

Yeah, I'd vote for people just building private PG installations in
their own home directories.  I am not aware of any performance-testing
reason why we'd want a shared installation, and given that people are
likely to be testing many different code variants, a shared installation
would be a management nightmare.  Also, with personal installations,
nobody need have root privileges, which just seems like a real good idea.

I don't have any special insights about the other management issues
you mentioned, but I'm sure someone does ...
        regards, tom lane
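
A minimal sketch of the per-user setup Tom describes, written as a Python helper that only assembles the commands rather than running them. The `~/pg/install` and `~/pg/data` layout, the example user, and the port number are assumptions for illustration; `./configure --prefix`, `initdb -D`, and `pg_ctl -D ... -o ... start` are the standard PostgreSQL build and startup commands.

```python
import posixpath
import shlex


def private_install_commands(source_dir, home, port):
    """Return (working_dir, argv) pairs for a private, non-root PostgreSQL build.

    The directory layout under `home` is hypothetical; adjust to taste.
    """
    prefix = posixpath.join(home, "pg", "install")    # assumed install prefix
    data_dir = posixpath.join(home, "pg", "data")     # assumed data directory
    log_file = posixpath.join(home, "pg", "server.log")
    bin_dir = posixpath.join(prefix, "bin")
    return [
        # Build and install entirely under the user's home directory.
        (source_dir, ["./configure", "--prefix=" + prefix]),
        (source_dir, ["make"]),
        (source_dir, ["make", "install"]),
        # Create a cluster owned by this user.
        (home, [posixpath.join(bin_dir, "initdb"), "-D", data_dir]),
        # Start the server on a per-user port so installations do not collide.
        (home, [posixpath.join(bin_dir, "pg_ctl"), "-D", data_dir,
                "-l", log_file, "-o", "-p %d" % port, "start"]),
    ]


if __name__ == "__main__":
    # Hypothetical user and port, shown as copy-and-paste shell lines.
    for cwd, argv in private_install_commands(
            "/home/alice/src/postgresql", "/home/alice", 5433):
        print("cd %s && %s" % (cwd, " ".join(shlex.quote(a) for a in argv)))
```

Giving each person their own prefix, data directory, and port is what lets several code variants coexist on the box without anyone needing root.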


Re: Machine available for community use

From
"Joshua D. Drake"
Date:
On Fri, 02 Nov 2007 15:37:17 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Gavin M. Roy" <gavinmroy@gmail.com> writes:
> > Just a follow-up to note that Red Hat has graciously donated a 1
> > year RHEL subscription and myYearbook is paying Command Prompt to
> > setup the RHEL box for community use.
> 
> Sorry that Red Hat was so slow about that :-(
> 
> > [ various interesting questions snipped ]
> 
> > Should people only be able to run PostgreSQL in the context of their
> > own user?  Do we have experience with such setups in the past?  What
> > has worked well and what hasn't?
> 
> Yeah, I'd vote for people just building private PG installations in
> their own home directories.  I am not aware of any performance-testing
> reason why we'd want a shared installation, and given that people are
> likely to be testing many different code variants, a shared

The only caveat here is that our thinking was that the actual arrays
would be able to be re-provisioned all the time. E.g., test with RAID 10
with x stripe size, or software RAID 6; what is the real difference
between 28 spindles with RAID 5 versus 10?
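
As a rough starting point for the 28-spindle question, the textbook rules of thumb can be written down as below. This is only a sketch: the data-disk counts and small-write penalties are the usual generic figures, not measurements from this array, and the actual difference is exactly what benchmarks on the box would have to show.

```python
# Rule-of-thumb figures only; controller, cache, and stripe size all matter.
# level -> (function giving data-bearing disks, back-end I/Os per small write)
RAID_RULES = {
    "raid10": (lambda n: n // 2, 2),  # mirrored pairs: half the disks hold unique data
    "raid5": (lambda n: n - 1, 4),    # one disk's worth of parity
    "raid6": (lambda n: n - 2, 6),    # two disks' worth of parity
}


def raid_layout(level, disks):
    """Return data-bearing disk count and small-write penalty for one array."""
    if level not in RAID_RULES:
        raise ValueError("unknown RAID level: %r" % level)
    if level == "raid10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    data_fn, penalty = RAID_RULES[level]
    return {"data_disks": data_fn(disks), "write_penalty": penalty}


if __name__ == "__main__":
    for level in ("raid10", "raid5", "raid6"):
        print(level, raid_layout(level, 28))
```

On paper, RAID 5 across 28 spindles keeps 27 disks' worth of data against 14 for RAID 10, but pays roughly twice the back-end I/O per small random write.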

> installation would be a management nightmare.  Also, with personal
> installations, nobody need have root privileges, which just seems
> like a real good idea.

No question.

Joshua D. Drake

> 
> I don't have any special insights about the other management issues
> you mentioned, but I'm sure someone does ...
> 
>             regards, tom lane
> 


- -- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/        UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Machine available for community use

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I'd vote for people just building private PG installations in
>> their own home directories.  I am not aware of any performance-testing
>> reason why we'd want a shared installation, and given that people are
>> likely to be testing many different code variants, a shared

> The only caveat here is that our thinking was that the actual arrays
> would be able to be re-provisioned all the time. E.g; test with RAID 10
> with x stripe size, Software RAID 6, what is the real difference
> between 28 spindles with RAID 5 versus 10?

Well, we need some workspace that won't go away when that happens.
I'd suggest that the OS and people's home directories be mounted on
a "permanent" partition with plenty of space for source code, say a
few tens of GB, and then there be a farm of data workspace that's
understood to be transient and can be reconfigured as needed for tests
like that.
        regards, tom lane
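
One way to picture the split Tom suggests is a small path helper that keeps source trees on the permanent side and always places test clusters on the transient side. The mount points `/home` and `/scratch`, and the per-user, per-test directory naming, are hypothetical choices made for this sketch.

```python
import posixpath

HOME_ROOT = "/home"        # assumed permanent partition: OS, home dirs, source code
SCRATCH_ROOT = "/scratch"  # assumed transient data farm; may be wiped and rebuilt


def data_dir_for(user, test_name):
    """Place a test cluster's data directory on the transient workspace."""
    return posixpath.join(SCRATCH_ROOT, user, test_name, "pgdata")


def is_transient(path):
    """True if `path` lives under the reconfigurable workspace."""
    norm = posixpath.normpath(path)
    return norm == SCRATCH_ROOT or norm.startswith(SCRATCH_ROOT + "/")


if __name__ == "__main__":
    # Hypothetical user and test name.
    print(data_dir_for("alice", "raid10-64k"))
    print(is_transient("/home/alice/src/postgresql"))
```

Anything for which `is_transient` is true should be treated as disposable whenever the arrays are re-provisioned; source code and results worth keeping belong under the home directories.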


Re: Machine available for community use

From
"Joshua D. Drake"
Date:
On Fri, 02 Nov 2007 17:11:30 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, I'd vote for people just building private PG installations in
> >> their own home directories.  I am not aware of any
> >> performance-testing reason why we'd want a shared installation,
> >> and given that people are likely to be testing many different code
> >> variants, a shared
> 
> > The only caveat here is that our thinking was that the actual arrays
> > would be able to be re-provisioned all the time. E.g; test with
> > RAID 10 with x stripe size, Software RAID 6, what is the real
> > difference between 28 spindles with RAID 5 versus 10?
> 
> Well, we need some workspace that won't go away when that happens.

Right, which is on the internal devices.

> I'd suggest that the OS and people's home directories be mounted on
> a "permanent" partition with plenty of space for source code, say a
> few tens of GB, and then there be a farm of data workspace that's
> understood to be transient and can be reconfigured as needed for tests
> like that.

Agreed.

Sincerely,

Joshua D. Drake

> 
>             regards, tom lane
> 

