Thread: Linux cluster application

Linux cluster application

From: Andrew Watkins
I'm writing a parallel/distributed application that assesses the
performance impact of frequent inserts and selects against databases on
parallel file systems in a Linux cluster environment.  Currently, the
application calls for a database server to be running on each node of
the cluster. Since my end users may or may not have root access to the
machines in the various clusters being used, I'm installing Postgres
within the user's account so that the postmaster can be started and
stopped without root access.  My general question is whether anyone has
experience with a situation like this, where a database server needs to
be started on each machine in a parallel environment.  If so, are there
any suggestions for working under these conditions?

Thanks.


Re: Linux cluster application

From: Scott Marlowe
On Thu, 2006-03-02 at 14:41, Andrew Watkins wrote:
> I'm writing a parallel/distributed application that assesses the
> performance impact of frequent insertions/selects to databases on
> parallel file systems in a linux cluster environment.  Currently, the
> application calls for a database server to be running on each node of
> the cluster. Since my end-users may or may not have root access to the
> machines in the various clusters being used, I'm installing postgres
> from within the user's account to facilitate the starting and stopping
> of the postmaster without the need of root access.  My general question
> is if anyone has experience with such a situation where the database
> server would need to be started on each machine in a parallel
> environment.  If so, are there any suggestions for working with such a
> condition?

You could set up ssh keys with no passphrase and use ssh to do it in a
short shell script.
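
A rough sketch of that idea, written in Python rather than shell so the
command construction can be checked without a cluster. It assumes
passphrase-less ssh keys are already in place; the host names, paths, and
the helper names build_start_command and start_all are hypothetical, not
from this thread. pg_ctl's -D (data directory) and -l (log file) options
and ssh's BatchMode option are standard.

```python
import shlex
import subprocess


def build_start_command(host, pg_ctl, data_dir, log_file):
    """Build the ssh argv that starts a postmaster on one node."""
    # Quote each part so paths containing spaces survive the remote shell.
    remote = " ".join(
        shlex.quote(part)
        for part in (pg_ctl, "-D", data_dir, "-l", log_file, "start")
    )
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which suits key-based logins driven from a script.
    return ["ssh", "-o", "BatchMode=yes", host, remote]


def start_all(hosts, pg_ctl, data_dir_for, log_file_for):
    """Try to start a server on every host; return the hosts that failed.

    data_dir_for and log_file_for are caller-supplied functions mapping a
    host name to that host's data directory and log file (assumptions).
    """
    failed = []
    for host in hosts:
        cmd = build_start_command(
            host, pg_ctl, data_dir_for(host), log_file_for(host)
        )
        if subprocess.run(cmd).returncode != 0:
            failed.append(host)
    return failed
```

Stopping the servers would follow the same pattern with "stop" in place
of "start" and without the -l option.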

Re: Linux cluster application

From: Andrew Watkins
On Mar 2, 2006, at 2:44 PM, Scott Marlowe wrote:

> On Thu, 2006-03-02 at 14:41, Andrew Watkins wrote:
>> I'm writing a parallel/distributed application that assesses the
>> performance impact of frequent insertions/selects to databases on
>> parallel file systems in a linux cluster environment.  Currently, the
>> application calls for a database server to be running on each node of
>> the cluster. Since my end-users may or may not have root access to the
>> machines in the various clusters being used, I'm installing postgres
>> from within the user's account to facilitate the starting and stopping
>> of the postmaster without the need of root access.  My general question
>> is if anyone has experience with such a situation where the database
>> server would need to be started on each machine in a parallel
>> environment.  If so, are there any suggestions for working with such a
>> condition?
>
> You could set up ssh keys with no passphrase and use ssh to do it in a
> short shell script.
>

Thanks. I suppose my question is less about the mechanism for actually
starting the servers and more about where to install the servers, where
they should run, etc. For example, suppose I'm using a file system
shared by every node in the cluster, Postgres has been installed in,
say, /home/user/pgres, and initdb has initialized the database in, say,
/home/user/pgres/data. It seems there would then be file name conflicts
when trying to launch a local server on each node. On the other hand,
if I use disk space local to each node, running the servers there would
not let me assess the impact on a parallel file system.


----------------------
Andrew Watkins, PhD
Department of Computer Science and Engineering
Mississippi State University
Box 9637, Mississippi State, MS, 39762
Office: (662) 325-7515
Fax: (662) 325-8997
http://www.cse.msstate.edu/~andrew


Re: Linux cluster application

From: Douglas McNaught
Andrew Watkins <andrew@cse.msstate.edu> writes:

> Thanks. I suppose my question is less about the mechanisms for
> actually starting the servers and more about where to install the
> servers, where they should be running, etc. For example, if I'm using
> a shared file system across each node in the cluster and postgres has
> been installed in, say, /home/user/pgres, and initdb has initialized
> the database on, say, /home/user/pgres/data, then it would seem like
> there would end up being conflicts in file names when trying to launch
> a local server on each node. On the other hand, if there is disk space
> local to each node, then running the servers there would not allow for
> the assessing of the impact on a parallel file system.

You will definitely have to run initdb, and start Postgres, with a
unique data directory for each machine (maybe named after the host?)
-- having more than one server process trying to use a single
directory will break everything.
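
As a minimal sketch of the per-host layout suggested above, each node
could derive its own data directory from its host name, run initdb there
once, and then start its server. The "data-<host>" naming, the
server.log file name, and the helper names are assumptions for
illustration only; the install path reuses the example from earlier in
the thread. initdb -D and pg_ctl -D/-l are the standard options.

```python
import os
import socket
import subprocess


def data_dir_for(base, host=None):
    """Return a data directory unique to one host (naming is hypothetical)."""
    host = host or socket.gethostname()
    return os.path.join(base, "data-" + host)


def initdb_command(initdb, data_dir):
    """argv that initializes a new cluster in data_dir."""
    return [initdb, "-D", data_dir]


def start_command(pg_ctl, data_dir):
    """argv that starts a server on data_dir, logging inside that directory."""
    log_file = os.path.join(data_dir, "server.log")
    return [pg_ctl, "-D", data_dir, "-l", log_file, "start"]


def ensure_started(install_dir, base):
    """Initialize this host's directory if needed, then start its server."""
    data_dir = data_dir_for(base)
    # initdb writes a PG_VERSION file, so its presence marks a directory
    # that has already been initialized.
    if not os.path.exists(os.path.join(data_dir, "PG_VERSION")):
        initdb = os.path.join(install_dir, "bin", "initdb")
        subprocess.run(initdb_command(initdb, data_dir), check=True)
    pg_ctl = os.path.join(install_dir, "bin", "pg_ctl")
    subprocess.run(start_command(pg_ctl, data_dir), check=True)
```

Run on a node whose host name is node01, ensure_started("/home/user/pgres",
"/home/user/pgres") would use /home/user/pgres/data-node01, so the
directories stay on the shared file system without colliding.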

-Doug