Thread: postgresql clustering

postgresql clustering

From

"Rafik Salama"

Date:

21 September 2005, 17:02:49

Dear Sirs

I know that postgresql can be configured for high availability over a clustered environment
using pgcluster. For my master's I am currently studying clustering using MPI, OpenMP, PVM and other packages, and
I have to do a project, so I was thinking of using this opportunity to start implementing clustering over postgresql
using any of the above packages.

What do you think?

Thanks

Rafik Salama
Systems Architect

CIT Global
CIT Building, Free Zone
Nasr City,
P.O.Box 11816, Cairo, Egypt
Tel : +202 271 8794 (ext. 115)
Fax : +202 2748335
Cell: +2010 5410035
http://www.citglobal.com

Re: postgresql clustering

From

David Fetter

Date:

21 September 2005, 17:11:54

On Wed, Sep 21, 2005 at 08:01:08PM +0300, Rafik Salama wrote:
> Dear Sirs
> 
> I know that postgresql can be configured for high availability
> over a clustered environment using pgcluster,

Do you have a case study showing this?

> I am currently studying in my masters the clustering using MPI and
> OpenMP, PVM and others packages and I have to do a project, so I was
> thinking to use this opportunity to start implementing the
> clustering over postgresql using any of the above packages.
>  
> What do you think?

Let a thousand schools of thought contend.  Let a hundred flowers
bloom.

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: postgresql clustering

From

Aly Dharshi

Date:

21 September 2005, 17:44:30

I think it's a great idea to give it a shot. Maybe you can present a
proposal to the list of how you wish to go about it. There could be some
experts on the list who may give you some input and direction.

Aly.


-- 
Aly Dharshi
aly.dharshi@telus.net
         "A good speech is like a good dress
          that's short enough to be interesting
          and long enough to cover the subject"

Re: postgresql clustering

From

"Rafik Salama"

Date:

21 September 2005, 17:48:44

No, I do not have a case study; I just read that it can be done. What I am
suggesting is that, if there is no cluster implementation that gives the
database high availability, I will start this project using the message
passing technique. I already have at the university a cluster of
19 Intel Xeon machines, which you can see at this URL:
http://www.cs.aucegypt.edu/~cluster

Anyway, I was just asking so as not to reinvent the wheel in case
something like that already exists. Since there is not, I will give it a try.
At the end of the day it is open source, so I can do anything, and if it
happens to work, who knows!

Thanks

Rafik Salama
Systems Architect

CIT Global
CIT Building, Free Zone
Nasr City,
P.O.Box 11816, Cairo, Egypt
Tel : +202 271 8794 (ext. 115)
Fax : +202 2748335
Cell: +2010 5410035
http://www.citglobal.com


Re: postgresql clustering

From

"Jonah H. Harris"

Date:

21 September 2005, 18:22:39

In the past couple of years I've worked on several personal/business projects to cluster PostgreSQL and InnoDB (without MySQL). I've tested shared-nothing, shared-memory, and shared-disk models. IMHO, shared-disk is the only viable option for performance and/or large production business environments. Shared-memory and shared-nothing architectures are fine for high availability, but are expensive from a business-case standpoint if the goal is added performance. I'd be happy to share any of my clustering knowledge with ya offline. Have fun!


--
Respectfully,

Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
http://www.enterprisedb.com/

Re: postgresql clustering

From

"Daniel Duvall"

Date:

23 September 2005, 05:49:09

Jonah,

I stumbled on this discussion in one of my recurring searches for an
open-source database app capable of true clustering (failover, load
balancing, etc) that I can pair with my PHP application.  A search
that, sadly, most often ends in disappointment -- there's tons and tons
of database marketing BS out there.

Part of my frustration is due to my lack of a real understanding of the
models you mentioned in your comment.  I've been searching for
meaningful text and comparisons of the different clustering models, but
have yet to find anything that truly breaks it down well (and deep).

Could you perhaps point me -- and anyone else that happens upon this
post with the same frustrations -- in the right direction?

I've looked at PostgreSQL and EnterpriseDB, but I can't find anything
definitive  as far as clustering capabilities.  What kinds of projects
are there for clustering PgSQL, and are any of them mature enough for
commercial apps?

Best,
Dan



Re: postgresql clustering

From

Gaetano Mendola

Date:

28 September 2005, 23:37:36

Daniel Duvall wrote:

> I've looked at PostgreSQL and EnterpriseDB, but I can't find anything
> definitive  as far as clustering capabilities.  What kinds of projects
> are there for clustering PgSQL, and are any of them mature enough for
> commercial apps?

As you well know, "clustering" means everything and nothing at the same time.
We do have a commercial failover cluster provided by Red Hat,
with postgres running on it. Postgres is installed on both nodes and the
data is stored on a SAN; only one instance of postgres runs at a time, on one
of the two nodes. In the last 2 years we had a failure, and the service relocation
worked as expected.

Consider also that applications should behave well: for example, "try" to
close the current connection and retry opening a new one for a while....
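
The retry behaviour described above can be sketched roughly as follows. This is only an illustration, not code from the setup described: `connect_with_retry` and its parameters are hypothetical names, and `connect` stands for whatever zero-argument function opens a connection in your driver. The sketch catches `OSError`; with a real driver you would catch that driver's own connection error class instead.

```python
import time


def connect_with_retry(connect, attempts=5, delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect: zero-argument callable returning a connection object and
    raising OSError while the service is still relocating (assumption).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                sleep(delay * attempt)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The application would drop its broken connection first and then call this helper, so a failover that completes within the retry window is invisible to the user apart from a short pause.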

Regards
Gaetano Mendola

Re: postgresql clustering

From

"Joshua D. Drake"

Date:

29 September 2005, 00:38:05

Gaetano Mendola wrote:

>Daniel Duvall wrote:
>
>  
>
>>I've looked at PostgreSQL and EnterpriseDB, but I can't find anything
>>definitive  as far as clustering capabilities.  What kinds of projects
>>are there for clustering PgSQL, and are any of them mature enough for
>>commercial apps?
>>    
>>

Are you looking for clustering or replication? There are two very popular
replication solutions: Slony-I and Mammoth Replicator.

Slony-I is an external replication solution; Mammoth Replicator is a complete
PostgreSQL + Replication solution.

Sincerely,

Joshua D. Drake



-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

Re: postgresql clustering

From

"Daniel Duvall"

Date:

29 September 2005, 12:58:16

While "clustering" in some circles may be an open-ended buzzword --
mainly the commercial DB marketing crowd -- there are concepts beneath
the bull that are even inherent in the name.  However, I understand
your point.

From what I've researched, the concepts and practices seem to fall
under one of two abstract categorizations: fail-over (ok...
high-availability), and parallel execution (high-performance... sure).
While some consider the implementation of only one of these enough to qualify
as a cluster, others seem to demand that a "true" cluster must
implement both.

What I'm really after is a DB setup that does fail-over and parallel
execution.  Your setup sounds like it would gracefully handle the
former, but cannot achieve the latter.  Perhaps I'm simply asking too
much of a free software setup.

Thanks for your response.

Re: postgresql clustering

From

Tino Wildenhain

Date:

29 September 2005, 13:40:14

Daniel Duvall schrieb:
> While "clustering" in some circles may be an open-ended buzzword --
> mainly the commercial DB marketing crowd -- there are concepts beneath
> the bull that are even inherent in the name.  However, I understand
> your point.
> 
> From what I've researched, the concepts and practices seem to fall
> under one of two abstract categorizations: fail-over (ok...
> high-availability), and parallel execution (high-performance... sure).

Well, I don't know why many people believe parallel execution
automatically means high performance. Actually, most of the time
the performance is much worse this way.
If your dataset remains static and you do only read-only
requests, you get higher performance through load-balancing.
If, however, you make changes to the data, each change has to
be propagated to all nodes, which in fact costs performance.
This depends heavily on the link speed between the nodes.
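
A back-of-the-envelope model makes the read/write asymmetry concrete. It is a simplification with assumptions of its own: reads are spread evenly over the nodes, every write is applied on every node, and the network cost is ignored entirely, so real numbers will be lower. The function name and parameters are made up for illustration.

```python
def cluster_throughput(nodes, read_fraction, node_capacity):
    """Upper bound on total operations/second for a fully replicated cluster.

    Each node can do node_capacity operations/second. A read costs one
    node one operation; a write costs every node one operation.
    """
    if nodes < 1 or not 0.0 <= read_fraction <= 1.0:
        raise ValueError("need nodes >= 1 and 0 <= read_fraction <= 1")
    write_fraction = 1.0 - read_fraction
    # Work a single node performs per unit of total cluster throughput.
    per_node_load = read_fraction / nodes + write_fraction
    return node_capacity / per_node_load
```

With four nodes of 1000 operations/second each, a read-only load scales to 4000, a 50/50 mix reaches only 1600, and a write-only load stays at 1000 no matter how many nodes are added. Once link latency is included the write-heavy cases fall below a single node.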

> While some consider the implementation of only one of these to qualify
> a cluster, others seem to demand that a "true" cluster must
> implement both.
> 
> What I'm really after is a DB setup that does fail-over and parallel
> execution.  Your setup sounds like it would gracefully handle the
> former, but cannot achieve the latter.  Perhaps I'm simply asking too
> much of a free software setup.

Commercial vendors aren't much better here - they just don't tell you :-)
There is pgpool or SQLRelay, for example, if you want to parallelize
requests; you can combine them with the various replication mechanisms
also available for PG and get what you want - and, most important,
get what's possible. Nobody can trick the math :-)


Greets
Tino

Re: postgresql clustering

From

"Jonah H. Harris"

Date:

29 September 2005, 14:43:08

On 9/29/05, Tino Wildenhain <tino@wildenhain.de> wrote:

> Well, I don't know why many people believe parallel execution
> automatically means high performance. Actually, most of the time
> the performance is much worse this way.
> If your dataset remains static and you do only read-only
> requests, you get higher performance through load-balancing.
> If, however, you make changes to the data, each change has to
> be propagated to all nodes, which in fact costs performance.
> This depends heavily on the link speed between the nodes.

I think you should clarify that the type of clustering you're discussing is the "shared-nothing" model, which is the most prevalent in open-source databases. Shared-disk and shared-memory clustered systems do not have the "propagation" issue but do have others (distributed lock manager, etc). Don't make blind statements. If you want more information about "real-world" clustering, read the research for DB2 (Mainframe) and Oracle RAC.

--
Respectfully,

Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
http://www.enterprisedb.com/

Re: postgresql clustering

From

Gaetano Mendola

Date:

29 September 2005, 15:24:36

Daniel Duvall wrote:
> While "clustering" in some circles may be an open-ended buzzword --
> mainly the commercial DB marketing crowd -- there are concepts beneath
> the bull that are even inherent in the name.  However, I understand
> your point.
> 
> From what I've researched, the concepts and practices seem to fall
> under one of two abstract categorizations: fail-over (ok...
> high-availability), and parallel execution (high-performance... sure).
> While some consider the implementation of only one of these to qualify
> a cluster, others seem to demand that a "true" cluster must
> implement both.
> 
> What I'm really after is a DB setup that does fail-over and parallel
> execution.  Your setup sounds like it would gracefully handle the
> former, but cannot achieve the latter.  Perhaps I'm simply asking too
> much of a free software setup.
> 
> Thanks for your response.
> 

Also consider PITR and some work I did last year:
http://archives.postgresql.org/pgsql-admin/2005-06/msg00013.php

With PITR you can have one or more remote machines that
continuously replay the log from the main server, and if the main server
crashes the "mirrors" can come out of their replay and go "on line".
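
One way to picture the "continuous replay" is a restore step that waits for the next archived log segment instead of failing when it is not there yet. The sketch below is not the script from the linked post; `wait_for_segment` and its arguments are hypothetical, and it assumes the main server's `archive_command` copies finished segments into a directory the standby can read.

```python
import os
import shutil
import time


def wait_for_segment(archive_dir, segment, destination, poll=1.0, timeout=None):
    """Block until `segment` appears in archive_dir, then copy it.

    Returns True once the file has been copied, or False if `timeout`
    seconds pass first (the point at which a standby would stop
    replaying and go on line).
    """
    source = os.path.join(archive_dir, segment)
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(source):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    # Assumes the archiver makes files visible only when complete,
    # for example by writing to a temporary name and renaming.
    shutil.copy(source, destination)
    return True
```

A wrapper like this would be invoked from the standby's `restore_command` with the segment name and target path that PostgreSQL passes in. How the standby is told to stop waiting (a timeout here) is a design choice; a trigger file is another common option.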

At that time it was not possible to connect to a "replaying" engine
to perform (at least) queries; I don't know if this has changed in 8.1.

BTW, did someone go further with that idea? If not, I'd like to rewrite
that stuff in C (I do prefer C++).

Regards
Gaetano Mendola

Re: postgresql clustering

From

Tino Wildenhain

Date:

29 September 2005, 15:33:41

Jonah H. Harris schrieb:
> On 9/29/05, *Tino Wildenhain* <tino@wildenhain.de 
> <mailto:tino@wildenhain.de>> wrote:
> 
>     Well, I don't know why many people believe parallel execution
>     automatically means high performance. Actually, most of the time
>     the performance is much worse this way.
>     If your dataset remains static and you do only read-only
>     requests, you get higher performance through load-balancing.
>     If, however, you make changes to the data, each change has to
>     be propagated to all nodes, which in fact costs performance.
>     This depends heavily on the link speed between the nodes.
> 
> 
> I think you should clarify that the type of clustering you're discussing 
> is the, "shared-nothing" model which is most prevalent in open-source 
> databases.  Shared-disk and shared-memory clustered systems do not have 
> the "propagation" issue but do have others (distributed lock manager, 
> etc).  Don't make blind statements.  If you want more information about 
> "real-world" clustering, read the research for DB2 (Mainframe) and 
> Oracle RAC.

No, that's not a blind statement ;) It does not matter how the
information is technically shared - shared memory must be
copied or accessed over network links if you have more than
one independent system. Locks are information too - thus the
same constraints apply.

So no matter how you label the problem, the basic constraints
(read: communication and synchronisation overhead) will remain.

Custom solutions can circumvent some of the problems if you
can shift the problem area (e.g. have some read-only areas,
some seldom-written areas, and some data that is written often, seldom
read and not immediately propagated).

Re: postgresql clustering

From

"Luke Lonergan"

Date:

29 September 2005, 15:34:13

Daniel,

> From what I've researched, the concepts and practices seem to fall
> under one of two abstract categorizations: fail-over (ok...
> high-availability), and parallel execution (high-performance... sure).
> While some consider the implementation of only one of these to qualify
> a cluster, others seem to demand that a "true" cluster must
> implement both.

If you want to get a high degree of parallelism, 10s or 100s of machines are required.  At that size, you must have
fault tolerance to make the system usable.

> What I'm really after is a DB setup that does fail-over and parallel
> execution.  Your setup sounds like it would gracefully handle the
> former, but cannot achieve the latter.  Perhaps I'm simply asking too
> much of a free software setup.

We've spent the last 3 years developing a parallel database that does both, and I can tell you that it takes a huge
development effort to get it right for the general audience.  Bizgres MPP is capable of handling ANSI SQL, is ACID
compliant and scales to tens of terabytes, but it's not free (sorry about that).  It is tons cheaper than Oracle or
Teradata though, and it's based on Postgres.

- Luke

Re: postgresql clustering

From

"Daniel Duvall"

Date:

30 September 2005, 13:10:55

Thanks for your reply Luke.

Bizgres looks like a very promising project.  I'll be sure to follow
it.

Thanks to everyone for their comments.  I'm starting to understand the
truth behind the hype and where these performance gains and hits stem
from.

-Dan

Re: postgresql clustering

From

"Daniel Duvall"

Date:

30 September 2005, 13:11:30

What about clustered filesystems?  At first blush I would think the
overhead of something like GFS might kill performance.  Could one
potentially achieve a fail-over config using multiple nodes with GFS,
each having their own instance of PostgreSQL (but only one running at
any given moment)?

Best,
Dan

Fwd: Re: postgresql clustering

From

Trent Shipley

Date:

30 September 2005, 13:13:05

What is the relationship between database support for clustering and grid 
computing and support for distributed databases?

Two-phase COMMIT is coming in 8.1.  What effect will this have in promoting
FOSS grid support or distribution solutions for PostgreSQL?
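
For readers who have not met the protocol: two-phase commit lets an external coordinator make several independent databases commit together or not at all. In PostgreSQL 8.1 the server side of this is exposed as PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED; the coordinator itself is not part of the server. The toy model below uses in-memory stand-ins rather than real connections, and all class and function names are invented for illustration.

```python
class Participant:
    """In-memory stand-in for one database node (hypothetical)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit later, or refuse.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit on every participant or on none. Returns True on commit."""
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
        else:
            # One refusal rolls back everyone who already prepared.
            for q in prepared:
                q.rollback()
            return False
    # Phase 2: all promised, so all commit.
    for p in participants:
        p.commit()
    return True
```

The sketch leaves out what makes the real thing hard: the prepared state has to survive crashes on both sides, and the coordinator needs a durable record of its decision.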

Re: postgresql clustering

From

"Luke Lonergan"

Date:

30 September 2005, 21:04:26

Dan,

On 9/29/05 3:23 PM, "Daniel Duvall" <the.liberal.media@gmail.com> wrote:

> What about clustered filesystems?  At first blush I would think the
> overhead of something like GFS might kill performance.  Could one
> potentially achieve a fail-over config using multiple nodes with GFS,
> each having there own instance of PostgreSQL (but only one running at
> any given moment)?

Interestingly - my friend Matt O'Keefe built GFS at UMN, I was one of his
first customers/sponsors of the research in 1998 when I implemented an
8-node shared disk cluster on Alpha Linux using GFS and Fibre Channel.

Again - it depends on what you're doing - if it's OLTP, you will spend too
much time in lock management for disk access, and things like Oracle RAC's
CacheFusion become critical to reduce the number of times you have to hit
disks.  For warehousing/sequential scans, this kind of clustering is
irrelevant.

- Luke

Re: postgresql clustering

From

Hans-Jürgen Schönig

Date:

30 September 2005, 21:40:59

Luke Lonergan wrote:
> Dan,
> 
> On 9/29/05 3:23 PM, "Daniel Duvall" <the.liberal.media@gmail.com> wrote:
> 
> 
>>What about clustered filesystems?  At first blush I would think the
>>overhead of something like GFS might kill performance.  Could one
>>potentially achieve a fail-over config using multiple nodes with GFS,
>>each having there own instance of PostgreSQL (but only one running at
>>any given moment)?
> 
> 
> Interestingly - my friend Matt O'Keefe built GFS at UMN, I was one of his
> first customers/sponsors of the research in 1998 when I implemented an
> 8-node shared disk cluster on Alpha Linux using GFS and Fibre Channel.
> 
> Again - it depends on what you're doing - if it's OLTP, you will spend too
> much time in lock management for disk access and things like Oracle RAC's
> CacheFusion becomes critical to reduce the number of times you have to hit
> disks.  

Hitting the disk is really bad. However, we have seen that going to
the network for small portions of data (e.g. locks) is even more
critical. You will see that the CPU on all nodes is running at 1% or so
while the nodes wait for data to be exchanged over the network (latency) -
this is the real problem.
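
The "1% CPU" observation follows from simple arithmetic if every small operation has to wait for one synchronous network round trip. The figures used below are illustrative assumptions, not measurements from any system mentioned in this thread, and the function name is made up.

```python
def cpu_utilisation(cpu_seconds_per_op, round_trip_seconds):
    """Fraction of time a node spends computing when each operation
    waits for one synchronous network round trip (no overlap, no batching)."""
    total = cpu_seconds_per_op + round_trip_seconds
    if total <= 0:
        raise ValueError("times must be positive")
    return cpu_seconds_per_op / total


# Assumed figures: 10 microseconds of CPU work per lock request and a
# 990 microsecond round trip leave the CPU busy about 1% of the time.
busy = cpu_utilisation(10e-6, 990e-6)
```

Faster CPUs do nothing for this ratio; only a lower-latency interconnect, or fewer round trips per transaction, moves it.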

I don't know what Oracle is doing in detail, but they have a real problem
when losing a node inside the cluster (syncing again is really time
consuming).

> For warehousing/sequential scans, this kind of clustering is
> irrelevant.

I suggest looking at Teradata - they do really nice query partitioning on
so-called AMPs (we'd simply call them nodes). It is really nice for really
ugly warehousing queries (ugly in terms of the amount of data).
Hans

-- 
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at