Thread: Bigtime scaling of Postgresql (cluster and stuff I suppose)

Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: "Phoenix Kiula"
We're moving from MySQL to PG, a move I am rather enjoying, but we're
currently running both databases. As we web-enable our financial
services in fifteen countries, I would like to recommend to the team that
we move entirely to PG.

In doing research on big installations of the two databases, I read
this from a MySQL senior exec on Slashdot:


QUOTE---
"...We did it ourselves with MySQL Cluster. That's technology
originally developed by Ericsson - and today it is an absolute
category leader in telco and networking. Nokia, Alcatel and Nortel are
all building real-time network nodes on top of MySQL Cluster. No one
else has a main-memory based shared-nothing cluster with that high
throughput and availability. Or take the new Monitoring and Advisory
Service that we are launching as part of MySQL Enterprise - this is a
novel innovation built on the feedback from our most advanced users
and customers."
--UNQUOTE


The sentence that caught my attention is "Nokia, Alcatel and Nortel
are all building real-time network nodes on top of MySQL Cluster."

My experiences with MySQL so far have been less than exhilarating
(only tried it for our web stuff, which is not much so far but
increasing rapidly), but I have seen a lot of talk about MySQL
clusters.

Is there something similar in the PG world? PG Clusters?

A Google search comes up (apart from the "cluster" command) with stuff
like this in the mailing list archives -- http://snipr.com/1pxkf --
which I am not sure is a very heartening piece of news.

I'd appreciate any thoughts or experiences, or starting points for
what I should read about PG in high-availability situations, including
replication and whatnot. Am I talking about something like
"EnterpriseDB" (is that PostgreSQL with tech support, or a different
DB)?

TIA!

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Bill Moran
"Phoenix Kiula" <phoenix.kiula@gmail.com> wrote:
>
> We're moving from MySQL to PG, a move I am rather enjoying, but we're
> currently running both databases. As we web-enable our financial
> services in fifteen countries, I would like to recommend to the team that
> we move entirely to PG.
>
> In doing research on big installations of the two databases, I read
> this from a MySQL senior exec on Slashdot:

Senior MySQL exec means this is a marketing blurb, which means it's
exaggerated, lacking any honest assessment of challenges and difficulties,
and possibly an outright lie.  I've no doubt that MySQL can do clusters
if you know what you're doing, but if you want the facts, you're going
to have to look deeper than that obviously biased quote.  I seem to
remember a forum thread with someone having considerable difficulty
with MySQL cluster, and actual MySQL employees jumping in to try to
help and no solution ever found.  Anyone have that link lying around?

In any event, replication is a large and complex topic.  To do it well
takes research, planning, and know-how.  Anyone who tells you their
solution will just drop in and work is either lying or charging you
a bunch of money for their consultants to investigate your scenario
and set it up for you.

First off, "clustering" is a word that is too vague to be useful, so
I'll stop using it.  There's multi-master replication, where every
database is read-write, then there's master-slave replication, where
only one server is read-write and the rest are read-only.  You can
add failover capabilities to master-slave replication.  Then there's
synchronous replication, where all servers are guaranteed to get
updates at the same time.  And asynchronous replication, where other
servers may take a while to get updates.  These descriptions aren't
really specific to PostgreSQL -- every database replication system
has to make design decisions about which approaches to support.

PostgreSQL has some built-in features to allow synchronous multi-master
database replication.  Two-phase commit allows you to reliably commit
transactions to multiple servers concurrently, but it requires support
at the application level, which will require you to rewrite any existing
applications.
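
Roughly, the SQL looks like this (the table and the transaction id are
made up, and max_prepared_transactions has to be nonzero on every server):

  -- Run the same work on each participating server, but prepare instead
  -- of committing:
  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 7;
  PREPARE TRANSACTION 'txn_42';

  -- Only after *every* server has prepared successfully does the
  -- application finish the commit on each of them:
  COMMIT PREPARED 'txn_42';

  -- If any server failed to prepare, issue this instead on the ones
  -- that did prepare:
  -- ROLLBACK PREPARED 'txn_42';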

PGCluster is a multi-master synchronous replication system, but I believe it's
still in beta.

Note that no synchronous replication system works well over
geographically large distances.  The time required for the masters
to synchronize over (for example) the Internet kills performance to
the point of uselessness.  Again, this is not a PostgreSQL problem,
MSSQL suffers the same problem.  Synchronous replication is only
really used when two servers are right next to each other with a
high-speed link (probably gigabit) between them.
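
A back-of-envelope illustration (the round-trip figures below are
assumptions, not measurements):

  -- Each synchronous commit costs at least one network round trip.
  SELECT 1000 / 0.2  AS lan_commits_per_sec,   -- ~0.2 ms on a local gigabit link
         1000 / 80.0 AS wan_commits_per_sec;   -- ~80 ms over a long Internet path
  -- => thousands per second locally vs. roughly a dozen per session
  --    across a continent, before the servers do any real work.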

PostgreSQL-R is in development, and targeted to allow multi-master,
asynchronous replication without rewriting your application.  As
far as I know, it works, but it's still beta.

pgpool supports multi-master synchronous replication as well as
failover.

Slony supports master-slave asynchronous replication and works _very_
well over long distances (such as from an east coast to a west coast
datacenter).

Once you've looked at your requirements, start looking at the tool
that matches those requirements, and I think you'll find what you
need.

BTW: does anyone know of a link that describes these high-level concepts?
If not, I think I'll write this up formally and post it.

--
Bill Moran
http://www.potentialtech.com

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Tom Lane
Bill Moran <wmoran@potentialtech.com> writes:
> First off, "clustering" is a word that is too vague to be useful, so
> I'll stop using it.

Right.  MySQL Cluster, on the other hand, is a very specific technology.
http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster.html

It's interesting but far from an all-purpose solution.  I am no expert
but I believe these are true statements:

* The data is all kept in memory, with *no* on-disk backup.  This is why
it's so fast.  As for reliability, the rationale is that if you have
enough replicated nodes then you'll never lose all of them at once.

* It's a synchronous multi-master system, meaning that performance is
only good if there is very low communication delay between the nodes.
Not only do they have to be all on the same LAN segment, MySQL says
it had better be at least 100Mbit Ethernet.

In case you hadn't noticed the disconnect between these statements:
if they have to be that close together, there *will* be a single point
of failure.  Fire in your data center, for instance, will take out every
copy of your data.  So as a "high availability" solution I don't find
it all that compelling.

It is, however, capable of being d*mn fast for read-mostly workloads
that can fit their whole dataset into RAM --- and with the price of
RAM what it is, that's not such a big limitation, though remember you
need N servers with that much RAM, not just one.  (For write-mostly
workloads I imagine it doesn't scale out so well, because of the
costs of updating multiple servers synchronously.)

            regards, tom lane

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Shane Ambler
Bill Moran wrote:

> BTW: does anyone know of a link that describes these high-level concepts?
> If not, I think I'll write this up formally and post it.
>

Chapter 24 -
http://www.postgresql.org/docs/8.2/interactive/high-availability.html
is a recent addition to the manual that starts to explain most of these
terms.



--

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: David Fetter
On Sat, Aug 25, 2007 at 10:18:25AM -0400, Bill Moran wrote:
> "Phoenix Kiula" <phoenix.kiula@gmail.com> wrote:
> >
> > We're moving from MySQL to PG, a move I am rather enjoying, but
> > we're currently running both databases. As we web-enable our
> > financial services in fifteen countries, I would like to recommend to
> > the team that we move entirely to PG.
> >
> > In doing research on big installations of the two databases, I
> > read this from a MySQL senior exec on Slashdot:
>
> Senior MySQL exec means this is a marketing blurb, which means it's
> exaggerated, lacking any honest assessment of challenges and
> difficulties, and possibly an outright lie.  I've no doubt that
> MySQL can do clusters if you know what you're doing, but if you want
> the facts, you're going to have to look deeper than that obviously
> biased quote.  I seem to remember a forum thread with someone having
> considerable difficulty with MySQL cluster, and actual MySQL
> employees jumping in to try to help and no solution ever found.
> Anyone have that link lying around?

I think this is the one.

http://forums.mysql.com/read.php?25,93181,93181

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Vivek Khera
On Aug 25, 2007, at 8:12 AM, Phoenix Kiula wrote:

> The sentence that caught my attention is "Nokia, Alcatel and Nortel
> are all building real-time network nodes on top of MySQL Cluster."
>
> My experiences with MySQL so far have been less than exhilarating
> (only tried it for our web stuff, which is not much so far but
> increasing rapidly), but I have seen a lot of talk about MySQL
> clusters.
>
> Is there something similar in the PG world? PG Clusters?

MySQL Cluster is a specific product.  It is not just mysql databases
lumped together to make a big mysql DB.  It is a memory-based
database that requires at least 2 machines to run, and is not
usefully redundant and distributed until you have many machines.  The
telcos use it for their logging and switching infrastructure, where
you need 100% uptime and wicked fast response (thus memory-based).
And you'll note it was developed by Ericsson...

There is no equivalent in Postgres.

There are several replication choices for Postgres. Google will find
them for you, or just look on the postgres.org pages for the list.
We use slony1 and it works very well for our needs.


Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Andrew Sullivan
On Sat, Aug 25, 2007 at 11:13:45AM -0400, Tom Lane wrote:
> In case you hadn't noticed the disconnect between these statements:
> if they have to be that close together, there *will* be a single point
> of failure.  Fire in your data center, for instance, will take out every
> copy of your data.  So as a "high availability" solution I don't find
> it all that compelling.

Indeed.  There's another issue, too, which you have to spend some
time reading the manual to get.  The clustering stuff is _yet
another_ table type, with subtly different semantics from other table
types.  As usual, this means that you can blow off your foot by
mixing table types in a transaction.  As near as I can tell, the
cluster table type (I disremember the name of it) cannot be run in
strict mode, either.

To answer the OP's question, you can do some "cluster-like" things by
doing hardware clustering -- two machines attached to a RAID with
some sort of hardware fail-over in place.

I think that the MySQL cluster stuff wasn't intended as an HA
feature, though (although they might well be selling it that way).
It was a way to scale many small systems for certain kinds of
workloads.  My impression is that in most cases, it's a SQL-ish
solution to a problem where someone decided to use the SQL nail
because that's the hammer they had.  I can think of ways you could
use it, and I'm not surprised that Some Giant Corp is doing so.  But
I'd be astonished if someone used it for truly valuable data.  I
would think very hard about the qualifications of someone who
proposed using it for financial data.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Users never remark, "Wow, this software may be buggy and hard
to use, but at least there is a lot of code underneath."
        --Damien Katz

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Vivek Khera
On Aug 27, 2007, at 11:04 AM, Andrew Sullivan wrote:

> It was a way to scale many small systems for certain kinds of
> workloads.  My impression is that in most cases, it's a SQL-ish
> solution to a problem where someone decided to use the SQL nail
> because that's the hammer they had.  I can think of ways you could

The underlying table type can be used directly, similarly to how the
"DB" table type is built on the DB library.  The issue is whether you
can do that without some strange licensing problems.

I wouldn't use it for "permanent" data.  It would be awesome for a
cache of the "hot" data in your app, and perhaps even for realized
views that  speed up your interactive app.

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Philip Hallstrom
> Bill Moran <wmoran@potentialtech.com> writes:
>> First off, "clustering" is a word that is too vague to be useful, so
>> I'll stop using it.
>
> Right.  MySQL Cluster, on the other hand, is a very specific technology.
> http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster.html
>
> It is, however, capable of being d*mn fast for read-mostly workloads
> that can fit their whole dataset into RAM --- and with the price of

There are, however, some things that won't work (or won't work well)
using NDB that will drive you crazy.

VARCHARs aren't really varchars -- they are fixed at the maximum length.
There's also a limit on overall row length, which is pretty small (I don't
remember what it is offhand).  Cluster doesn't really enjoy processing
queries with left outer joins, or joins in general -- what takes <1s on a
single mysql instance can take several seconds on the cluster.  Some of
this is because the storage nodes can't do the join themselves, so they
copy all the tables involved to the API nodes for processing.  Even on a
fast network this takes a lot of time.  You can't have a query with two
OR'd LIKE clauses; instead you have to break them into separate queries
and UNION the results (example below).  You can't insert/update/delete
more than 32000 rows at a time.  In practice (and no, I don't understand
why) this sometimes really means more like 10000.
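
To illustrate the OR'd LIKE limitation, here's a minimal sketch (the table
and column names are made up) -- instead of:

  SELECT id, name FROM customers
   WHERE name LIKE 'a%' OR name LIKE 'b%';

you end up writing:

  SELECT id, name FROM customers WHERE name LIKE 'a%'
  UNION
  SELECT id, name FROM customers WHERE name LIKE 'b%';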

Most annoying however is that to make a change to the database schema you
have to shut down all the nodes except one.  Not sure if this is typical
of other systems or not, but it kind of sucks :/

There are other things too, but I don't remember what they are until I build
something that works fine with a single mysql instance and then doesn't on
the cluster...

-philip

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Markus Schiltknecht
Hi,

Bill Moran wrote:
> First off, "clustering" is a word that is too vague to be useful, so
> I'll stop using it.  There's multi-master replication, where every
> database is read-write, then there's master-slave replication, where
> only one server is read-write and the rest are read-only.  You can
> add failover capabilities to master-slave replication.  Then there's
> synchronous replication, where all servers are guaranteed to get
> updates at the same time.  And asynchronous replication, where other
> servers may take a while to get updates.  These descriptions aren't
> really specific to PostgreSQL -- every database replication system
> has to make design decisions about which approaches to support.

Good explanation!

> Synchronous replication is only
> really used when two servers are right next to each other with a
> high-speed link (probably gigabit) between them.

Why is that so? There's certainly very valuable data which would gain
from an inter-continental database system. For money transfers, for
example, I'd rather wait half a second for a round trip around the
world, to make sure the RDBMS does not 'lose' my money.

> PostgreSQL-R is in development, and targeted to allow multi-master,
> asynchronous replication without rewriting your application.  As
> far as I know, it works, but it's still beta.

Sorry, this is nitpicking, but for some reason (see current naming
discussion on -advocacy :-) ), it's "Postgres-R".

Additionally, Postgres-R is considered to be a *synchronous* replication
system, because once you get your commit confirmation, your transaction
is guaranteed to be deliverable and *committable* on all running nodes
(i.e. it's durable and consistent). Or put it another way: asynchronous
systems have to deal with conflicting, but already committed
transactions - Postgres-R does not.

Certainly, this is slightly less restrictive than saying that a
transaction needs to be *committed* on all nodes, before confirming the
commit to the client. But as long as a database session is tied to a
node, this optimization does not alter any transactional semantics. And
despite that limitation, which holds in most real deployments anyway, I
still consider this to be synchronous replication.

[ To get a strictly synchronous system with Postgres-R, you'd have to
delay read only transactions on a node which hasn't applied all remote
transactions, yet. In most cases, that's unwanted. Instead, a consistent
snapshot is enough, just as if the transaction started *before* the
remote ones which still need to be applied. ]

> BTW: does anyone know of a link that describes these high-level concepts?
> If not, I think I'll write this up formally and post it.

Hm.. somewhen before 8.3 was released, we had lots of discussions on
-docs about the "high availability and replication" section of the
PostgreSQL documentation. I'd have liked to add these fundamental
concepts, but Bruce - rightly - wanted to keep focused on existing
solutions. And unfortunately, most existing solutions are async,
single-master. So explaining all these wonderful theoretic concepts only
to state that there are no real solutions would have been silly.

Regards

Markus


Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Bill Moran
In response to Markus Schiltknecht <markus@bluegap.ch>:

> Hi,
>
> Bill Moran wrote:
> > First off, "clustering" is a word that is too vague to be useful, so
> > I'll stop using it.  There's multi-master replication, where every
> > database is read-write, then there's master-slave replication, where
> > only one server is read-write and the rest are read-only.  You can
> > add failover capabilities to master-slave replication.  Then there's
> > synchronous replication, where all servers are guaranteed to get
> > updates at the same time.  And asynchronous replication, where other
> > servers may take a while to get updates.  These descriptions aren't
> > really specific to PostgreSQL -- every database replication system
> > has to make design decisions about which approaches to support.
>
> Good explanation!
>
> > Synchronous replication is only
> > really used when two servers are right next to each other with a
> > high-speed link (probably gigabit) between them.
>
> Why is that so? There's certainly very valuable data which would gain
> from an inter-continental database system. For money transfers, for
> example, I'd rather wait half a second for a round trip around the
> world, to make sure the RDBMS does not 'lose' my money.

While true, I feel those applications are the exception, not the rule.
Most DBs these days are the blogs and the image galleries, etc.  And
those don't need or want the overhead associated with synchronous
replication.

> > PostgreSQL-R is in development, and targeted to allow multi-master,
> > asynchronous replication without rewriting your application.  As
> > far as I know, it works, but it's still beta.
>
> Sorry, this is nitpicking, but for some reason (see current naming
> discussion on -advocacy :-) ), it's "Postgres-R".

Sorry.

> Additionally, Postgres-R is considered to be a *synchronous* replication
> system, because once you get your commit confirmation, your transaction
> is guaranteed to be deliverable and *committable* on all running nodes
> (i.e. it's durable and consistent). Or put it another way: asynchronous
> systems have to deal with conflicting, but already committed
> transactions - Postgres-R does not.

I find that line fuzzy.  It's synchronous for the reason you describe,
but it's asynchronous because a query that has returned successfully
is not _guaranteed_ to be committed everywhere yet.  Seems like we're
dealing with a limitation in the terminology :)

> Certainly, this is slightly less restrictive than saying that a
> transaction needs to be *committed* on all nodes, before confirming the
> commit to the client. But as long as a database session is tied to a
> node, this optimization does not alter any transactional semantics. And
> despite that limitation, which holds in most real deployments anyway, I
> still consider this to be synchronous replication.

This could potentially be a problem on (for example) a web application,
where a particular user's experience may be load-balanced to another
node at any time.  Of course, you just have to write the application
with that knowledge.

> [ To get a strictly synchronous system with Postgres-R, you'd have to
> delay read only transactions on a node which hasn't applied all remote
> transactions, yet. In most cases, that's unwanted. Instead, a consistent
> snapshot is enough, just as if the transaction started *before* the
> remote ones which still need to be applied. ]

Agreed.

> > BTW: does anyone know of a link that describes these high-level concepts?
> > If not, I think I'll write this up formally and post it.
>
> Hm.. somewhen before 8.3 was released, we had lots of discussions on
> -docs about the "high availability and replication" section of the
> PostgreSQL documentation. I'd have liked to add these fundamental
> concepts, but Bruce - rightly - wanted to keep focused on existing
> solutions. And unfortunately, most existing solutions are async,
> single-master. So explaining all these wonderful theoretic concepts only
> to state that there are no real solutions would have been silly.

Someone else posted a link, and the docs look pretty comprehensive at this
point ... enough so that I'm not going to bother writing up my own
explanation.

--
Bill Moran
http://www.potentialtech.com

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Markus Schiltknecht
Hi,

Bill Moran wrote:
> While true, I feel those applications are the exception, not the rule.
> Most DBs these days are the blogs and the image galleries, etc.  And
> those don't need or want the overhead associated with synchronous
> replication.

Uhm.. do blogs and image galleries need replication at all?

I'm thinking more of the business-critical applications, where high
availability is a real demand - and where your data really *should* be
distributed among multiple data centers just to avoid a single point of
failure.

<rant> for most other stuff MySQL is good enough </rant>

> I find that line fuzzy.

Yeah, it is.

> It's synchronous for the reason you describe,
> but it's asynchronous because a query that has returned successfully
> is not _guaranteed_ to be committed everywhere yet.  Seems like we're
> dealing with a limitation in the terminology :)

Certainly! But sync and async replication are so well known and used
terms... on the other hand, I certainly agree that in Postgres-R, the
nodes do not process transactions synchronously, but asynchronously.

Maybe it's really better to speak of eager and lazy replication, as in
some literature (namely the initial Postgres-R paper of Bettina Kemme).

> This could potentially be a problem on (for example) a web application,
> where a particular user's experience may be load-balanced to another
> node at any time.  Of course, you just have to write the application
> with that knowledge.

IMO, such heavily dynamic load-balancing is rarely useful.

With application support, it's easily doable: let the first transaction
on node A query the (global) transaction identifier and after connecting
to the next node B, ask that node to wait until that transaction has committed.

It gets a little harder without application support: the load balancer
would have to keep track of sessions and their last (writing) transaction.

Again, thank you for pointing this out.

Regards

Markus


Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Trent Shipley
On Monday 2007-08-27 08:04, Andrew Sullivan wrote:
> On Sat, Aug 25, 2007 at 11:13:45AM -0400, Tom Lane wrote:
> > In case you hadn't noticed the disconnect between these statements:
> > if they have to be that close together, there *will* be a single point
> > of failure.  Fire in your data center, for instance, will take out every
> > copy of your data.  So as a "high availability" solution I don't find
> > it all that compelling.
>
> Indeed.  There's another issue, too, which you have to spend some
> time reading the manual to get.  The clustering stuff is _yet
> another_ table type, with subtly different semantics from other table
> types.  As usual, this means that you can blow off your foot by
> mixing table types in a transaction.  As near as I can tell, the
> cluster table type (I disremember the name of it) cannot be run in
> strict mode, either.
>
> To answer the OP's question, you can do some "cluster-like" things by
> doing hardware clustering -- two machines attached to a RAID with
> some sort of hardware fail-over in place.
>
> I think that the MySQL cluster stuff wasn't intended as an HA
> feature, though (although they might well be selling it that way).
> It was a way to scale many small systems for certain kinds of
> workloads.  My impression is that in most cases, it's a SQL-ish
> solution to a problem where someone decided to use the SQL nail
> because that's the hammer they had.  I can think of ways you could
> use it, and I'm not surprised that Some Giant Corp is doing so.  But
> I'd be astonished if someone used it for truly valuable data.  I
> would think very hard about the qualifications of someone who
> proposed using it for financial data.
>
If it was developed by Ericsson for telco purposes, then it would be designed
to be wicked fast for OLTP (billing-switching is an OLTP application) with VERY
high uptime and reliability (customers, and on occasion regulators, get angry
when the phones don't work).  It wouldn't matter whether it can be geographically
distributed.  If the switching center catches fire you're hosed anyway.

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Bill Moran
Markus Schiltknecht <markus@bluegap.ch> wrote:
>
> Hi,
>
> Bill Moran wrote:
> > While true, I feel those applications are the exception, not the rule.
> > Most DBs these days are the blogs and the image galleries, etc.  And
> > those don't need or want the overhead associated with synchronous
> > replication.
>
> Uhm.. do blogs and image galleries need replication at all?

Ever read anything on how myspace is laid out?  The big ones need
replication to handle the traffic.

> I'm thinking more of the business-critical applications, where high
> availability is a real demand - and where your data really *should* be
> distributed among multiple data centers just to avoid a single point of
> failure.

Agreed.  Judged by importance, they're bigger.  Judged by raw numbers, I
suspect that they're less prominent than the blogs and image galleries.
I guess it depends on your viewpoint.

> <rant> for most other stuff MySQL is good enough </rant>

I hate when people say that.  Any company that says that is of limited
viability in my opinion.  For one thing, saying it's "good enough" is
setting yourself up for trouble when your company expands its
requirements.

> > It's synchronous for the reason you describe,
> > but it's asynchronous because a query that has returned successfully
> > is not _guaranteed_ to be committed everywhere yet.  Seems like we're
> > dealing with a limitation in the terminology :)
>
> Certainly! But sync and async replication are so well known and used
> terms... on the other hand, I certainly agree that in Postgres-R, the
> nodes do not process transactions synchronously, but asynchronously.

Good point.

> Maybe it's really better to speak of eager and lazy replication, as in
> some literature (namely the initial Postgres-R paper of Bettina Kemme).
>
> > This could potentially be a problem on (for example) a web application,
> > where a particular user's experience may be load-balanced to another
> > node at any time.  Of course, you just have to write the application
> > with that knowledge.
>
> IMO, such heavily dynamic load-balancing is rarely useful.
>
> With application support, it's easily doable: let the first transaction
> on node A query the (global) transaction identifier and after connecting
> to the next node B, ask that node to wait until that transaction has committed.
>
> It gets a little harder without application support: the load balancer
> would have to keep track of sessions and their last (writing) transaction.
>
> Again, thank you for pointing this out.

Keep up the good work.  I wish I had some spare cycles to work on
Postgres-R -- it seems like a very interesting problem.

--
Bill Moran
http://www.potentialtech.com

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: "chris smith"
> Ever read anything on how myspace is laid out?  The big ones need
> replication to handle the traffic.

Actually no.

http://highscalability.com/livejournal-architecture

"Using MySQL replication only takes you so far." (Yeh it's mysql but
the point is valid regardless).
"You can't keep adding read slaves and scale."

A lot use sharding now to keep scaling (limiting to "X" users/accounts
per database system and just keep adding more database servers for the
next "X" accounts).


Myspace info here:

http://highscalability.com/myspace-architecture

At 3 million users:

- split its user base into chunks of 1 million accounts and put all
the data keyed to those accounts in a separate instance of SQL Server

I'm sure there's replication behind the scenes to help with
read-queries but it's definitely not a magic wand that will fix
everything.

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Bill Moran
"chris smith" <dmagick@gmail.com> wrote:
>
> > Ever read anything on how myspace is laid out?  The big ones need
> > replication to handle the traffic.
>
> Actually no.
>
> http://highscalability.com/livejournal-architecture
>
> "Using MySQL replication only takes you so far." (Yeh it's mysql but
> the point is valid regardless).
> "You can't keep adding read slaves and scale."
>
> A lot use sharding now to keep scaling (limiting to "X" users/accounts
> per database system and just keep adding more database servers for the
> next "X" accounts).

I got the impression that they hadn't moved _all_ of their DB needs to
sharding.  Just the ones that exceeded the scalability of replication,
but they don't explicitly say, IIRC.

--
Bill Moran
http://www.potentialtech.com

Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)

From: Ron Johnson
On 09/01/07 08:12, chris smith wrote:
>> Ever read anything on how myspace is laid out?  The big ones need
>> replication to handle the traffic.
>
> Actually no.
>
> http://highscalability.com/livejournal-architecture
>
> "Using MySQL replication only takes you so far." (Yeh it's mysql but
> the point is valid regardless).
> "You can't keep adding read slaves and scale."
>
> A lot use sharding now to keep scaling (limiting to "X" users/accounts
> per database system and just keep adding more database servers for the
> next "X" accounts).

Hmmm.  Horizontally partitioning your "database" into multiple
physical databases is 10+ years old.  At least.  This is how DEC
implemented the billing database for DirecTV, and how we implemented
*large* toll systems in the US Northeast.

In addition to the account databases, you need a "reference"
database for tables that can't be partitioned by account, the ability
to run queries across databases, and middleware that knows how to
direct transactions to the correct database.
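
As a sketch (the table and column names here are only illustrative), the
"reference" database typically carries something like a shard map that the
middleware consults before opening a connection:

  CREATE TABLE shard_map (
      shard_id     integer PRIMARY KEY,
      min_account  bigint  NOT NULL,
      max_account  bigint  NOT NULL,
      conninfo     text    NOT NULL   -- e.g. 'host=db03 dbname=accounts'
  );

  -- Which physical database holds account 1234567?
  SELECT conninfo
    FROM shard_map
   WHERE 1234567 BETWEEN min_account AND max_account;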

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
