Re: Bigtime scaling of Postgresql (cluster and stuff I suppose) - Mailing list pgsql-general

From Markus Schiltknecht
Subject Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)
Msg-id 46D448D0.4000301@bluegap.ch
In response to Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)  (Bill Moran <wmoran@potentialtech.com>)
Responses Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)  (Bill Moran <wmoran@potentialtech.com>)
List pgsql-general
Hi,

Bill Moran wrote:
> While true, I feel those applications are the exception, not the rule.
> Most DBs these days are the blogs and the image galleries, etc.  And
> those don't need or want the overhead associated with synchronous
> replication.

Uhm.. do blogs and image galleries need replication at all?

I'm thinking more of business-critical applications, where high
availability is a real requirement - and where your data really *should*
be distributed among multiple data centers, just to avoid a single point
of failure.

<rant> for most other stuff MySQL is good enough </rant>

> I find that line fuzzy.

Yeah, it is.

> It's synchronous for the reason you describe,
> but it's asynchronous because a query that has returned successfully
> is not _guaranteed_ to be committed everywhere yet.  Seems like we're
> dealing with a limitation in the terminology :)

Certainly! But sync and async replication are such well-known and widely
used terms... On the other hand, I certainly agree that in Postgres-R, the
nodes do not process transactions synchronously, but asynchronously.

Maybe it's really better to speak of eager and lazy replication, as some
of the literature does (namely the initial Postgres-R paper by Bettina
Kemme).

> This could potentially be a problem on (for example) a web application,
> where a particular user's experience may be load-balanced to another
> node at any time.  Of course, you just have to write the application
> with that knowledge.

IMO, such heavily dynamic load-balancing is rarely useful.

With application support, it's easily doable: let the first transaction
on node A query its (global) transaction identifier and, after connecting
to the next node B, ask that node to wait until that transaction has
committed there.
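A rough sketch of that handshake, just to illustrate the idea. The Node
class, its method names and the "txn-1001" identifier format are all made
up for this example; this is a toy in-memory model, not an actual
Postgres-R interface:

```python
import threading


class Node:
    """Toy stand-in for one replica; NOT a real Postgres-R interface."""

    def __init__(self, name):
        self.name = name
        self._committed = set()          # global transaction ids applied here
        self._cond = threading.Condition()

    def apply_commit(self, gtid):
        """Called when a (local or replicated) transaction commits here."""
        with self._cond:
            self._committed.add(gtid)
            self._cond.notify_all()

    def wait_for_commit(self, gtid, timeout=None):
        """Block until gtid has committed here; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: gtid in self._committed, timeout
            )


node_a, node_b = Node("A"), Node("B")

# The session writes on node A and remembers the global transaction id.
gtid = "txn-1001"  # hypothetical identifier format
node_a.apply_commit(gtid)

# The replicated commit reaches node B a little later.
threading.Timer(0.05, node_b.apply_commit, args=(gtid,)).start()

# After being moved to node B, the session waits before it reads.
caught_up = node_b.wait_for_commit(gtid, timeout=2.0)
```

The timeout is there so that a session is not blocked forever if node B
has fallen far behind; what to do in that case (retry, go back to node A)
is up to the application.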

It gets a little harder without application support: the load balancer
would have to keep track of sessions and their last (writing) transaction.
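Roughly, the bookkeeping in the load balancer could look like the
following. Again, SessionTracker and the identifiers are hypothetical
names for illustration only; a real balancer would also have to learn the
transaction ids from somewhere, which is the hard part:

```python
class SessionTracker:
    """Remembers, per session, the last writing transaction it issued."""

    def __init__(self):
        self._last_write = {}   # session id -> global transaction id

    def record_write(self, session_id, gtid):
        self._last_write[session_id] = gtid

    def must_wait_for(self, session_id, node_committed):
        """Return the gtid the target node still lacks, or None if safe."""
        gtid = self._last_write.get(session_id)
        if gtid is None or gtid in node_committed:
            return None
        return gtid


tracker = SessionTracker()
committed_on_b = {"txn-1"}            # what node B has applied so far

tracker.record_write("sess-42", "txn-2")
# Node B is behind: the balancer must delay or reroute this session.
pending = tracker.must_wait_for("sess-42", committed_on_b)

committed_on_b.add("txn-2")           # replication catches up
pending_later = tracker.must_wait_for("sess-42", committed_on_b)
```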

Again, thank you for pointing this out.

Regards

Markus

