Thread: HA and Replication - how to choose among all the available solutions

HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

30 November 2007, 20:31:02

Hello

I am facing a probably very common problem. I made a search in the
recent archives and could find many posts related to my issue. But I did
not get exactly "the answer" to my question.
How can I chose among the different existing tools to enable High
Availability and Load-balancing/Replication.
In fact our situation is common. We have a DB that is growing and we
need to ensure High-availability.
That needs some standby solution.
On the other hand we have few updates and many reads and when we are
trying some simulations, the number of requests may become too high for
our server (I mean requests fast - less than 4ms with explain run-
starts to be logged with a threshold to 200ms) so we have thought to
replication to balance the request load among several machines. We can
afford some delay among the DB (which means sync replication is welcome
but not mandatory and should probably be avoided if we want to improve
the perfs).
I had mainly a look at several solutions like:
PGCluster - but I could not get some fresh stuff
Slony-I - seems mature and used
PgPool-II - active and interesting project
PostgreSQL-R - was not able to test it.
C-JDBC/Sequoia.

I mainly focused on PgPool-II and Slony which seemed to be active projects.
I also looked at the Log-Shipping section in PostgreSQL documentation to
ensure high-availability.

Concerning the high-availability issue, I can in fact imagine using
log-shipping, or Slony-I or PgPool-II in multi-master mode with only one
server available.
But I don't know how to chose a solution. I don't have much experience
and can't see precisely the advantages and the drawbacks of each solution.

Concerning Scalability, I think I would need PgPool-II connected in
front to at least load balance select to the different servers. But I
can use the "multi-master" solution or rely on Slony to replicate data
while PgPool balances the select and only sends updates to the Master.
In fact that last solution makes me more comfortable because I fear that
multi-master can lead to complex conflict resolution while a single
master avoids such concern.
With a single master I can't have any data mismatch case I was able to
simulate with PgPool-II (inserting manually in a DB a duplicate value or
adding a trigger to simulate an exception).
What is the experience concerning that kind of question?

Mixing both Scalability and High-availability: with PgPool-II running in
"multi-master" solution, no real issue. With single master it is less
obvious :(
I had imagined a mixed solution using a Master relying on a warm standby
using log-shipping and Linux-HB while Replicated slaves would be
replicated using Slony and load balancing would be managed by PgPool-II.
But perhaps it is stupid and using only PgPool-II is safe enough.
And why should I chose Slony instead of Log-shipping to update the slaves ?

The hardest in the situation is to make the (or at least a) good choice
among all the available solutions and to chose a realistic one

That are many questions in this quite long post.
Thanks for any help or advice you could give me.

Pascal

Re: HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

01 December 2007, 13:02:56

By the way I found also another tool called CyberCluster that will
probably not make my choice easier ;)

Re: HA and Replication - how to choose among all the available solutions

From

Andrew Sullivan

Date:

03 December 2007, 06:32:12

On Fri, Nov 30, 2007 at 10:30:47PM +0100, Pascal Cohen wrote:
> I am facing a probably very common problem. I made a search in the
> recent archives and could find many posts related to my issue. But I did
> not get exactly "the answer" to my question.

No, and I doubt you will.

> But I don't know how to chose a solution. I don't have much experience
> and can't see precisely the advantages and the drawbacks of each solution.

That's because, quite frankly, it's hard to answer that question in general.
You have to answer it under your circumstances.  Since you're the one who
knows those, you'll have to do the answering, I'm afraid.  This probably
means "do some tests".

> And why should I chose Slony instead of Log-shipping to update the slaves ?

This is a good example (and the only thing in your list I feel comfortable
talking about): if you think that you might find it handy to be able to
query the replicas, use Slony, because you can't do it with log shipping
(yet).

A

Re: HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

03 December 2007, 07:16:56

> On Fri, Nov 30, 2007 at 10:30:47PM +0100, Pascal Cohen wrote:
>
>> I am facing a probably very common problem. I made a search in the
>> recent archives and could find many posts related to my issue. But I did
>> not get exactly "the answer" to my question.
>>
>
> No, and I doubt you will.
>
I don't why I had such a feeling ;)
Thanks for your answer. It was an open question and your answer
concerning Log-Shipping/Slony is very helpful.
>
>> But I don't know how to chose a solution. I don't have much experience
>> and can't see precisely the advantages and the drawbacks of each solution.
>>
>
> That's because, quite frankly, it's hard to answer that question in general.
> You have to answer it under your circumstances.  Since you're the one who
> knows those, you'll have to do the answering, I'm afraid.  This probably
> means "do some tests".
>
That is what I am currently doing.
My purpose is to study (we are absolutely not in a hurry so I have time
to evaluate all the cases) how to handle:
High-availability and scalability in our environment where there are
many reads and few writes.

* For High-availability, I mainly studied PgPool and Log-shipping (and
in fact forgot Slony).
Until now I feel more comfortable with Log-shipping because it seems
safer (I am not sure I can't get some problems with sequences and
current_timestamp with PgPool - it is even mentioned in the doc).

* For Scalability, I had a look to sharding, and tried to look at
replication/load-balancing.
For load-balancing, I really appreciated PgPool, but again the statement
replication does not make me really comfortable so I studied two ways:
Synchronous replication and asynchronous replication.
I think that synchronous synchronization is a nice to have feature but
absolutely not mandatory. So if I am concerned by scalability, I think
we can avoid overload caused by synchronous solution.
For now I think using PgPool as a load-balancer while Slony handles
Master-Slave replication seems a good solution.
But I will anyway have a look to CyberCluster.

>> And why should I chose Slony instead of Log-shipping to update the slaves ?
>>
>
> This is a good example (and the only thing in your list I feel comfortable
> talking about): if you think that you might find it handy to be able to
> query the replicas, use Slony, because you can't do it with log shipping
> (yet).
>
Do you mean that Log-shipping is well for a Standby server not handling
request while Slony is good with "select" load balancing ?
Why and what limits Log-shipping for that case ?

Thanks for the answer!

Pascal

Re: HA and Replication - how to choose among all the available solutions

From

Andrew Sullivan

Date:

03 December 2007, 07:43:56

On Mon, Dec 03, 2007 at 09:16:40AM +0100, Pascal Cohen wrote:
>
> * For High-availability, I mainly studied PgPool and Log-shipping (and
> in fact forgot Slony).
> Until now I feel more comfortable with Log-shipping because it seems
> safer (I am not sure I can't get some problems with sequences and
> current_timestamp with PgPool - it is even mentioned in the doc).

Slony can do most things the current log shipping code can do (but it does
_not_ solve DDL changes automatically; you must control that differently).

Slony imposes a load on your origin server that log shipping does not.

> Do you mean that Log-shipping is well for a Standby server not handling
> request while Slony is good with "select" load balancing ?
> Why and what limits Log-shipping for that case ?

When you ship logs, you more or less have the "replica" in recovery mode all
the time (as though you pulled the plug on the server).  So it is always
recovering the shipped logs.  This is also how Oracle did this for years,
although they've recently addressed the limitation.  There is some promise
of doing this in Postgres in future, but it's not done yet.

Also, have a look at hardware answers for HA.  That's the real current
answer for always-on service, if you ask me.

A

Re: HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

03 December 2007, 07:58:02

On Mon, Dec 03, 2007 at 09:16:40AM +0100, Pascal Cohen wrote:
>> * For High-availability, I mainly studied PgPool and Log-shipping (and
>> in fact forgot Slony).
>> Until now I feel more comfortable with Log-shipping because it seems
>> safer (I am not sure I can't get some problems with sequences and
>> current_timestamp with PgPool - it is even mentioned in the doc).
>>
>
> Slony can do most things the current log shipping code can do (but it does
> _not_ solve DDL changes automatically; you must control that differently).
>
> Slony imposes a load on your origin server that log shipping does not.
>
I am not concerned with warm DDL changes (I would say currently because
one never knows ;) ).
In fact, if my master server only handles updates while request are
handled by one or more slaves, I can probably accept some overload on it
or if I adjust weights between Master and Slaves through PgPool, I
should be able to accept some load on Master.

>
>> Do you mean that Log-shipping is well for a Standby server not handling
>> request while Slony is good with "select" load balancing ?
>> Why and what limits Log-shipping for that case ?
>>
>
> When you ship logs, you more or less have the "replica" in recovery mode all
> the time (as though you pulled the plug on the server).  So it is always
> recovering the shipped logs.  This is also how Oracle did this for years,
> although they've recently addressed the limitation.  There is some promise
> of doing this in Postgres in future, but it's not done yet.
>
> Also, have a look at hardware answers for HA.  That's the real current
> answer for always-on service, if you ask me.
>
What kind of hardware solutions do you know ? - I will look on my own
what I can find.

Thanks

Pascal

Re: HA and Replication - how to choose among all the available solutions

From

Andrew Sullivan

Date:

03 December 2007, 13:43:45

On Mon, Dec 03, 2007 at 09:57:30AM +0100, Pascal Cohen wrote:
> >
> What kind of hardware solutions do you know ? - I will look on my own
> what I can find.

Have a look at the discussion in the 8.3 manual, about shared disk and block
level replication:

http://www.postgresql.org/docs/8.3/static/high-availability.html

A

Re: HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

03 December 2007, 14:16:20

Andrew Sullivan wrote:
> On Mon, Dec 03, 2007 at 09:57:30AM +0100, Pascal Cohen wrote:
>
>>>
>>>
>> What kind of hardware solutions do you know ? - I will look on my own
>> what I can find.
>>
>
> Have a look at the discussion in the 8.3 manual, about shared disk and block
> level replication:
>
> http://www.postgresql.org/docs/8.3/static/high-availability.html
>
> A
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>
Thanks
I had a look to that page. Shared disk seems nice, but I suppose one
needs fast connections like fiber channels ? What is the impact on I/O
and general DB perfs ? - Sorry if my question is stupid - not an expert
on such things.
And I suppose it is not very cheap ;)

Re: HA and Replication - how to choose among all the available solutions

From

Andrew Sullivan

Date:

03 December 2007, 17:13:36

On Mon, Dec 03, 2007 at 04:16:04PM +0100, Pascal Cohen wrote:
> needs fast connections like fiber channels ? What is the impact on I/O
> and general DB perfs ? - Sorry if my question is stupid - not an expert
> on such things.
> And I suppose it is not very cheap ;)

Well, if you think you're going to get high performance and super
redundancy on cheap hardware, you need to think again.  ECC memory isn't
cheap either, but if you care about your data, you have to decide whether
it really needs to be protected.

No, you don't necessarily need fibre.  Some people do this over copper-based
LAN links.

A

Re: HA and Replication - how to choose among all the available solutions

From

Pascal Cohen

Date:

03 December 2007, 19:28:44

Andrew Sullivan wrote:
> On Mon, Dec 03, 2007 at 04:16:04PM +0100, Pascal Cohen wrote:
>
>> needs fast connections like fiber channels ? What is the impact on I/O
>> and general DB perfs ? - Sorry if my question is stupid - not an expert
>> on such things.
>> And I suppose it is not very cheap ;)
>>
>
> Well, if you think you're going to get high performance and super
> redundancy on cheap hardware, you need to think again.  ECC memory isn't
> cheap either, but if you care about your data, you have to decide whether
> it really needs to be protected.
>
> No, you don't necessarily need fibre.  Some people do this over copper-based
> LAN links.
>
>
That is not what I meant.
For example if my DB is 2Gb large, I probably don't need 32Gb RAM.
The same way, if I can get a fast and reliable system with copper based
LAN and not Fiber-Channel, this can save some money that can be used
somewhere else for better use (CPU/RAM for example).
Currently I am not concerned with prices. I am trying to gather as much
information as I can in order to be able to give good "recommendations"
or advices when I could be asked for.
So there are things I can test (mainly softwares) but testing hardware
is less easy.

Thanks

Pascal