Home > mailing lists

Thread: Real application clustering in postgres.

Real application clustering in postgres.

From

Daulat Ram

Date:

05 March 2020, 07:45:35

Hi team,

Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a RAC feature.

What about multi-master replication in Postgres. would you please suggest how it is useful and how can setup it.

Thanks.

Re: Real application clustering in postgres.

From

Laurenz Albe

Date:

05 March 2020, 12:07:13

On Thu, 2020-03-05 at 07:45 +0000, Daulat Ram wrote:
> Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a  RAC
feature.

No, and as far as I know nobody feels interested in providing it.

RAC is a complicated architecture that doesn't do much good, so most
people feel that it would be a waste of time and effort.

RAC ist not really a scaling solution: because of the shared storage,
you can only scale for more CPUs; I/O remains the bottleneck.

RAC is not really a high availability solution: because of the shared
storage, it has a sibgle point of failure.

Today, people use shared-nothing architectures for high avaliability,
like Patroni.

> What about multi-master replication in Postgres. would you please suggest how it is useful and how can setup it.

There is no support for that in core PostgreSQL.

There is a closed-source implementation that you can buy:
https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/

But multi-master replication is complicated to get right, and
an applicatoin that uses it has to be specifically designed for that.
Very often a single-master failover solution is a better solution.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

RE: Real application clustering in postgres.

From

Daulat Ram

Date:

05 March 2020, 16:48:30

Thanks for your inputs Laurenz Albe.

Would you please explain single-master failover solution.

Suppose we have promoted  standby (replica) as master after the h/w issue at Master. 
If after few hours we recovered  the h/w then how we can switchback on the old primary. . 

As in Oracle we have switchover method for Dataguard. How we can do in Postgres.

Thanks,

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Thursday, March 5, 2020 5:37 PM
To: Daulat Ram <Daulat.Ram@exponential.com>; pgsql-general@lists.postgresql.org
Subject: Re: Real application clustering in postgres.

On Thu, 2020-03-05 at 07:45 +0000, Daulat Ram wrote:
> Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a  RAC
feature.

No, and as far as I know nobody feels interested in providing it.

RAC is a complicated architecture that doesn't do much good, so most people feel that it would be a waste of time and
effort.

RAC ist not really a scaling solution: because of the shared storage, you can only scale for more CPUs; I/O remains the
bottleneck.

RAC is not really a high availability solution: because of the shared storage, it has a sibgle point of failure.

Today, people use shared-nothing architectures for high avaliability, like Patroni.

> What about multi-master replication in Postgres. would you please suggest how it is useful and how can setup it.

There is no support for that in core PostgreSQL.

There is a closed-source implementation that you can buy:
https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/

But multi-master replication is complicated to get right, and an applicatoin that uses it has to be specifically
designedfor that.

Very often a single-master failover solution is a better solution.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: Real application clustering in postgres.

From

Virendra Kumar

Date:

05 March 2020, 17:06:17

Failover is easy but failback is little bit tricky.

I have implemented failback by doing following steps:

1. Start original primary which will be doing crash recovery. It should be designed in such a way that once it is up application should not start connecting to it otherwise there will be split brain and data-mistach between two instances. I implemented it by using a virtual IP mounting on server which is actual primary using keepalived.

2. Shutdown original primary and do a pg_rewind to make that as slave for new primary.

3. Once slave (original primary) is caught up with primary do failback

4. Repeat steps #1-#3 to make failed over instance slave again.

Regards,

Virendra

On Thursday, March 5, 2020, 8:48:54 AM PST, Daulat Ram <daulat.ram@exponential.com> wrote:

Thanks for your inputs Laurenz Albe.

Would you please explain single-master failover solution.

Suppose we have promoted standby (replica) as master after the h/w issue at Master.
If after few hours we recovered the h/w then how we can switchback on the old primary. .

As in Oracle we have switchover method for Dataguard. How we can do in Postgres.

Thanks,

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at>
Sent: Thursday, March 5, 2020 5:37 PM
To: Daulat Ram <Daulat.Ram@exponential.com>; pgsql-general@lists.postgresql.org
Subject: Re: Real application clustering in postgres.

On Thu, 2020-03-05 at 07:45 +0000, Daulat Ram wrote:
> Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a RAC feature.

No, and as far as I know nobody feels interested in providing it.

RAC is a complicated architecture that doesn't do much good, so most people feel that it would be a waste of time and effort.

RAC ist not really a scaling solution: because of the shared storage, you can only scale for more CPUs; I/O remains the bottleneck.

RAC is not really a high availability solution: because of the shared storage, it has a sibgle point of failure.

Today, people use shared-nothing architectures for high avaliability, like Patroni.

> What about multi-master replication in Postgres. would you please suggest how it is useful and how can setup it.

There is no support for that in core PostgreSQL.

There is a closed-source implementation that you can buy:
https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/

But multi-master replication is complicated to get right, and an applicatoin that uses it has to be specifically designed for that.
Very often a single-master failover solution is a better solution.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: Real application clustering in postgres.

From

Laurenz Albe

Date:

06 March 2020, 08:59:33

On Thu, 2020-03-05 at 17:06 +0000, Virendra Kumar wrote:
> Failover is easy but failback is little bit tricky.
> I have implemented failback by doing following steps:
> 
> 1. Start original primary which will be doing crash recovery. It should be designed in such a way that once it is up
applicationshould not start connecting to it otherwise there will be split brain
 
> and data-mistach between two instances. I implemented it by using a virtual IP mounting on server which is actual
primaryusing keepalived.
 
> 2. Shutdown original primary and do a pg_rewind to make that as slave for new primary.
> 3. Once slave (original primary) is caught up with primary do failback
> 4. Repeat steps #1-#3 to make failed over instance slave again.

Or you use Patroni and simply say

  patronictl.py switchover --candidate db2 --scheduled '2020-03-06 12:00:00' mycluster

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Re: Real application clustering in postgres.

From

Ron

Date:

06 March 2020, 09:25:31

On 3/5/20 6:07 AM, Laurenz Albe wrote:

On Thu, 2020-03-05 at 07:45 +0000, Daulat Ram wrote:

Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a  RAC feature.

No, and as far as I know nobody feels interested in providing it.

RAC is a complicated architecture that doesn't do much good, so most
people feel that it would be a waste of time and effort.

RAC ist not really a scaling solution: because of the shared storage,
you can only scale for more CPUs; I/O remains the bottleneck.

RAC is not really a high availability solution: because of the shared
storage, it has a sibgle point of failure.

This is utter nonsense. Dual redundant storage controllers connected to disks in RAID-10 configurations have been around for at least 25 years.

Oracle got it's clustering technology from DEC, and I know that works. Cluster members, storage controllers and disks have all gone down, while the database and application keep on humming along.

--
Angular momentum makes the world go 'round.

Re: Real application clustering in postgres.

From

Laurenz Albe

Date:

06 March 2020, 14:55:27

On Fri, 2020-03-06 at 03:25 -0600, Ron wrote:
> > RAC is not really a high availability solution: because of the shared
> > storage, it has a sibgle point of failure.   
> 
> This is utter nonsense.  Dual redundant storage controllers
> connected to disks in RAID-10 configurations have been around for at
> least 25 years.
> 
> Oracle got it's clustering technology from DEC, and I know
> that works.  Cluster members, storage controllers and disks have all
> gone down, while the database and application keep on humming along.

I am not saying that it is buggy, it is limited by design.

If you have mirrored disks, and you write junk (e.g, because of
a flaw in a fibre channel cable, something I have witnessed),
then you have two perfectly fine copies of the junk.

I am not saying the (physical) disk is the single point of failure, the
(logical) file system is (Oracle calls it ASM / tablespace, but it is
still a file system).

Yours,
Laurenz Albe

Re: Real application clustering in postgres.

From

Ravi Krishna

Date:

06 March 2020, 15:58:18

>
> If you have mirrored disks, and you write junk (e.g, because of
> a flaw in a fibre channel cable, something I have witnessed),
> then you have two perfectly fine copies of the junk.
>

Few years ago didn't this happen to Salesforce where a firmware bug corrupted the Disk, resulting in corruption of
Oracletablespace blocks 
and all RAC nodes were equally useless since all of them read from the same disk.  Salesforce lost 4 hours worth of
data.

Re: Real application clustering in postgres.

From

Ron

Date:

06 March 2020, 16:56:10

On 3/6/20 8:55 AM, Laurenz Albe wrote:
> On Fri, 2020-03-06 at 03:25 -0600, Ron wrote:
>>> RAC is not really a high availability solution: because of the shared
>>> storage, it has a sibgle point of failure.
>> This is utter nonsense.  Dual redundant storage controllers
>> connected to disks in RAID-10 configurations have been around for at
>> least 25 years.
>>
>> Oracle got it's clustering technology from DEC, and I know
>> that works.  Cluster members, storage controllers and disks have all
>> gone down, while the database and application keep on humming along.
> I am not saying that it is buggy, it is limited by design.
>
> If you have mirrored disks, and you write junk (e.g, because of
> a flaw in a fibre channel cable, something I have witnessed),
> then you have two perfectly fine copies of the junk.

Why do you have just one FC path?

> I am not saying the (physical) disk is the single point of failure, the
> (logical) file system is (Oracle calls it ASM / tablespace, but it is
> still a file system).

Why isn't the filesystem (or RDBMS) throwing checksum errors?  This was 
standard stuff in legacy Enterprise RDBMSs 20 years ago.

-- 
Angular momentum makes the world go 'round.

Re: Real application clustering in postgres.

From

Andrew Kerber

Date:

06 March 2020, 17:19:46

Yup, if you need true shared storage, Oracle RAC is still the only solution out there, All the other multi-master solutions work by capturing the sql statements themselves. And properly configured it RAC is indeed part of an HA solution. Any time you have everything in a single data center, as you do with basic RAC, you do have a potential single point of failure however, a tornado for example can take out the entire data center. There are solutions like stretch RAC that can account for that issue, as well as data guard. Oracle also has a nice DR feature called Dataguard, which creates an exact duplicate of your running database..The nice point of oracle dataguard is that it is a block by block copy, while all of the Postgres Multi-Master and master-slave replication solutions work by SQL capture.

Potentially DRDB could be used to set up a standby database for postgres, though it is considerably more involved to create than Oracle data guard.

With the advent of virtualization and the concomitant ease of adding CPU and memory to a running database instance, the use cases for true Oracle RAC are much less than in the past, scalability is no longer nearly the issue it used to be, and with the speed of failover at the virtual guest level, the HA requirements are much less substantial than in the past. There are probably still some use cases where the down time required for patching is not acceptable, in which case RAC is still required. The same is true for Postgres or any other Multi-Master instances of course, unless the intent of clustering/multi-master is availability instead of scalability, there isnt much point considering the ease of adding resources to virtual servers.

On Fri, Mar 6, 2020 at 10:56 AM Ron <ronljohnsonjr@gmail.com> wrote:

On 3/6/20 8:55 AM, Laurenz Albe wrote:
> On Fri, 2020-03-06 at 03:25 -0600, Ron wrote:
>>> RAC is not really a high availability solution: because of the shared
>>> storage, it has a sibgle point of failure.
>> This is utter nonsense. Dual redundant storage controllers
>> connected to disks in RAID-10 configurations have been around for at
>> least 25 years.
>>
>> Oracle got it's clustering technology from DEC, and I know
>> that works. Cluster members, storage controllers and disks have all
>> gone down, while the database and application keep on humming along.
> I am not saying that it is buggy, it is limited by design.
>
> If you have mirrored disks, and you write junk (e.g, because of
> a flaw in a fibre channel cable, something I have witnessed),
> then you have two perfectly fine copies of the junk.

Why do you have just one FC path?

> I am not saying the (physical) disk is the single point of failure, the
> (logical) file system is (Oracle calls it ASM / tablespace, but it is
> still a file system).

Why isn't the filesystem (or RDBMS) throwing checksum errors? This was
standard stuff in legacy Enterprise RDBMSs 20 years ago.

--
Angular momentum makes the world go 'round.

Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'

Re: Real application clustering in postgres.

From

Jeremy Schneider

Date:

06 March 2020, 18:32:07

On 3/6/20 01:25, Ron wrote:
> On 3/5/20 6:07 AM, Laurenz Albe wrote:
>> On Thu, 2020-03-05 at 07:45 +0000, Daulat Ram wrote:
>>> Is there any possibility/options to setup a real application clustering in Postgres as in Oracle we have a  RAC
feature.
>> No, and as far as I know nobody feels interested in providing it.
>>
>> RAC is a complicated architecture that doesn't do much good, so most
>> people feel that it would be a waste of time and effort.
>>
>> RAC ist not really a scaling solution: because of the shared storage,
>> you can only scale for more CPUs; I/O remains the bottleneck.
>> RAC is not really a high availability solution: because of the shared
>> storage, it has a sibgle point of failure.
> 
> This is utter nonsense.  Dual redundant storage controllers connected to
> disks in RAID-10 configurations have been around for *at least* 25 years.
> 
> Oracle got it's clustering technology from DEC, and I *know* that
> works.  Cluster members, storage controllers and disks have all gone
> down, while the database and application keep on humming along.

Just want to point out that while the merits of RAC are considered as
universal truth in the Oracle marketing community, they are certainly
still debated and nuanced within the Oracle technical community.

Some great reading would be Mogens Norgaard's article "You Probably
Don't Need RAC" from 2003 (which is surprisingly hard to find on the
internet) and much of the ensuing discussion between then and now, for
example this recent message over on the oracle-l list:

https://www.freelists.org/post/oracle-l/Chuckleworthy-issue-of-the-NoCOUG-Journal,1

For my part, I spent many years running RAC clusters and solving the
usual handful of problems you bump into, and I definitely have a strong
bias now toward as simple of architectures as possible.  Regardless of
which parties participate in your operations, and regardless of who owns
the data center where your stack is running.  Note that I apply this to
all the new databases as well. I enjoy and appreciate the opportunities
I've had to work on some really interesting new technology - but I also
still see merit in simple, boring, mature, well-understood architectures
if they are doing the job. Complexity will find us all soon enough
driven by true business needs without being helped by the pursuit of
shiny things!

It seemed to me there was a major and well-funded effort to market and
sell cluster databases for many years, and as a result I suspect that
while there are certainly some good use cases, there are probably also
some people using RAC today who would do fine (or better) without it.

Seems I even touched on this all the way back in 2007 in an article on
my own blog...  https://ardentperf.com/2007/05/10/fidelity-case-study/

-Jeremy

-- 
http://about.me/jeremy_schneider

Re: Real application clustering in postgres.

From

"Peter J. Holzer"

Date:

08 March 2020, 20:13:16

On 2020-03-06 15:55:27 +0100, Laurenz Albe wrote:
> On Fri, 2020-03-06 at 03:25 -0600, Ron wrote:
> > > RAC is not really a high availability solution: because of the shared
> > > storage, it has a sibgle point of failure.
> >
> > This is utter nonsense.  Dual redundant storage controllers
> > connected to disks in RAID-10 configurations have been around for at
> > least 25 years.
> >
> > Oracle got it's clustering technology from DEC, and I know
> > that works.  Cluster members, storage controllers and disks have all
> > gone down, while the database and application keep on humming along.
>
> I am not saying that it is buggy, it is limited by design.
>
> If you have mirrored disks, and you write junk (e.g, because of
> a flaw in a fibre channel cable, something I have witnessed),
> then you have two perfectly fine copies of the junk.

I have certainly seen enterprise SAN boxes go down (or deliver corrupted
data) because of controller or firmware problems or just because a
second disk in a RAID-5 failed before the spare could be brought online.

But to be fair, a master/slave setup a la patroni isn't immune against
"writing junk" either: Not on the hardware level (either of the nodes
may have faulty hardware, and you may not notice it until too late), and
more importantly, not on the software level. An erroneus DML statement
(because of a bug in the application, or because the user/admin made a
mistake) will cause the same wrong data to be distributed to all nodes
(of course this also applies to RAC).

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

signature.asc

Re: Real application clustering in postgres.

From

Christoph Moench-Tegeder

Date:

08 March 2020, 23:01:50

## Andrew Kerber (andrew.kerber@gmail.com):

>  The nice point of oracle
> dataguard is that it is a block by block copy, while all of the Postgres
> Multi-Master and master-slave replication solutions work by SQL capture.

https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION
I wouldn't exactly call our streaming replication "SQL capture".

Regards,
Christoph

-- 
Spare Space

Re: Real application clustering in postgres.

From

Laurenz Albe

Date:

09 March 2020, 08:52:49

On Fri, 2020-03-06 at 10:56 -0600, Ron wrote:
> > > > RAC is not really a high availability solution: because of the shared
> > > > storage, it has a sibgle point of failure.
> > > This is utter nonsense.  Dual redundant storage controllers
> > > connected to disks in RAID-10 configurations have been around for at
> > > least 25 years.
> > > 
> > > Oracle got it's clustering technology from DEC, and I know
> > > that works.  Cluster members, storage controllers and disks have all
> > > gone down, while the database and application keep on humming along.
> >
> > I am not saying that it is buggy, it is limited by design.
> > 
> > If you have mirrored disks, and you write junk (e.g, because of
> > a flaw in a fibre channel cable, something I have witnessed),
> > then you have two perfectly fine copies of the junk.
> 
> Why do you have just one FC path?

We didn't.
It just happened that the cable that the data were sent over was buggy.

> > I am not saying the (physical) disk is the single point of failure, the
> > (logical) file system is (Oracle calls it ASM / tablespace, but it is
> > still a file system).
> 
> Why isn't the filesystem (or RDBMS) throwing checksum errors?  This was 
> standard stuff in legacy Enterprise RDBMSs 20 years ago.

Checksums are nice for telling you that your storage is screwed.
They don't fix the problem.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Re: Real application clustering in postgres.

From

Laurenz Albe

Date:

09 March 2020, 08:57:37

On Sun, 2020-03-08 at 21:13 +0100, Peter J. Holzer wrote:
> But to be fair, a master/slave setup a la patroni isn't immune against
> "writing junk" either: Not on the hardware level (either of the nodes
> may have faulty hardware, and you may not notice it until too late), and
> more importantly, not on the software level. An erroneus DML statement
> (because of a bug in the application, or because the user/admin made a
> mistake) will cause the same wrong data to be distributed to all nodes
> (of course this also applies to RAC).

Of course, nobody debates that.

A high-availability solution only protects you from certain, well-defined
kinds of problems, usually related to hardware.

There is no way to protect yourself from software bugs or user errors.

If there is a hardware problem that causes one of the databases in the
Patroni cluster to become corrupted, the others are not immediately
affected.  That's the point of a shared-nothing architecture.

Of course, if the corrupted database is the primary, corruption can
eventually spread to the others.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Re: Real application clustering in postgres.

From

Andreas Kretschmer

Date:

09 March 2020, 14:30:07

Am 05.03.20 um 13:07 schrieb Laurenz Albe:
> There is a closed-source implementation that you can buy:
> https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/
>
> But multi-master replication is complicated to get right, and
> an applicatoin that uses it has to be specifically designed for that.

Depends. We have an automation tool for setup (TPAexec), and as long you 
are not using special feature like CAMO (Commit At Most Once) you can 
use applications without changes.
(you have to set some parameters, for instance for sequences)

It works well, we have a lot of happy customers. Some of them using it 
for globally distributed databases, other for high availability with 5 
nines.
Here you can find more information:
https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/


Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

Re: Real application clustering in postgres.

From

"Peter J. Holzer"

Date:

10 March 2020, 23:21:17

On 2020-03-09 09:57:37 +0100, Laurenz Albe wrote:
> On Sun, 2020-03-08 at 21:13 +0100, Peter J. Holzer wrote:
> > But to be fair, a master/slave setup a la patroni isn't immune against
> > "writing junk" either: Not on the hardware level (either of the nodes
> > may have faulty hardware, and you may not notice it until too late), and
> > more importantly, not on the software level. An erroneus DML statement
> > (because of a bug in the application, or because the user/admin made a
> > mistake) will cause the same wrong data to be distributed to all nodes
> > (of course this also applies to RAC).
>
> Of course, nobody debates that.
>
> A high-availability solution only protects you from certain, well-defined
> kinds of problems, usually related to hardware.

Right. And enterprise class SAN storage does this: It protects you from
failure of a single disk, a single cable, a single controller. Very
often you can physically spread out the components so that loss of a
whole rack (or server room) wouldn't affect availability. There are of
course limits: When a message sent over a single cable is corrupted in a
way that the checksum doesn't catch, corrupted data may be stored. But
then if data in RAM is corrupted that ECC doesn't catch, the same will
happen. A Patroni-based cluster isn't free of single points of failure
either.

So I don't buy the argument "X isn't a high availability solution
because it uses shared storage". While I have seen expensive SAN boxes
fail, I've also managed to wreck Patroni clusters. I'm not at all
convinced that the availability of a Patroni cluster is higher than that
of a failover cluster using shared storage.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

signature.asc