Re: Core team statement on replication in PostgreSQL - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Core team statement on replication in PostgreSQL
Date
Msg-id 1212354879.18365.30.camel@huvostro
In response to Re: Core team statement on replication in PostgreSQL  (Robert Hodges <robert.hodges@continuent.com>)
Responses Re: Core team statement on replication in PostgreSQL
List pgsql-hackers
On Thu, 2008-05-29 at 12:05 -0700, Robert Hodges wrote:
> Hi everyone, 
> 
> First of all, I’m absolutely delighted that the PG community is
> thinking seriously about replication.  
> 
> Second, having a solid, easy-to-use database availability solution
> that works more or less out of the box would be an enormous benefit to
> customers.  Availability is the single biggest problem for customers
> in my experience and as other people have commented the alternatives
> are not nice.  It’s an excellent idea to build off an existing feature
> —PITR is already pretty useful and the proposed features are solid
> next steps.  The fact that it does not solve all problems is not a
> drawback but means it’s likely to get done in a reasonable timeframe. 
> 
> Third, you can’t stop with just this feature.  (This is the BUT part
> of the post.)  The use cases not covered by this feature are actually
> pretty large.  Here are a few that concern me: 
> 
> 1.) Partial replication. 
> 2.) WAN replication. 

1.) & 2.) are better done async, the domain of Slony-I/Londiste.

> 3.) Bi-directional replication.  (Yes, this is evil but there are
> problems where it is indispensable.) 

Sure, it is also a lot harder and always has several dimensions
(performance/availability/locking) that play against each other.

> 4.) Upgrade support.  Aside from database upgrade (how would this ever
> really work between versions?), it would not support zero-downtime app
> upgrades, which depend on bi-directional replication tricks. 

Or you could use zero-downtime app upgrades, which don't depend on
this :P

> 5.) Heterogeneous replication. 
> 6.) Finally, performance scaling using scale-out over large numbers of
> replicas.  I think it’s possible to get tunnel vision on this—it’s not
> a big requirement in the PG community because people don’t use PG in
> the first place when they want to do this.  They use MySQL, which has
> very good replication for performance scaling, though it’s rather weak
> for availability.  

Again, scale-out over a large number of replicas should either be
async or, for sync, use some broadcast channel to all slaves (and it will
still be a performance problem on the master, as it has to wait for the
slowest slave).
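
To put rough numbers on the "wait for the slowest slave" point, here is a
toy model. The latencies and function names are made up for illustration
and are not measurements of any real setup:

```python
# Toy model of commit latency as seen by the master.
# All numbers are invented; only the shape of the result matters.

def sync_commit_latency(local_ms, slave_ack_ms):
    """Sync: the master waits for every slave to acknowledge,
    so the slowest slave determines the commit time."""
    return local_ms + max(slave_ack_ms, default=0.0)


def async_commit_latency(local_ms, slave_ack_ms):
    """Async: the master commits locally and ships changes later,
    so slave latency does not show up in the commit time."""
    return local_ms


# Hypothetical acknowledgement times; the last slave sits across a WAN.
slave_acks = [2.0, 3.5, 40.0]

print("sync :", sync_commit_latency(1.0, slave_acks), "ms")
print("async:", async_commit_latency(1.0, slave_acks), "ms")
```

A broadcast channel reduces the cost of sending to many slaves, but in this
model the max() term stays, which is why WAN and large scale-out setups
lean towards async.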

> As a consequence, I don’t see how you can get around doing some sort
> of row-based replication like all the other databases. 

Isn't WAL-based replication "some sort of row-based replication"?

>  Now that people are starting to get religion on this issue I would
> strongly advocate a parallel effort to put in a change-set extraction
> API that would allow construction of comprehensive master/slave
> replication. 

Triggers. See pgQ's logtriga()/logutriga(), and the slides from Marko
Kreen's presentation at PGCon 2008.

>  (Another approach would be to make it possible for third party apps
> to read the logs and regenerate SQL.) 

Which logs? WAL or SQL command logs?

> There are existing models for how to do change set extraction; we have
> done it several times at my company already.  There are also research
> projects like GORDA that have looked fairly comprehensively at this
> problem.

pgQ with its triggers does a pretty good job of change-set extraction.
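
A minimal sketch of the general idea behind trigger-based change-set
extraction follows. It is not pgQ's implementation: it uses SQLite through
Python's sqlite3 module only to stay self-contained, and the table, trigger
and column names (accounts, change_queue, ev_id, ...) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

-- Queue table: one row per captured change, ordered by ev_id.
CREATE TABLE change_queue (
    ev_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    ev_type TEXT NOT NULL,      -- 'I', 'U' or 'D'
    row_id  INTEGER NOT NULL,
    balance INTEGER             -- new value; NULL for deletes
);

-- Row-level triggers copy every change into the queue.
CREATE TRIGGER accounts_ins AFTER INSERT ON accounts BEGIN
    INSERT INTO change_queue (ev_type, row_id, balance)
    VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER accounts_upd AFTER UPDATE ON accounts BEGIN
    INSERT INTO change_queue (ev_type, row_id, balance)
    VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER accounts_del AFTER DELETE ON accounts BEGIN
    INSERT INTO change_queue (ev_type, row_id, balance)
    VALUES ('D', OLD.id, NULL);
END;
""")


def read_changes(conn, after_ev_id=0):
    """Return queued changes newer than after_ev_id, in capture order.
    A consumer remembers the last ev_id it applied and resumes from there."""
    return conn.execute(
        "SELECT ev_id, ev_type, row_id, balance FROM change_queue "
        "WHERE ev_id > ? ORDER BY ev_id",
        (after_ev_id,),
    ).fetchall()


# Ordinary application writes; the triggers capture them as a side effect.
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM accounts WHERE id = 1")
conn.commit()

changes = read_changes(conn)
for change in changes:
    print(change)
```

The application does not need to know about replication at all; a separate
consumer reads the queue in order and applies or forwards the changes.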

------------------
Hannu



