Thread: Peer to peer replication of Postgresql databases

Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
Hi all,

I am trying to add some replication features to postgres (yes, I have
already looked at ongoing work), in a peer to peer manner.  The goal
is to achive `nearly complete fault tolerence' by replicating data.

The basic framework I have in mind is somewhat like this.

- Postmasters are running on different computers on a networked cluster.
  Their data areas are identical at the beginning and recide on local
  storage devices.

- Each postmaster is aware that they are a part of a cluster and they
  can communicate with each other, send multicast requests and look for
  each other's presence (like heartbeat in linux-ha project).

- When a frontend process sends a read query, each backend process
  does that from its own data area.

- There are two types of write queries.  Postmasters use seperate
  communication channels for each.  One is the sequencial channel which
  carries writes whose order is important, and the non-sequencial
  channel carries write queries whose order is not important.

- When a frontend process sends non-sequencial write query to a backend,
  it is directly written to the local data area and a multicast is
  sent (preferably asynchronously) to the other postmasters who will
  also update their respective local areas.

  May be we can simply duplicate what goes to WAL into a TCP/IP socket
  (with some header info, of course).

- When a sequencial-write query is requested, the corresponding
  postmaster informs a main-postmaster (more about in the next point),
  waits for his acknowledgement, and proceeds the same way as the
  non-sequencial write.

- Each postmaster is assigned a priority.  The one with the highest
  priority is doing some bookkeeping to handle concurrency issues etc.
  If he goes away, another one takes charge.

  Or maybe we can completely ignore the main-postmaster concept and
  let the clients broadcast a request to obtain locks etc.

- When a new postmaster, hence a computer, joins the cluster, he
  will replicate the current database from one of the clients.

Suggessions and critisisms are welcome.

    Anuradha

--

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

The best audience is intelligent, well-educated and a little drunk.
        -- Maurice Baring


Re: Peer to peer replication of Postgresql databases

From
"Shridhar Daithankar"
Date:
On 11 Oct 2002 at 16:16, Anuradha Ratnaweera wrote:

> 
> Hi all,
> 
> I am trying to add some replication features to postgres (yes, I have
> already looked at ongoing work), in a peer to peer manner.  The goal
> is to achive `nearly complete fault tolerence' by replicating data.

Sounds a lot like usogres. You got it running. (I never had a chance.) I would 
like to hear how it compares against it.

Can anybody comment how maintained usogres is. It covers an important area of 
replication but I am not sure how maintained that is. If it is not, I suggest 
we pick it up and finish it.

HTH

ByeShridhar

--
You go slow, be gentle.  It's no one-way street -- you know how youfeel and 
that's all.  It's how the girl feels too.  Don't press.  Ifthe girl feels 
anything for you at all, you'll know.        -- Kirk, "Charlie X", stardate 1535.8



Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 03:54:15PM +0530, Shridhar Daithankar wrote:
>
> On 11 Oct 2002 at 16:16, Anuradha Ratnaweera wrote:
>
> > I am trying to add some replication features to postgres (yes, I have
> > already looked at ongoing work), in a peer to peer manner.  The goal
> > is to achive `nearly complete fault tolerence' by replicating data.
>
> Sounds a lot like usogres. You got it running. (I never had a chance.) I would
> like to hear how it compares against it.
>
> Can anybody comment how maintained usogres is. It covers an important area of
> replication but I am not sure how maintained that is. If it is not, I suggest
> we pick it up and finish it.

I will look at it, too.  Thanks for the link.  In some cases, starting
anew is faster than learning unmaintained existing code.

My original mail would have been much shorter if it simply stated that I
want to add `application level RAID-0' to postgres ;)

    Anuradha

--

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

I think the world is run by C students.
        -- Al McGuire


Re: Peer to peer replication of Postgresql databases

From
"Shridhar Daithankar"
Date:
On 11 Oct 2002 at 16:29, Anuradha Ratnaweera wrote:

> On Fri, Oct 11, 2002 at 03:54:15PM +0530, Shridhar Daithankar wrote:
> I will look at it, too.  Thanks for the link.  In some cases, starting
> anew is faster than learning unmaintained existing code.

While that's true, usogres code is just few files. I wouldn't take more than 
half an hour to read up the things. And besides it contain postgresql protocol 
implementation necessary which would take some time to test and debug,

And it's in C++. I like that..;-)


> My original mail would have been much shorter if it simply stated that I
> want to add `application level RAID-0' to postgres ;)

:-)

ByeShridhar

--
QOTD:    "Do you smell something burning or is it me?"        -- Joan of Arc



Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 04:04:29PM +0530, Shridhar Daithankar wrote:
> On 11 Oct 2002 at 16:29, Anuradha Ratnaweera wrote:
>
> > On Fri, Oct 11, 2002 at 03:54:15PM +0530, Shridhar Daithankar wrote:
> > I will look at it, too.  Thanks for the link.  In some cases, starting
> > anew is faster than learning unmaintained existing code.

Ok.  Checked out what usogres is.  It is not what I want.  I don't want
a static `main database'.  It should simply a cluster of them - just like
a set of Raid-0 disks, may be with a tempory controller for some tasks.

Also, as a matter of fact, usogres is not unmaintained code.

> While that's true, usogres code is just few files. I wouldn't take more than
> half an hour to read up the things. And besides it contain postgresql protocol
> implementation necessary which would take some time to test and debug,

Great.  I will look into this over the weekend.

> And it's in C++. I like that..;-)

And I DON'T like that ;)

    Anuradha

--

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

QOTD:
    "I ain't broke, but I'm badly bent."


Re: Peer to peer replication of Postgresql databases

From
"Shridhar Daithankar"
Date:
On 11 Oct 2002 at 16:39, Anuradha Ratnaweera wrote:

> On Fri, Oct 11, 2002 at 04:04:29PM +0530, Shridhar Daithankar wrote:
> > On 11 Oct 2002 at 16:29, Anuradha Ratnaweera wrote:
> > 
> > > On Fri, Oct 11, 2002 at 03:54:15PM +0530, Shridhar Daithankar wrote:
> > > I will look at it, too.  Thanks for the link.  In some cases, starting
> > > anew is faster than learning unmaintained existing code.
> 
> Ok.  Checked out what usogres is.  It is not what I want.  I don't want
> a static `main database'.  It should simply a cluster of them - just like
> a set of Raid-0 disks, may be with a tempory controller for some tasks.

Well, I don't think adding support for multiple slaves to usogres would be that 
problematic. Of course if you want to load balance your application queries, 
application has to be aware of that. I will not do sending requests to a mosix 
cluster anyway.

> Also, as a matter of fact, usogres is not unmaintained code.

Glad to know that. I wrote to author with some suggestion and never got a 
reply. Didn't bother joining mailing list though..


Regards,Shridhar

-----------------------------------------------------------
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar@persistent.co.in
Phone:- +91-20-5678900 Extn.270
Fax  :- +91-20-5678901 
-----------------------------------------------------------



Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote:
> 
> Well, I don't think adding support for multiple slaves to usogres would be that 
> problematic. Of course if you want to load balance your application queries, 
> application has to be aware of that. I will not do sending requests to a mosix 
> cluster anyway.

Have already tested postgres on a mosix cluster, and as expected results
are not good.  (although mosix does the correct thing in keeping all the
database backend processes on one node).
Anuradha

-- 

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

Remember: Silly is a state of Mind, Stupid is a way of Life.    -- Dave Butler



Re: Peer to peer replication of Postgresql databases

From
Greg Copeland
Date:
I'd be curious to hear in a little more detail what constitutes "not
good" for postgres on a mosix cluster.

Greg


On Fri, 2002-10-11 at 06:15, Anuradha Ratnaweera wrote:
> On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote:
> >
> > Well, I don't think adding support for multiple slaves to usogres would be that
> > problematic. Of course if you want to load balance your application queries,
> > application has to be aware of that. I will not do sending requests to a mosix
> > cluster anyway.
>
> Have already tested postgres on a mosix cluster, and as expected results
> are not good.  (although mosix does the correct thing in keeping all the
> database backend processes on one node).
>
>     Anuradha
>
> --
>
> Debian GNU/Linux (kernel 2.4.18-xfs-1.1)
>
> Remember: Silly is a state of Mind, Stupid is a way of Life.
>         -- Dave Butler
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


Re: Peer to peer replication of Postgresql databases

From
"Shridhar Daithankar"
Date:
On 11 Oct 2002 at 8:30, Greg Copeland wrote:

> I'd be curious to hear in a little more detail what constitutes "not
> good" for postgres on a mosix cluster.
> On Fri, 2002-10-11 at 06:15, Anuradha Ratnaweera wrote:
> > On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote:
> > Have already tested postgres on a mosix cluster, and as expected results
> > are not good.  (although mosix does the correct thing in keeping all the
> > database backend processes on one node).

Well, I guess in kind of replication we are talking here, the performance will 
be enhanced only if separate instances of psotgresql runs on separate machine. 
Now if mosix kernel applies some AI and puts all of them on same machine, it 
isn't going to be any good for the purpose replication is deployed.

I guess that's what she meant..

ByeShridhar

--
User n.:    A programmer who will believe anything you tell him.



Re: Peer to peer replication of Postgresql databases

From
Greg Copeland
Date:
Well, not scalable doesn't have to mean "not good".  That's why I
asked.  Considering this is one of the problems with mosix clusters
(process migration and associated restrictions) and the nature of
PostgreSQL's implementation I'm not sure what other result may of been
expected.  Because of that, I wasn't sure if something else was being
implied.

Greg



On Fri, 2002-10-11 at 08:40, Shridhar Daithankar wrote:
> On 11 Oct 2002 at 8:30, Greg Copeland wrote:
>
> > I'd be curious to hear in a little more detail what constitutes "not
> > good" for postgres on a mosix cluster.
> > On Fri, 2002-10-11 at 06:15, Anuradha Ratnaweera wrote:
> > > On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote:
> > > Have already tested postgres on a mosix cluster, and as expected results
> > > are not good.  (although mosix does the correct thing in keeping all the
> > > database backend processes on one node).
>
> Well, I guess in kind of replication we are talking here, the performance will
> be enhanced only if separate instances of psotgresql runs on separate machine.
> Now if mosix kernel applies some AI and puts all of them on same machine, it
> isn't going to be any good for the purpose replication is deployed.
>
> I guess that's what she meant..
>
> Bye
>  Shridhar
>
> --
> User n.:    A programmer who will believe anything you tell him.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org


Re: Peer to peer replication of Postgresql databases

From
Neil Conway
Date:
[ pgsql-patches removed from Cc: list ]

Anuradha Ratnaweera <anuradha@lklug.pdn.ac.lk> writes:
> I am trying to add some replication features to postgres (yes, I have
> already looked at ongoing work), in a peer to peer manner.

Did you look at the research behind Postgres-R, and the pgreplication
stuff?

> - When a frontend process sends a read query, each backend process
>   does that from its own data area.

Surely that's not correct -- a SELECT can be handled by *any one*
node, not each and every one, right?

> - There are two types of write queries.  Postmasters use seperate
>   communication channels for each.  One is the sequencial channel which
>   carries writes whose order is important, and the non-sequencial
>   channel carries write queries whose order is not important.

How do you distinguish between these?

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC



Re: Peer to peer replication of Postgresql databases

From
Date:
For give me for responding to the beginning of this thread, but my 
comments only apply to this post.

> already looked at ongoing work), in a peer to peer manner.  The goal
> is to achive `nearly complete fault tolerence' by replicating data.

A worthy goal indeed!

> - Postmasters are running on different computers on a networked cluster.
>   Their data areas are identical at the beginning and recide on local
>   storage devices.
> 
> - Each postmaster is aware that they are a part of a cluster and they
>   can communicate with each other, send multicast requests and look for
>   each other's presence (like heartbeat in linux-ha project).

These first two point on extending postmaster for a network cluster
and HA could be a bit tricky.  Have you considered using a group
communication system like spread.org that already has the network
cluster and heartbeat built in?


> - There are two types of write queries.  Postmasters use seperate
>   communication channels for each.  One is the sequencial channel which
>   carries writes whose order is important, and the non-sequencial
>   channel carries write queries whose order is not important.

This puts the burden of determining weather a conflict can happen on
the application or user.  Application design could become a bit tricky.
If you plan to use the non-sequential channel in an application, you
would need to make sure there are never any possible conflicts.

> 
> - When a frontend process sends non-sequencial write query to a backend,
>   it is directly written to the local data area and a multicast is
>   sent (preferably asynchronously) to the other postmasters who will
>   also update their respective local areas.

What are you planning to send? (SQL, parsed statements, or tuples)


> - When a sequencial-write query is requested, the corresponding
>   postmaster informs a main-postmaster (more about in the next point),
>   waits for his acknowledgement, and proceeds the same way as the
>   non-sequencial write.

This would make the main postmaster handle all the concurrency control for
the replicated system.  Are you thinking a two phased commit protocol here?

> 
>   Or maybe we can completely ignore the main-postmaster concept and
>   let the clients broadcast a request to obtain locks etc.

If each system can obtain locks, how will you handle deadlocks across 
system boundaries?


> 
> Suggessions and critisisms are welcome.
> 

Have you taken a look at Postgres-R or the pg-replicaiton project.  The goals 
are the same as yours, and the approach is some what similar.   There is
a mailing to discuss different approaches, and if you like what we are doing
you can certainly participate in the development.

http://gborg.postgresql.org/project/pgreplication/projdisplay.php

Regards

Darren




Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 08:30:55AM -0500, Greg Copeland wrote:
>
> I'd be curious to hear in a little more detail what constitutes "not
> good" for postgres on a mosix cluster.

It seems that almost all the postgres processes remain in the `home'
node.

Please notice that I am not underestimating Mosix in any way.  We have
tested many programs from our parallel processing project with extreme
success on our mosix cluster.
Anuradha

-- 

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

Ginger snap.



Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 07:10:26PM +0530, Shridhar Daithankar wrote:
> On 11 Oct 2002 at 8:30, Greg Copeland wrote:
> 
> > I'd be curious to hear in a little more detail what constitutes "not
> > good" for postgres on a mosix cluster.
> 
> Well, I guess in kind of replication we are talking here, the
> performance will be enhanced only if separate instances of psotgresql
> runs on separate machine.  Now if mosix kernel applies some AI and
> puts all of them on same machine, it isn't going to be any good for
> the purpose replication is deployed.

Exactly.  First, since we know what is going on, it is not necessary for
the OS to decide what's going on.  Secondly, database replication is not
looked after at all, unless we do some crude tricks on the filesystem.
Still it won't be efficient.

> I guess that's what she meant..                     ^^^
Correction: "that's what _HE_ meant..." ;)
Anuradha

-- 

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

All other things being equal, a bald man cannot be elected President of
the United States.    -- Vic Gold



Re: Peer to peer replication of Postgresql databases

From
"Shridhar Daithankar"
Date:
On 14 Oct 2002 at 11:55, Anuradha Ratnaweera wrote:

> On Fri, Oct 11, 2002 at 07:10:26PM +0530, Shridhar Daithankar wrote:
> > On 11 Oct 2002 at 8:30, Greg Copeland wrote:
> > 
> > > I'd be curious to hear in a little more detail what constitutes "not
> > > good" for postgres on a mosix cluster.
> > 
> > Well, I guess in kind of replication we are talking here, the
> > performance will be enhanced only if separate instances of psotgresql
> > runs on separate machine.  Now if mosix kernel applies some AI and
> > puts all of them on same machine, it isn't going to be any good for
> > the purpose replication is deployed.
> 
> Exactly.  First, since we know what is going on, it is not necessary for
> the OS to decide what's going on.  Secondly, database replication is not
> looked after at all, unless we do some crude tricks on the filesystem.
> Still it won't be efficient.

IMO any one layer of clustering should be enough. If you use mosix, you 
shouldn't need clustering in postgresql. If postgresql clustering is applied 
any heterogenous machines like freebsd/linux should do. (OK same architecture 
at least. No suns and PCs..)

Let's keep aside mosix for the time being. Application level clustering is what 
postgresql needs.

What next? which one should we work on? Postgres-R/Usogres/ER-server?

> 
> > I guess that's what she meant..
>                       ^^^
> Correction: "that's what _HE_ meant..." ;)

Argh... Extremely sorry, in India, special nouns ending with 'a' are usually 
feminine.. Like Radha..
Sorry again..:-)

ByeShridhar

--
Weiner's Law of Libraries:    There are no answers, only cross references.



Re: Peer to peer replication of Postgresql databases

From
Anuradha Ratnaweera
Date:
On Fri, Oct 11, 2002 at 12:07:00PM -0400, Neil Conway wrote:
> [ pgsql-patches removed from Cc: list ]
> 
> Anuradha Ratnaweera <anuradha@lklug.pdn.ac.lk> writes:
> > I am trying to add some replication features to postgres (yes, I
> > have already looked at ongoing work), in a peer to peer manner.
> 
> Did you look at the research behind Postgres-R, and the pgreplication
> stuff?

Am looking at the research papers related to it now.

> > - When a frontend process sends a read query, each backend process
> >   does that from its own data area.
> 
> Surely that's not correct -- a SELECT can be handled by *any one*
> node, not each and every one, right?

Yes.  Sorry about my careless wording.  Unless anything is kind of
locked, each node has a copy of the database, so each one can handle
SELECTs individually.

The actual situation will be far from this simple, because there will be
database writes going on and generating consistent SELECTs would need
careful handling of concurency issues.

> > - There are two types of write queries.  Postmasters use seperate
> >   communication channels for each.  One is the sequencial channel which
> >   carries writes whose order is important, and the non-sequencial
> >   channel carries write queries whose order is not important.
> 
> How do you distinguish between these?

Nope.  We assume that all the communication should go through the
sequencial channel unless indicated by the client.  In that case, we
will have to find a way to indicate this from the client's side.  This
doesn't sound very elegant, may be we can figure out a better way.
Anuradha

-- 

Debian GNU/Linux (kernel 2.4.18-xfs-1.1)

"Being against torture ought to be sort of a bipartisan thing."
-- Karl Lehenbauer