Thread: Re: pgsql Replication Proxy (was Re: Replication for a

Re: pgsql Replication Proxy (was Re: Replication for a

From
"Diehl, Jeffrey"
Date:
I love this idea.  The proxy could return immediately instead of making my
program block on update.

One note, though.  Instead of a stack, you need a FIFO.  For example:

delete from sometable where field=value;
insert into sometable (field) values (value1);
insert into sometable (field) values (value2);
....


This code breaks in a stack and only works in a fifo.  Minor point, though.

So do we have a volunteer to write such a tool?  <grin>

Mike Diehl.

-----Original Message-----
From: Michael A Nachbaur [mailto:mike@nachbaur.com]
Sent: Monday, May 05, 2003 1:57 PM
To: pgsql-sql@postgresql.org
Subject: pgsql Replication Proxy (was Re: [SQL] Replication for a large
database)


I've thought some more about this, and I want to pass this idea past you
guys.  
What do you think about a replication proxy, essentially a daemon that sits 
between a PostgreSQL client and server.  Every single SQL query, transaction

statement, etc that the proxy recieves it repeats back to all the database 
servers.  In this way, if a back-end database server goes down queries will 
continue unabated (except the downed server won't recieve updates).

Basically, the proxy server could intercept these queries and place them in
a 
stack (on a per-database basis) and when every server in the queue 
acknowledges the update, the query is removed from the stack.  Each database

server can have their own position in the stack, so if servers A and B 
successfully run a query, but C doesn't (e.g. it requires human 
intervention), C is removed from the list of acceptable servers but A and B 
can keep moving through the queue.

What do you think?  Also, should this discussion be moved to another mailing

list?

On Monday 05 May 2003 12:26 pm, Michael A Nachbaur wrote:
> I have thought about this.  The problem I come into is data consistancy.
I
> have at least 8 different processes that harvest data, and an intranet
> website that can also manipulate the database (to assign customers to
> different packages, re-assign modems to different customers, etc).  Trying
> to maintain consistancy across the entire application would be such a
> nightmare, I don't want to think about it.
>
> If I go with a centralized middleware server that manages all database
> access, then I could perhaps do that in there...and then I could use
> transactions on both databases, and if either transaction fails then I'll
> roll back the other.  But this would make my entire framework very rigid.


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org



Re: pgsql Replication Proxy (was Re: Replication for a

From
Michael A Nachbaur
Date:
LOL!  Depending on how much FUD I can throw at the guys higher up in the food 
chain at my office, I might be able to get some budget space to develop 
something like this.  There are some significant technical hurdles I have to 
overcome, but I think it's doable.  The analogy I came up with is SCSI RAID 
for databases.  You can rip a database server out, and the overall system 
will still function...toss it back in, and updates will still happen.  I 
would also like to be able to throw a fresh database in place and have it 
mirror the existing database servers in the background so you don't have to 
go through the complicated procedure of dumping/restoring the database 
servers by hand.

Re: FIFO, yeah, I realized that after I sent the message.

Does anyone have any ideas for me on this?  I think it might make sense to use 
PostgreSQL as the storage mechanism for the proxy server, but that sort of 
defeats the purpose of having a replication system.  Maybe spread can be used 
to distribute the messages to different servers, but I'm not too familiar 
with it.

Also, one final note, I'm a Perl programmer, so anything I build will be 
written in that.  If anyone has objections, let me know and maybe we could 
work together on something.

On Tuesday 06 May 2003 09:28 am, Diehl, Jeffrey wrote:
> I love this idea.  The proxy could return immediately instead of making my
> program block on update.
>
> One note, though.  Instead of a stack, you need a FIFO.  For example:
>
> delete from sometable where field=value;
> insert into sometable (field) values (value1);
> insert into sometable (field) values (value2);
> ....
>
>
> This code breaks in a stack and only works in a fifo.  Minor point, though.
>
> So do we have a volunteer to write such a tool?  <grin>
>
> Mike Diehl.
>
> -----Original Message-----
> From: Michael A Nachbaur [mailto:mike@nachbaur.com]
> Sent: Monday, May 05, 2003 1:57 PM
> To: pgsql-sql@postgresql.org
> Subject: pgsql Replication Proxy (was Re: [SQL] Replication for a large
> database)
>
>
> I've thought some more about this, and I want to pass this idea past you
> guys.
> What do you think about a replication proxy, essentially a daemon that sits
> between a PostgreSQL client and server.  Every single SQL query,
> transaction
>
> statement, etc that the proxy recieves it repeats back to all the database
> servers.  In this way, if a back-end database server goes down queries will
> continue unabated (except the downed server won't recieve updates).
>
> Basically, the proxy server could intercept these queries and place them in
> a
> stack (on a per-database basis) and when every server in the queue
> acknowledges the update, the query is removed from the stack.  Each
> database
>
> server can have their own position in the stack, so if servers A and B
> successfully run a query, but C doesn't (e.g. it requires human
> intervention), C is removed from the list of acceptable servers but A and B
> can keep moving through the queue.
>
> What do you think?  Also, should this discussion be moved to another
> mailing
>
> list?
>
> On Monday 05 May 2003 12:26 pm, Michael A Nachbaur wrote:
> > I have thought about this.  The problem I come into is data consistancy.
>
> I
>
> > have at least 8 different processes that harvest data, and an intranet
> > website that can also manipulate the database (to assign customers to
> > different packages, re-assign modems to different customers, etc). 
> > Trying to maintain consistancy across the entire application would be
> > such a nightmare, I don't want to think about it.
> >
> > If I go with a centralized middleware server that manages all database
> > access, then I could perhaps do that in there...and then I could use
> > transactions on both databases, and if either transaction fails then I'll
> > roll back the other.  But this would make my entire framework very rigid.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly



Re: pgsql Replication Proxy (was Re: Replication for a

From
"Mendola Gaetano"
Date:
Postgres don't have the support ( yet ) for the 2 phase commit, so
I think that is impossible do it now.
What happen if the last server do an error in commit phase ?

Regards
Gaetano

----- Original Message ----- 
From: "Diehl, Jeffrey" <jdiehl@sandia.gov>
To: "'Michael A Nachbaur'" <mike@nachbaur.com>; <pgsql-sql@postgresql.org>
Sent: Tuesday, May 06, 2003 6:28 PM
Subject: Re: pgsql Replication Proxy (was Re: [SQL] Replication for a


> I love this idea.  The proxy could return immediately instead of making my
> program block on update.
>
> One note, though.  Instead of a stack, you need a FIFO.  For example:
>
> delete from sometable where field=value;
> insert into sometable (field) values (value1);
> insert into sometable (field) values (value2);
> ....
>
>
> This code breaks in a stack and only works in a fifo.  Minor point,
though.
>
> So do we have a volunteer to write such a tool?  <grin>
>
> Mike Diehl.
>
> -----Original Message-----
> From: Michael A Nachbaur [mailto:mike@nachbaur.com]
> Sent: Monday, May 05, 2003 1:57 PM
> To: pgsql-sql@postgresql.org
> Subject: pgsql Replication Proxy (was Re: [SQL] Replication for a large
> database)
>
>
> I've thought some more about this, and I want to pass this idea past you
> guys.
> What do you think about a replication proxy, essentially a daemon that
sits
> between a PostgreSQL client and server.  Every single SQL query,
transaction
>
> statement, etc that the proxy recieves it repeats back to all the database
> servers.  In this way, if a back-end database server goes down queries
will
> continue unabated (except the downed server won't recieve updates).
>
> Basically, the proxy server could intercept these queries and place them
in
> a
> stack (on a per-database basis) and when every server in the queue
> acknowledges the update, the query is removed from the stack.  Each
database
>
> server can have their own position in the stack, so if servers A and B
> successfully run a query, but C doesn't (e.g. it requires human
> intervention), C is removed from the list of acceptable servers but A and
B
> can keep moving through the queue.
>
> What do you think?  Also, should this discussion be moved to another
mailing
>
> list?
>
> On Monday 05 May 2003 12:26 pm, Michael A Nachbaur wrote:
> > I have thought about this.  The problem I come into is data consistancy.
> I
> > have at least 8 different processes that harvest data, and an intranet
> > website that can also manipulate the database (to assign customers to
> > different packages, re-assign modems to different customers, etc).
Trying
> > to maintain consistancy across the entire application would be such a
> > nightmare, I don't want to think about it.
> >
> > If I go with a centralized middleware server that manages all database
> > access, then I could perhaps do that in there...and then I could use
> > transactions on both databases, and if either transaction fails then
I'll
> > roll back the other.  But this would make my entire framework very
rigid.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>



Re: pgsql Replication Proxy (was Re: Replication for a

From
Michael A Nachbaur
Date:
Then it is removed from the list of "valid" servers, and it isn't included in 
updates, at least until a) a DBA fixes the issue, or b) the DBA tells the 
database server to re-mirror itself from the other available servers.

Once two-phased commit support is available, this could be changed to be much 
more robust.

Any alternative ideas?

On Tuesday 06 May 2003 10:54 am, Mendola Gaetano wrote:
> Postgres don't have the support ( yet ) for the 2 phase commit, so
> I think that is impossible do it now.
> What happen if the last server do an error in commit phase ?
>
> Regards
> Gaetano
>
> ----- Original Message -----
> From: "Diehl, Jeffrey" <jdiehl@sandia.gov>
> To: "'Michael A Nachbaur'" <mike@nachbaur.com>; <pgsql-sql@postgresql.org>
> Sent: Tuesday, May 06, 2003 6:28 PM
> Subject: Re: pgsql Replication Proxy (was Re: [SQL] Replication for a
>
> > I love this idea.  The proxy could return immediately instead of making
> > my program block on update.
> >
> > One note, though.  Instead of a stack, you need a FIFO.  For example:
> >
> > delete from sometable where field=value;
> > insert into sometable (field) values (value1);
> > insert into sometable (field) values (value2);
> > ....
> >
> >
> > This code breaks in a stack and only works in a fifo.  Minor point,
>
> though.
>
> > So do we have a volunteer to write such a tool?  <grin>
> >
> > Mike Diehl.
> >
> > -----Original Message-----
> > From: Michael A Nachbaur [mailto:mike@nachbaur.com]
> > Sent: Monday, May 05, 2003 1:57 PM
> > To: pgsql-sql@postgresql.org
> > Subject: pgsql Replication Proxy (was Re: [SQL] Replication for a large
> > database)
> >
> >
> > I've thought some more about this, and I want to pass this idea past you
> > guys.
> > What do you think about a replication proxy, essentially a daemon that
>
> sits
>
> > between a PostgreSQL client and server.  Every single SQL query,
>
> transaction
>
> > statement, etc that the proxy recieves it repeats back to all the
> > database servers.  In this way, if a back-end database server goes down
> > queries
>
> will
>
> > continue unabated (except the downed server won't recieve updates).
> >
> > Basically, the proxy server could intercept these queries and place them
>
> in
>
> > a
> > stack (on a per-database basis) and when every server in the queue
> > acknowledges the update, the query is removed from the stack.  Each
>
> database
>
> > server can have their own position in the stack, so if servers A and B
> > successfully run a query, but C doesn't (e.g. it requires human
> > intervention), C is removed from the list of acceptable servers but A and
>
> B
>
> > can keep moving through the queue.
> >
> > What do you think?  Also, should this discussion be moved to another
>
> mailing
>
> > list?
> >
> > On Monday 05 May 2003 12:26 pm, Michael A Nachbaur wrote:
> > > I have thought about this.  The problem I come into is data
> > > consistancy.
> >
> > I
> >
> > > have at least 8 different processes that harvest data, and an intranet
> > > website that can also manipulate the database (to assign customers to
> > > different packages, re-assign modems to different customers, etc).
>
> Trying
>
> > > to maintain consistancy across the entire application would be such a
> > > nightmare, I don't want to think about it.
> > >
> > > If I go with a centralized middleware server that manages all database
> > > access, then I could perhaps do that in there...and then I could use
> > > transactions on both databases, and if either transaction fails then
>
> I'll
>
> > > roll back the other.  But this would make my entire framework very
>
> rigid.
>
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly



Re: pgsql Replication Proxy (was Re: Replication for a

From
Wei Weng
Date:
Michael:

1: It will probably need a messaging system that can properly 
register/unregister a backend database system. a SIP like syntax comes 
into mind. (or a XML message, for the ease of future expansion)

2: What do you mean by "per-database basis"? Database as the concept of 
database schema? or seperate physical database server?

3: How does the proxy server know that a backend server failed in commit?

Wei

Michael A Nachbaur wrote:

>LOL!  Depending on how much FUD I can throw at the guys higher up in the food 
>chain at my office, I might be able to get some budget space to develop 
>something like this.  There are some significant technical hurdles I have to 
>overcome, but I think it's doable.  The analogy I came up with is SCSI RAID 
>for databases.  You can rip a database server out, and the overall system 
>will still function...toss it back in, and updates will still happen.  I 
>would also like to be able to throw a fresh database in place and have it 
>mirror the existing database servers in the background so you don't have to 
>go through the complicated procedure of dumping/restoring the database 
>servers by hand.
>
>Re: FIFO, yeah, I realized that after I sent the message.
>
>Does anyone have any ideas for me on this?  I think it might make sense to use 
>PostgreSQL as the storage mechanism for the proxy server, but that sort of 
>defeats the purpose of having a replication system.  Maybe spread can be used 
>to distribute the messages to different servers, but I'm not too familiar 
>with it.
>
>Also, one final note, I'm a Perl programmer, so anything I build will be 
>written in that.  If anyone has objections, let me know and maybe we could 
>work together on something.
>
>  
>



Re: pgsql Replication Proxy (was Re: Replication for a

From
Michael A Nachbaur
Date:
On Tuesday 06 May 2003 12:10 pm, Wei Weng wrote:
> Michael:
>
> 1: It will probably need a messaging system that can properly
> register/unregister a backend database system. a SIP like syntax comes
> into mind. (or a XML message, for the ease of future expansion)

I'm thinking Spread would be a good idea here.  To tell you the truth, I 
haven't had to deal too much with high-speed guarenteed messaging in a looong 
time, so I'm out of touch with this.

> 2: What do you mean by "per-database basis"? Database as the concept of
> database schema? or seperate physical database server?

Per database schema.  This is primarily to meet my requirements, but it makes 
sense for other scenarios.  Essentially, I have a production database server 
and a development database server.  Both have similar capabilities, and are 
always on (redundant power supplies, UPS, etc).  My development machine will 
have a replicated copy of the production database schema, but I'd also like 
to have a "private" database schema on the same machine that I can do 
development work on, where I can break it without affecting production.

I'd like to eventually do something similar to mod_backhand, where "select" 
statements are sent to a server with the least amount of load, so slower 
machines can still take part in a proxy setup without reducing performance 
(or hammering the poor little box).  One could also give a server a handicap 
so it has more capacity for other uses (e.g. my development server).

> 3: How does the proxy server know that a backend server failed in commit?

I'm thinking of having a queue of statements that the proxy server maintains.  
Each server will have it's own pointer that refers to the location in the 
queue they are at.  Each statement is then sent to the database server 
sequentially, and if a statement fails on a given server, its removed from 
the list list of valid servers, and it's index remains at the statement that 
failed.

This is really starting to sound complicated...

> Wei
>
> Michael A Nachbaur wrote:
> >LOL!  Depending on how much FUD I can throw at the guys higher up in the
> > food chain at my office, I might be able to get some budget space to
> > develop something like this.  There are some significant technical
> > hurdles I have to overcome, but I think it's doable.  The analogy I came
> > up with is SCSI RAID for databases.  You can rip a database server out,
> > and the overall system will still function...toss it back in, and updates
> > will still happen.  I would also like to be able to throw a fresh
> > database in place and have it mirror the existing database servers in the
> > background so you don't have to go through the complicated procedure of
> > dumping/restoring the database servers by hand.
> >
> >Re: FIFO, yeah, I realized that after I sent the message.
> >
> >Does anyone have any ideas for me on this?  I think it might make sense to
> > use PostgreSQL as the storage mechanism for the proxy server, but that
> > sort of defeats the purpose of having a replication system.  Maybe spread
> > can be used to distribute the messages to different servers, but I'm not
> > too familiar with it.
> >
> >Also, one final note, I'm a Perl programmer, so anything I build will be
> >written in that.  If anyone has objections, let me know and maybe we could
> >work together on something.



Re: pgsql Replication Proxy (was Re: Replication for a

From
"Mendola Gaetano"
Date:
And if the first commit fail if a key is duplicated, or for another
trigger issue? Should be possible understand why the commit
fails and consider this a normal event too, I  think anyway that
is hard.
Also the Remirror ( the point B) is hard to do without
"Point in time recovery" how do a remirror of a live DB without
have an exclusive lock on it ?

Regards
Gaetano

On: Tuesday, May 06, 2003 8:16 PM "Michael A Nachbaur" <mike@nachbaur.com>
wrote:
> Then it is removed from the list of "valid" servers, and it isn't included
in
> updates, at least until a) a DBA fixes the issue, or b) the DBA tells the
> database server to re-mirror itself from the other available servers.
>
> Once two-phased commit support is available, this could be changed to be
much
> more robust.
>
> Any alternative ideas?
>
> On Tuesday 06 May 2003 10:54 am, Mendola Gaetano wrote:
> > Postgres don't have the support ( yet ) for the 2 phase commit, so
> > I think that is impossible do it now.
> > What happen if the last server do an error in commit phase ?
> >
> > Regards
> > Gaetano
> >
> > ----- Original Message -----
> > From: "Diehl, Jeffrey" <jdiehl@sandia.gov>
> > To: "'Michael A Nachbaur'" <mike@nachbaur.com>;
<pgsql-sql@postgresql.org>
> > Sent: Tuesday, May 06, 2003 6:28 PM
> > Subject: Re: pgsql Replication Proxy (was Re: [SQL] Replication for a
> >
> > > I love this idea.  The proxy could return immediately instead of
making
> > > my program block on update.
> > >
> > > One note, though.  Instead of a stack, you need a FIFO.  For example:
> > >
> > > delete from sometable where field=value;
> > > insert into sometable (field) values (value1);
> > > insert into sometable (field) values (value2);
> > > ....
> > >
> > >
> > > This code breaks in a stack and only works in a fifo.  Minor point,
> >
> > though.
> >
> > > So do we have a volunteer to write such a tool?  <grin>
> > >
> > > Mike Diehl.
> > >
> > > -----Original Message-----
> > > From: Michael A Nachbaur [mailto:mike@nachbaur.com]
> > > Sent: Monday, May 05, 2003 1:57 PM
> > > To: pgsql-sql@postgresql.org
> > > Subject: pgsql Replication Proxy (was Re: [SQL] Replication for a
large
> > > database)
> > >
> > >
> > > I've thought some more about this, and I want to pass this idea past
you
> > > guys.
> > > What do you think about a replication proxy, essentially a daemon that
> >
> > sits
> >
> > > between a PostgreSQL client and server.  Every single SQL query,
> >
> > transaction
> >
> > > statement, etc that the proxy recieves it repeats back to all the
> > > database servers.  In this way, if a back-end database server goes
down
> > > queries
> >
> > will
> >
> > > continue unabated (except the downed server won't recieve updates).
> > >
> > > Basically, the proxy server could intercept these queries and place
them
> >
> > in
> >
> > > a
> > > stack (on a per-database basis) and when every server in the queue
> > > acknowledges the update, the query is removed from the stack.  Each
> >
> > database
> >
> > > server can have their own position in the stack, so if servers A and B
> > > successfully run a query, but C doesn't (e.g. it requires human
> > > intervention), C is removed from the list of acceptable servers but A
and
> >
> > B
> >
> > > can keep moving through the queue.
> > >
> > > What do you think?  Also, should this discussion be moved to another
> >
> > mailing
> >
> > > list?
> > >
> > > On Monday 05 May 2003 12:26 pm, Michael A Nachbaur wrote:
> > > > I have thought about this.  The problem I come into is data
> > > > consistancy.
> > >
> > > I
> > >
> > > > have at least 8 different processes that harvest data, and an
intranet
> > > > website that can also manipulate the database (to assign customers
to
> > > > different packages, re-assign modems to different customers, etc).
> >
> > Trying
> >
> > > > to maintain consistancy across the entire application would be such
a
> > > > nightmare, I don't want to think about it.
> > > >
> > > > If I go with a centralized middleware server that manages all
database
> > > > access, then I could perhaps do that in there...and then I could use
> > > > transactions on both databases, and if either transaction fails then
> >
> > I'll
> >
> > > > roll back the other.  But this would make my entire framework very
> >
> > rigid.
> >
> > > ---------------------------(end of
broadcast)---------------------------
> > > TIP 1: subscribe and unsubscribe commands go to
majordomo@postgresql.org
> > >
> > >
> > > ---------------------------(end of
broadcast)---------------------------
> > > TIP 3: if posting/reading through Usenet, please send an appropriate
> > > subscribe-nomail command to majordomo@postgresql.org so that your
> > > message can get through to the mailing list cleanly
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>