Thread: A Replication Idea
I've been thinking about replication and wanted to throw out an idea to see how fast it gets torn apart. I'm sure the problem can't be this easy, but I can't think of why.

Ok... Let's say you have two fresh databases, both empty. You set up a Postgres proxy for them. The proxy works like this:

    It listens on port 5432.
    It pools connections to both real databases.
    It is very simple, just forwarding requests and responses back and
        forth between client and server. A client can connect to the proxy
        and not be able to tell that it is not an actual Postgres database.
    When connections are made to it, it proxies connections to both
        back-end databases.
    If an insert/update/delete/DDL command comes, it forwards it to both
        machines.
    If a query comes down the line, it forwards it to one machine or the
        other (a sketch of this routing rule follows the message).
    If one of the machines goes offline or is not responding, the proxy
        queues up all update transactions intended for it and stops
        forwarding queries to it until it comes back online and all
        queued transactions have been committed.
    A new machine can be inserted into the cluster. When the proxy is
        alerted to this, its first act would be to pg_dumpall one of the
        functional databases and pipe the dump to the new one. From that
        moment, the new machine is considered an unreachable database and
        all update transactions are queued until the dump/rebuild is
        complete.
    If a machine dies in catastrophic failure, it can be removed from the
        cluster and, once the machine is fixed, re-inserted as above.
    If there were some SQL command for determining the load a machine is
        experiencing, the proxy could intelligently balance the load
        across the machines in the cluster that can handle it.
    If the proxy were to fail, clients could safely connect to one of the
        back-end databases in read-only mode until the proxy came back up.
    The proxy would store a log of incomplete transactions in some kind
        of persistent storage for all the databases it's connected to,
        so should it die, it can resume right where it left off, assuming
        the log is intact.

With the proxy set up like this, you could connect to it as though it were a database, upload your current data and schema, and get most of the benefits of clustering.

With this setup you could achieve load balancing, fail-over, master-master replication, master-slave replication, hot-swap servers, dynamic addition and removal of servers, and HA-like clustering. The only thing it does not do is partition data across servers. The only assumption I am aware of making is that two identical databases, given the same set of arbitrary transactions, will end up being the same. The only single point of failure in this system would be the proxy itself. A modification to the Postgres client software could allow automatic fail-over to a read-only connection with one of the back-end databases. Also, the proxy could be run on a router or other diskless system. I haven't really thought about it, but it may even be possible to use current HA technology and run a pool of failover proxies.

If the proxy ended up NOT slowing the performance of a standalone, single-system server, it could become the default connection method to PostgreSQL, such that a person could do an out-of-the-box install of the database, realize a year later that they really wanted a cluster, and hot-add a server without even restarting the database.

So, long story short, I'd like to get people's comments on this. If it won't/can't work or has been tried before, I want to hear about it before I start coding.
I find it hard to believe that a replication/clustering solution could be this easy to implement, but I can't think of why this would not work.

Orion
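(A minimal sketch, in Python, of the routing rule described above. The Backend class, its fields, and the least-loaded pick are all illustrative stand-ins, not a real PostgreSQL API; the rebuild step for a new node is essentially what piping pg_dumpall into psql does.)

    # Illustrative stand-in for one back-end node; not a real driver API.
    class Backend:
        def __init__(self, name):
            self.name = name
            self.online = True
            self.queue = []   # statements to replay when the node rejoins
            self.load = 0     # crude load counter for balancing reads

        def execute(self, statement):
            self.load += 1
            return f"{self.name}: ok"   # stand-in for a real result set

    # Statements that must fan out to every node.
    WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

    def route(statement, backends):
        text = statement.lstrip().upper()
        if text.startswith(WRITE_PREFIXES):
            # Mutating statement: apply on every reachable node,
            # queue it for any node that is offline.
            for b in backends:
                if b.online:
                    b.execute(statement)
                else:
                    b.queue.append(statement)
            return "ok"
        # Plain query: any single node can answer; pick the least loaded.
        live = [b for b in backends if b.online]
        return min(live, key=lambda b: b.load).execute(statement)

    nodes = [Backend("db1"), Backend("db2")]
    route("INSERT INTO t VALUES (1)", nodes)   # forwarded to both
    route("SELECT * FROM t", nodes)            # forwarded to one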
Orion,

How would it handle functions, which could potentially modify data, even from a select statement?

Thanks,
Peter Darley
> Orion,
> How would it handle functions, which could potentially modify data, even
> from a select statement?

That is the problem with any sort of proxy for PostgreSQL: there are too many ways for data to be changed without the proxy being made aware of it. I had started development on a query-result caching system, but it really can't be done with any guarantee of reliability outside of PostgreSQL itself. And since I'm not much of a C programmer, I'm not really up to doing it myself, so it will have to be left until someone else decides to pick it up.

steve
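(To make the failure mode concrete: a few statements that the naive keyword test from the sketch above would route to a single node even though each one changes server state. nextval is real; log_hit is a made-up function that does an INSERT from inside a SELECT.)

    # Statements that look read-only to a keyword test but are not.
    WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")
    risky = [
        "SELECT nextval('order_id_seq')",   # advances the sequence on one node only
        "SELECT log_hit(42)",               # hypothetical function that INSERTs a row
    ]
    for stmt in risky:
        misrouted = not stmt.lstrip().upper().startswith(WRITE_PREFIXES)
        print(stmt, "-> single node" if misrouted else "-> all nodes")
    # Both print "single node": the replicas silently diverge.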
I would like to see some work in this area as well...

First, there are different modes of HA or clustering. Perhaps we can start easy and get to the advanced modes later.

In Stand-By mode, which is the simplest, all the redundant node needs to be is up to date (however one defines up to date: daily, hourly, by the minute, or per transaction). The proxy, through some heartbeat (or even DNS alone), will find out when the primary is gone and will route to the secondary. And if the secondary is, say, a fraction of the primary from a resource point of view, say a $1000 box, you have now moved a risk from the critical path to the non-critical path (known as degraded mode; it's not a failure) with minimal effort, which ironically satisfies 80% of people. The trick then becomes a matter of after-the-fact replication: one could replicate after each commit, or even let cron do it on the time axis.

But a more daring concurrent server is more exciting. I see two sets of problems: Transient and Steady State (both terms borrowed from hardware engineering).

Transient issues:
- Both servers coming online at time T0, the initial boot. The proxy can get confused unless hardcoded or time-delayed.
- The join of a server, and a period of ambiguity for the joining server until it catches up. The proxy will also be confused, as the heartbeat indicates the secondary is online but its state has not reached a logical equilibrium.

Steady State issues:
- Write policies would impact performance and complexity. Say the policy is to write to both nodes (or worse, all nodes), and further to consider a write completed only when all nodes ack the write. This will slow down the entire operation. Another policy would be that the first ack of a write marks it as completed: increased performance at the expense of reliability. Who knows, maybe node i failed to complete the write and five seconds later the proxy hands it a job that depends on that write. (Both policies are sketched below.)

I'm not discouraging; I'm writing to learn myself... I love this distributed stuff and I don't know anything about it...
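(A rough Python sketch of those two acknowledgement policies; node_write and its timing are stand-ins for a real backend round trip.)

    import queue
    import random
    import threading
    import time

    def node_write(name, results):
        time.sleep(random.random() / 10)   # simulate variable commit latency
        results.put(name)                  # the node's ack

    def replicated_write(nodes, policy):
        results = queue.Queue()
        for n in nodes:
            threading.Thread(target=node_write, args=(n, results)).start()
        if policy == "all":
            # Safe but slow: the write completes only when every node acks.
            return [results.get() for _ in nodes]
        # "first": fast but risky -- a lagging node may later be handed
        # a query that depends on a write it has not yet applied.
        return [results.get()]

    print(replicated_write(["db1", "db2"], policy="all"))    # waits for both acks
    print(replicated_write(["db1", "db2"], policy="first"))  # returns on first ack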
--
Medi Montaseri, medi@CyberShell.com
Unix Distributed Systems Engineer, CyberShell Engineering
HTTP://www.CyberShell.com
> How would it handle functions, which could potentially modify data, even
> from a select statement?

It seems that you'd have two options, if you wanted the proxy to be truly transparent to the client:

1. Send ALL SQL statements down the wire to each node, including SELECT statements, since selected functions may modify data.

2. Write a small, fast, reliable parser that checks for criteria which would make a statement potentially data-modifying (e.g., the presence of a function call), and send only data-modifying SELECTs along with your standard UPDATEs, DELETEs, etc.

However, it probably just occurred to you all, as it just occurred to me, that this is pretty moot, because functions aren't the only concern: you could have a trigger on a table that would wipe out idea #2. ;)

Really, there are too many transparent ways data can be modified by seemingly innocuous statements, so parsing a statement for distribution is right out; it seems as though each node is going to have to receive a copy of EACH statement that the proxy runs into in order to maintain 100% integrity.

However, that doesn't mean your proxy needs to get an answer back from all of the nodes in terms of result sets. Something as simple as a small status packet indicating that the downstream execution was successful would be enough for the proxy to know what's going on, provided it knows it should get its real answer soon from another node (e.g., the node with the lowest load).

Result sets could still be cached per statement, within some specified degree of accuracy (e.g., how much time elapses before a cached result set expires); you'd just need to make sure that even though you're returning a cached result set, you still send the request to each back-end to get processed in its own time.

Seems like some *really* careful threading might be called for: one thread to listen to incoming traffic, from which downstream events are queued up; another thread sending off those events to the back-end in the order they were received; and another thread listening for answers from nodes and queueing up responses to be sent back to the appropriate client's socket.

Regards,
Jw.

--
jlx@commandprompt.com, by way of pgsql-general@commandprompt.com
http://www.postgresql.info/
http://www.commandprompt.com/
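(A rough skeleton of that three-thread layout in Python; the queues, the sentinel, and the fake sockets are all illustrative, not a proposed API.)

    import queue
    import threading

    inbound = queue.Queue()   # (client_id, statement), kept in arrival order
    answers = queue.Queue()   # (client_id, result) coming back from backends

    def listener(statements):
        # Thread 1: receive client traffic and queue it downstream.
        for client_id, stmt in statements:
            inbound.put((client_id, stmt))
        inbound.put(None)                          # sentinel: no more work

    def dispatcher(execute):
        # Thread 2: replay statements in the order received, so every
        # node would see the same serial history.
        while (item := inbound.get()) is not None:
            client_id, stmt = item
            answers.put((client_id, execute(stmt)))
        answers.put(None)

    def responder(sockets):
        # Thread 3: route each answer back to the client it belongs to.
        while (item := answers.get()) is not None:
            client_id, result = item
            sockets[client_id].append(result)      # stand-in for socket.send()

    sockets = {1: [], 2: []}
    threads = [
        threading.Thread(target=listener,
                         args=([(1, "SELECT 1"), (2, "SELECT 2")],)),
        threading.Thread(target=dispatcher,
                         args=(lambda s: "result of " + s,)),
        threading.Thread(target=responder, args=(sockets,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sockets)   # {1: ['result of SELECT 1'], 2: ['result of SELECT 2']}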
I think the solution should not operate at the level of the high-level SQL language. The proxy should delve into a deeper layer, after the plan has been written and before execution is kicked in. In other words, you take a PG engine, you peel off the front-end (parser, planner) part, and then slip in a layer before the execution.

See your installation docs, "Chap 2, Section 2.1 The Path of a Query". The path is:

Connection, Parser Stage, Rewrite System, Planner/Optimizer, Executor.

In fact the name is already there, "Planner/Optimizer"; what we want is optimization. I know people usually mean a different thing by that, but why not? HA is optimization as well...

By the way, I got this idea from the Solaris Virtual File System (VFS); I call this VDB (Virtual DataBase).
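(Conceptual only: the query path from the docs with the proposed interception point marked. Every stage below is a trivial Python stand-in, not PostgreSQL code.)

    # Stand-ins for the real stages of the query path.
    parse = rewrite = optimize = lambda x: x
    run = lambda plan, node: node + " executed " + repr(plan)

    def handle_query(sql, nodes):
        tree = parse(sql)        # Parser Stage
        tree = rewrite(tree)     # Rewrite System
        plan = optimize(tree)    # Planner/Optimizer
        # --- proposed VDB layer: fan out the finished *plan*, ---
        # --- not the raw SQL text, to every node.             ---
        return [run(plan, node) for node in nodes]   # Executor, per node

    print(handle_query("SELECT 1", ["node1", "node2"]))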
"Command Prompt, Inc." wrote:
>How would it handle functions, which could potentially modify data, even
>from a select statement?It seems that you'd have two options, if you wanted the proxy to be truly
transparent to the client:1. Send ALL SQL statements down the wire to each node, including SELECT
statements, since selected functions may modify data.2. Write a small, fast, reliable parser that checks for criteria which
would make the statement potentially data-modifying (e.g., the
existence of a function), and send only data-modifying SELECTs along
with your standard UPDATEs, DELETEs, etc.However, it probably just occurred to you all as it just occurred to me
that this is pretty moot, because functions aren't the only concern: you
could have a trigger on a table that would wipe out idea #2. ;)Really, there are too many transparent ways data can be modified by
seemingly innocuous statements, so parsing a statement for distribution
is right out; it seems as though each node is going to have to require a
copy of EACH statement that the proxy runs into in order to maintain 100%
integrity.However, that doesn't mean your proxy needs to get answer back from all of
the nodes in terms of result sets. Something as simple as a systemic
packet indicating that the downstream-execution was successful would be
enough data for the proxy to know what's going on, provided it knows it
should get its answer soon from another node (e.g., the node with the
lowest load).Result sets could still be cached based on a statement, within some
specified degree of accuracy (e.g., how much time elapses before a cached
resultset expires); you'd just need to make sure that even though you're
returning a cached result set, you still send the request to each back-end
to get processed in its own time.Seems like some *really* careful threading might be called for; one thread
to listen to incoming traffic, from which downstream events are queued up,
another thread sending off those events to the back-end in the order they
were received, and another thread listening for answers from nodes, and
queueing up responses to be sent back to the appropriate client's socket.Regards,
Jw.
--
jlx@commandprompt.com, by way of pgsql-general@commandprompt.com
http://www.postgresql.info/
http://www.commandprompt.com/---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
-- ------------------------------------------------------------------------- Medi Montaseri medi@CyberShell.com Unix Distributed Systems Engineer HTTP://www.CyberShell.com CyberShell Engineering -------------------------------------------------------------------------
Just for caching in Java we use PoolMan from CodeStudio.com.

The way I would do replication is to have one RW server and many RO servers. When you select, you select from a "local" RO DB server. When you update, you update the RW server. If you must have the latest data, you go to the one RW server.

The RW server's tables have a flag that indicates a row needs to be replicated to the local servers. And there is a lazy, delayed process that reads from the RW server row by row and, after each "local" RO server commits, updates the flag. On to the next row. (A sketch of that loop follows.)
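(A small Python sketch of that lazy loop; the dict-based tables stand in for real SQL statements against the RW master and the RO replicas.)

    # Rows on the RW master carry a "dirty" flag; a background process
    # copies each flagged row to every RO replica, clearing the flag only
    # after all the local commits succeed.
    master = {
        1: {"val": "a", "dirty": True},
        2: {"val": "b", "dirty": False},
    }
    replicas = [dict(), dict()]

    def replicate_once():
        for key, row in master.items():
            if not row["dirty"]:
                continue
            for replica in replicas:                # push row to each RO server
                replica[key] = {"val": row["val"]}  # each copy is one "commit"
            row["dirty"] = False                    # clear flag after the commits

    replicate_once()
    print(replicas)   # both replicas now hold row 1; row 2 was already clean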
Vic
The challenge is: can you make it "Transparent"? That would be Distributed.
--
Medi Montaseri