Hi--
I have been thinking about the issues of multimaster replication and how
to do highly available, load-balanced clustering with PostgreSQL. Here is
my outline, and I am looking for comments on the limitations of this
approach.
Several PostgreSQL servers would share a virtual IP address, and would
coordinate among themselves which one acts as "Master" for the purposes
of a single transaction (though choosing one per connection could be
easier). SELECT statements are handled exclusively by the transaction
master, while anything that writes to a database would be sent to all
the "Masters." At the end of each transaction the systems would poll
each other regarding whether they were all successful:
1: Any system which is successful in COMMITting the transaction must
ignore any system which fails the transaction until a recovery can be made.
2: Any system which fails in COMMITting the transaction must cease to
be a master, provided that it receives a signal from any other member of
the cluster indicating that that member succeeded in committing the
transaction.
3: If all nodes fail to commit, then they all remain masters.
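To make the three rules concrete, here is a minimal sketch in Python of how the poll results for one transaction could be turned into node roles. The names (Role, resolve_roles, commit_ok) are hypothetical, and it assumes every node reliably sees every other node's result, which a real cluster could not take for granted.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    FAILED = "failed"  # must stop acting as a master until recovered


def resolve_roles(commit_ok):
    """Apply the three rules to the poll results of one transaction.

    commit_ok maps a node name to True if that node committed.
    Returns (roles, ignored): the new role of each node, and for each
    node the set of peers it must ignore until they have recovered.
    Assumption: every node's result reaches every other node.
    """
    succeeded = {node for node, ok in commit_ok.items() if ok}
    failed = set(commit_ok) - succeeded
    roles, ignored = {}, {}
    for node in commit_ok:
        if node in succeeded:
            # Rule 1: a successful node ignores every node that failed.
            roles[node] = Role.MASTER
            ignored[node] = set(failed)
        elif succeeded:
            # Rule 2: a failed node steps down if any peer succeeded.
            roles[node] = Role.FAILED
            ignored[node] = set()
        else:
            # Rule 3: if all nodes failed, all of them remain masters.
            roles[node] = Role.MASTER
            ignored[node] = set()
    return roles, ignored
```

For example, resolve_roles({"a": True, "b": False, "c": True}) leaves a and c as masters that ignore b, and marks b as failed. If every value is False, all nodes stay masters, since the transaction simply failed everywhere.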
Recovery would be done in several steps:
1: The database would be copied to the failed system using pg_dump.
2: A recovery would be done from the transaction log to bring it current.
3: This would be repeated in order to ensure that the database is up to
date.
4: When two successive restores have been achieved with no new
additions to the database, the "All Recovered" signal is sent to the
cluster and the node is ready to start processing again. (I need a
better way of doing this.)
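The recovery steps above can be sketched as a loop, again in Python. The recover function and its two callables are hypothetical stand-ins: copy_base represents the pg_dump copy, and replay_log represents one recovery pass from the transaction log, returning how many new changes it applied. The sketch assumes that only the log pass is repeated, which is one reading of step 3.

```python
def recover(copy_base, replay_log, max_rounds=100):
    """Run the recovery steps; return the number of log passes needed.

    copy_base:  hypothetical callable that performs the initial copy.
    replay_log: hypothetical callable that runs one recovery pass and
                returns the number of new changes it applied.
    """
    copy_base()  # step 1: initial copy of the database
    quiet_passes = 0
    for round_no in range(1, max_rounds + 1):
        applied = replay_log()  # steps 2 and 3: recover, then repeat
        # Count consecutive passes that found nothing new to apply.
        quiet_passes = quiet_passes + 1 if applied == 0 else 0
        if quiet_passes == 2:
            # Step 4: two successive passes with no new additions.
            # The caller would send "All Recovered" at this point.
            return round_no
    raise RuntimeError("node did not catch up within max_rounds passes")
```

This also shows the weakness already noted in step 4: a write can still arrive between the second quiet pass and the "All Recovered" signal, and on a busy cluster two quiet passes in a row may never happen.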
Note: Recovery is the problem, I know. My model is only a starting
point for discussion and an attempt to bring something to the
conversation.
Any thoughts or suggestions?
Best Wishes,
Chris Travers