The BDR documentation http://wiki.postgresql.org/images/7/75/BDR_Presentation_PGCon2012.pdf says,
"Physical replication forces us to use just one node: multi-master required for write scalability"
"Physical replication provides best read scalability"
I am inclined to agree with the second statement, but I think my proposal invalidates the first, at least when the data is rigorously partitioned so that each piece of data is owned by exactly one server.
In my own workflow, I load lots of data from different sources. The partition that data loads into depends on which source it came from, and data from different sources is never mixed or cross-referenced in any operation that writes data. It is only "mixed" in the sense that applications query data from multiple sources.
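To make the ownership rule concrete, here is a rough sketch of the invariant I have in mind. It is only a toy model in Python: the names (Cluster, owner_of, write, read_all) and the server and source labels are made up for illustration and are not part of PostgreSQL or BDR.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model: every data source has exactly one owning server.

    All names here are hypothetical; this is not a PostgreSQL or BDR API.
    """
    owner_of: dict                                   # source name -> owning server name
    partitions: dict = field(default_factory=dict)   # source name -> list of rows

    def write(self, server, source, row):
        # Only the owning server may write to a source's partition.
        owner = self.owner_of[source]
        if server != owner:
            raise PermissionError(
                f"{server} does not own the partition for {source}"
            )
        self.partitions.setdefault(source, []).append(row)

    def read_all(self, sources):
        # Reads may combine partitions from any number of sources.
        return [row for s in sources for row in self.partitions.get(s, [])]


cluster = Cluster(owner_of={"feed_a": "server1", "feed_b": "server2"})
cluster.write("server1", "feed_a", {"id": 1})
cluster.write("server2", "feed_b", {"id": 2})
rows = cluster.read_all(["feed_a", "feed_b"])
```

Since no write ever touches a partition owned by another server, there are no write conflicts to resolve. That is the property I am relying on.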
So for me, multi-master with physical replication seems possible, and would presumably provide the best read scalability. I doubt that I am the only database user who has this kind of workflow.
The alternatives are ugly. I can load data from separate sources into separate database servers *without* replication between them, but then the application layer has to emulate queries across the data. (Yuck.) Or I can use logical replication such as BDR, but then the servers do more work than they would with physical replication, so I get less bang for the buck when I purchase more servers to add to the cluster. Or I can use FDW to access data from other servers, but that means the same data may be pulled across the wire arbitrarily many times, with a corresponding cost in bandwidth.
Am I missing something here? Does BDR really provide an equivalent solution?
Second, it seems that BDR leaves to the client the responsibility for making schemas the same everywhere. Perhaps this is just a limitation of the implementation so far, which will be resolved in the future?
On Tuesday, December 31, 2013 12:33 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Mark Dilger wrote:
> This is not entirely "pie in the sky", but feel free to tell me why this is crazy.