Thread: AppScale backend datastore (NoSQL again kind of)
Hey PostgreSQL speed demons -

At work, we're considering an AppScale deployment (that's the Google App Engine roll-your-own http://appscale.cs.ucsb.edu/). It supports multiple technologies to back the datastore part of the platform (HBase, Hypertable, MySQL Cluster, Cassandra, Voldemort, MongoDB, MemcacheDB, Redis). Oddly enough, in performance tests, the MySQL Cluster seems like the general winner (their tests, we haven't done any yet).

So, my immediate thought was "How hard would it be to replace the MySQL Cluster bit w/ PostgreSQL?" I'm thinking hot standby/streaming rep. May or may not need a pooling solution in there as well (I need to look at the AppScale abstraction code, it may already be doing the pooling/direction bit.)

Any thoughts from those with more experience using/building PostgreSQL clusters? Replacing MySQL Cluster? Clearly they must be using a subset of functionality, since they support so many different backend stores. I'll probably have to set up an instance of all this, run some example apps, and see what's actually stored to get a handle on it. The GAE API for the datastore is sort of an ORM, w/ yet another query language, that seems to map to SQL better than to NoSQL, in any case. There seems to be a fairly explicit exposure of a table==class sort of mapping.

Ross
--
Ross Reedstrom, Ph.D.    reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist    phone: 713-348-6166
Connexions    http://cnx.org    fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
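For readers wondering what the "pooling/direction bit" might look like with hot standby/streaming replication, here is a minimal sketch in Python. It is not taken from AppScale; the `Router` class, the host names, and the keyword check are all assumptions made for illustration. The idea is simply that writes (and anything inside a transaction) go to the primary, while plain reads can be spread across the standbys.

```python
import itertools

# Statements whose first keyword is in this set are treated as read-only.
# This is a deliberately naive classification (an assumption of this sketch).
READ_WORDS = {"select", "show"}


class Router:
    """Pick a server for each statement: primary for writes, standbys for reads."""

    def __init__(self, primary, standbys):
        self.primary = primary
        # With no standbys configured, reads simply fall back to the primary.
        self._reads = itertools.cycle(standbys or [primary])

    def pick(self, sql, in_transaction=False):
        words = sql.lower().split()
        first = words[0] if words else ""
        normalized = " ".join(words)
        # Row-locking reads are not allowed on a hot standby.
        locking = " for update" in normalized or " for share" in normalized
        if in_transaction or locking or first not in READ_WORDS:
            return self.primary
        # Round-robin across the standbys for plain reads.
        return next(self._reads)


# Hypothetical host names, for illustration only.
router = Router("pg-primary:5432", ["pg-standby1:5432", "pg-standby2:5432"])
target = router.pick("SELECT * FROM entities WHERE kind = 'Greeting'")
```

Two caveats apply to anything along these lines: a standby can lag behind the primary, so a read issued right after a write may not see it, and a `SELECT` that calls a function with side effects would be misrouted by a keyword check this simple.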
On 09/13/2012 02:11 PM, Ross Reedstrom wrote:
> Hey PostgreSQL speed demons -
>
> At work, we're considering an AppScale deployment (that's the Google App Engine roll-your-own http://appscale.cs.ucsb.edu/). It supports multiple technologies to back the datastore part of the platform (HBase, Hypertable, MySQL Cluster, Cassandra, Voldemort, MongoDB, MemcacheDB, Redis). Oddly enough, in performance tests, the MySQL Cluster seems like the general winner (their tests, we haven't done any yet).
>
> So, my immediate thought was "How hard would it be to replace the MySQL Cluster bit w/ PostgreSQL?" I'm thinking hot standby/streaming rep. May or may not need a pooling solution in there as well (I need to look at the AppScale abstraction code, it may already be doing the pooling/direction bit.)

It depends on many factors:
- size of the MySQL cluster
- size of the data involved, etc.

For the pooling solution, I recommend you look at PgBouncer; it's a great
project, widely used for this purpose.

Dimitri Fontaine gave an excellent talk at the last PgCon about the migration of Fotolog from MySQL to
PostgreSQL, with a lot of good advice on the subject, so you could contact him about it.

> Any thoughts from those with more experience using/building PostgreSQL clusters? Replacing MySQL Cluster? Clearly they must be using a subset of functionality, since they support so many different backend stores. I'll probably have to set up an instance of all this, run some example apps, and see what's actually stored to get a handle on it. The GAE API for the datastore is sort of an ORM, w/ yet another query language, that seems to map to SQL better than to NoSQL, in any case. There seems to be a fairly explicit exposure of a table==class sort of mapping.
>
> Ross

Best wishes
--
Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog
Tumblr's blog
@marcosluis2186
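
As a rough illustration of the PgBouncer suggestion above, the following Python sketch generates a small `pgbouncer.ini`. The section and setting names (`[databases]`, `[pgbouncer]`, `listen_addr`, `listen_port`, `auth_type`, `auth_file`, `pool_mode`, `max_client_conn`, `default_pool_size`) are real PgBouncer settings, but every value here, including the database name, host, file path, and pool sizes, is an assumption for illustration and would need tuning for a real deployment. The helper name `build_pgbouncer_ini` is hypothetical.

```python
import configparser
import io


def build_pgbouncer_ini(db_name, pg_host, pg_port=5432, pool_size=20):
    """Return the text of a minimal pgbouncer.ini (illustrative values only)."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep key case exactly as written

    # Maps the name clients connect to onto the real PostgreSQL server.
    cfg["databases"] = {
        db_name: f"host={pg_host} port={pg_port} dbname={db_name}",
    }
    cfg["pgbouncer"] = {
        "listen_addr": "127.0.0.1",
        "listen_port": "6432",
        "auth_type": "md5",
        "auth_file": "/etc/pgbouncer/userlist.txt",  # assumed path
        # Transaction pooling returns the server connection to the pool at
        # the end of each transaction; session-level state does not survive.
        "pool_mode": "transaction",
        "max_client_conn": "200",
        "default_pool_size": str(pool_size),
    }

    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical database name and host.
ini_text = build_pgbouncer_ini("appscale", "10.0.0.5")
```

Whether transaction pooling is appropriate depends on how the application uses sessions (prepared statements, temporary tables, `SET` commands), which is one of the things to check in the AppScale abstraction code before choosing a pool mode.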
On 14/09/12 06:11, Ross Reedstrom wrote:
> Hey PostgreSQL speed demons -
> At work, we're considering an AppScale deployment (that's the Google App Engine
> roll-your-own http://appscale.cs.ucsb.edu/). It supports multiple technologies
> to back the datastore part of the platform (HBase, Hypertable, MySQL Cluster,
> Cassandra, Voldemort, MongoDB, MemcacheDB, Redis). Oddly enough, in performance
> tests, the MySQL Cluster seems like the general winner (their tests, we haven't
> done any yet).
>
> So, my immediate thought was "How hard would it be to replace the MySQL
> Cluster bit w/ PostgreSQL?" I'm thinking hot standby/streaming rep. May or may
> not need a pooling solution in there as well (I need to look at the AppScale
> abstraction code, it may already be doing the pooling/direction bit.)
>
> Any thoughts from those with more experience using/building PostgreSQL clusters?
> Replacing MySQL Cluster? Clearly they must be using a subset of functionality,
> since they support so many different backend stores. I'll probably have to
> set up an instance of all this, run some example apps, and see what's actually
> stored to get a handle on it. The GAE API for the datastore is sort of an ORM,
> w/ yet another query language, that seems to map to SQL better than to NoSQL,
> in any case. There seems to be a fairly explicit exposure of a table==class
> sort of mapping.
>

Postgres-XC might be a good option to consider too.

Regards

Mark