Thread: Replication and coding good practices
I'm starting to see the need of a replication solution in my horizon. I've learned that some replication solution require code modification, some require different assumption abut how connections behave... Are there any rules of thumb to consider for making an application easier to work with a "general" replication solution? The applications I mostly deal with are e-commerce sites. -- Ivan Sergio Borgonovo http://www.webthatworks.it
On Sun, Jun 28, 2009 at 12:05:08PM +0200, Ivan Sergio Borgonovo wrote: > I'm starting to see the need of a replication solution in my > horizon. > > I've learned that some replication solution require code > modification, some require different assumption abut how connections > behave... > > Are there any rules of thumb to consider for making an application > easier to work with a "general" replication solution? > > The applications I mostly deal with are e-commerce sites. It really depends on what replication solution you choose, along with the environment you're deploying into. That said, I've noticed that the things that are generally good practice help you even more when you're doing replication. Practices I've seen help directly: * Separate read users and code from write users and code. * Separate DDL from both of the above. * Make DDL changes part of your deployment process and only allow them in files which track in your SCM system. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, 2009-06-28 at 09:01 -0700, David Fetter wrote: > > Are there any rules of thumb to consider for making an application > > easier to work with a "general" replication solution? > > > > The applications I mostly deal with are e-commerce sites. > > It really depends on what replication solution you choose, along with > the environment you're deploying into. ... and why you need replication. Reliability/Availability? Data storage redundancy? Performance? And if performance, read-mostly performance or write-heavy performance? > That said, I've noticed that the things that are generally good > practice help you even more when you're doing replication. > > Practices I've seen help directly: > > * Separate read users and code from write users and code. > > * Separate DDL from both of the above. > > * Make DDL changes part of your deployment process and only allow them > in files which track in your SCM system. Version your schema, storing the schema version in a 1-row table or even as a stable function. This makes it much easier for deployment tools or staff to easily see what needs to be done to get the schema and app to the latest version - there's no "what the hell is the current state of this thing, anyway?" to worry about. -- Craig Ringer
On Mon, 29 Jun 2009 19:11:43 +0800 Craig Ringer <craig@postnewspapers.com.au> wrote: > On Sun, 2009-06-28 at 09:01 -0700, David Fetter wrote: > > > Are there any rules of thumb to consider for making an > > > application easier to work with a "general" replication > > > solution? > > > > > > The applications I mostly deal with are e-commerce sites. > > > > It really depends on what replication solution you choose, along > > with the environment you're deploying into. > > ... and why you need replication. Reliability/Availability? Data > storage redundancy? Performance? And if performance, read-mostly > performance or write-heavy performance? 1) performance, read-mostly 2) reliability I'm making large use of plpgsql mainly for: - encapsulation - single point of truth - implicit transaction Most of the write operations don't have to be aware of a multi user environment. Still in some part of the code things have to be aware of transactions, multi user environment (or better multiple connections from the same user) etc... Not only these parts are rare, they are seldom executed too. So my main concern about the parts that may be problematic in a replicated context is to keep maintenance low and development easy. eg. I've started to use temp tables but I guess they may cause some problems in conjunction with connection pooling systems. > > That said, I've noticed that the things that are generally good > > practice help you even more when you're doing replication. > > > > Practices I've seen help directly: > > > > * Separate read users and code from write users and code. > > > > * Separate DDL from both of the above. > > > > * Make DDL changes part of your deployment process and only > > allow them in files which track in your SCM system. > Version your schema, storing the schema version in a 1-row table > or even as a stable function. This makes it much easier for > deployment tools or staff to easily see what needs to be done to > get the schema and app to the latest version - there's no "what > the hell is the current state of this thing, anyway?" to worry > about. This is another area I'd like to learn more about available techniques for managing development. But currently I was more interested in coding techniques to avoid maintenance/porting problems once I'll have to support a replication solution. At the moment schema changes are saved in a file together with the web application code. I was thinking to automate the application of schema changes with a hook in svn, but right now it doesn't look as a good investment. Still I'm very interested in techniques to version schema changes and bring them together with code change and being able to diff them. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
On Mon, Jun 29, 2009 at 07:11:43PM +0800, Craig Ringer wrote: > On Sun, 2009-06-28 at 09:01 -0700, David Fetter wrote: > > > Are there any rules of thumb to consider for making an application > > > easier to work with a "general" replication solution? > > > > > > The applications I mostly deal with are e-commerce sites. > > > > It really depends on what replication solution you choose, along with > > the environment you're deploying into. > > ... and why you need replication. Reliability/Availability? Data storage > redundancy? Performance? And if performance, read-mostly performance or > write-heavy performance? It's this kind of discussion that you might want to hire experts to help with :) Commandprompt, Endpoint, OmniTI and the outfit I work for, PostgreSQL Experts <http://www.pgexperts.com> would be examples. > > That said, I've noticed that the things that are generally good > > practice help you even more when you're doing replication. > > > > Practices I've seen help directly: > > > > * Separate read users and code from write users and code. > > > > * Separate DDL from both of the above. > > > > * Make DDL changes part of your deployment process and only allow them > > in files which track in your SCM system. > > Version your schema, storing the schema version in a 1-row table or > even as a stable function. This makes it much easier for deployment > tools or staff to easily see what needs to be done to get the schema > and app to the latest version - there's no "what the hell is the > current state of this thing, anyway?" to worry about. When versioning schemas, make sure your deployment tools are always atomic and that the schema version can't be modified by anything but those tools. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, 2009-06-29 at 07:34 -0700, David Fetter wrote: > On Mon, Jun 29, 2009 at 07:11:43PM +0800, Craig Ringer wrote: > > On Sun, 2009-06-28 at 09:01 -0700, David Fetter wrote: > > > > Are there any rules of thumb to consider for making an application > > > > easier to work with a "general" replication solution? > > > > > > > > The applications I mostly deal with are e-commerce sites. > > > > > > It really depends on what replication solution you choose, along with > > > the environment you're deploying into. > > > > ... and why you need replication. Reliability/Availability? Data storage > > redundancy? Performance? And if performance, read-mostly performance or > > write-heavy performance? > > It's this kind of discussion that you might want to hire experts to > help with :) Commandprompt, Endpoint, OmniTI and the outfit I work > for, PostgreSQL Experts <http://www.pgexperts.com> would be examples. Probably should mention 2ndQuadrant also, since we have an Italian office (for the original poster) and we have been developing replication for PostgreSQL for some time now. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support