Thread: Load Balancing/Multiple Postgres Machines
Hi all,
Not sure if this has been asked before - it probably has been - but I figure its probably just easier if I ask so my question is this:
How easy is it - or rather is it possible to create multiple instances of the same db on multiple machines?
We are currently running a very highly used postgres db (Ver 7.2) that eats up a lot of cpu time when its heavily used. We recently bought some very powerful equipment to accommodate this high use, (currently its on a single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI server) but my fear is over time this issue is going to haunt me again and hardware can only be upgraded so much.
So is it possible to do some level of load balancing in postgres - I would love a situation where I could have multiple machines running multiple instances of the same db (that are all synced in real-time), where there is some smart level of load balancing happening.
So if it is possible is there some instruction info out there to help me along my way. Any help or insight would be greatly appreciated.
All the best
Adile
Adile Abbadi wrote: > Hi all, > > Not sure if this has been asked before - it probably has been - but I figure > its probably just easier if I ask so my question is this: > > How easy is it - or rather is it possible to create multiple instances of > the same db on multiple machines? > > We are currently running a very highly used postgres db (Ver 7.2) that eats > up a lot of cpu time when its heavily used. We recently bought some very > powerful equipment to accommodate this high use, (currently its on a single > CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI > server) but my fear is over time this issue is going to haunt me again and > hardware can only be upgraded so much. Well you have a very long way to go before you get to that "too much" stage. Within your current platform it should be possible to upgrade to 8GB of ram and you can almost always add more hard drives. > So is it possible to do some level of load balancing in postgres - I would > love a situation where I could have multiple machines running multiple > instances of the same db (that are all synced in real-time), where there is > some smart level of load balancing happening. You may want to look at pgpool, slony or Mammoth Replicator. > > So if it is possible is there some instruction info out there to help me > along my way. Any help or insight would be greatly appreciated. Sincerely, Joshua D. Drake > > All the best > > Adile > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 > -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Attachment
On November 29, 2004 11:30 am, Joshua D. Drake wrote: > Adile Abbadi wrote: > > Hi all, > > > > Not sure if this has been asked before - it probably has been - but I > > figure its probably just easier if I ask so my question is this: > > > > How easy is it - or rather is it possible to create multiple instances of > > the same db on multiple machines? > > > > We are currently running a very highly used postgres db (Ver 7.2) that > > eats up a lot of cpu time when its heavily used. We recently bought some > > very powerful equipment to accommodate this high use, (currently its on a > > single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram > > SCSI server) but my fear is over time this issue is going to haunt me > > again and hardware can only be upgraded so much. > > Well you have a very long way to go before you get to that "too much" > stage. > > Within your current platform it should be possible to upgrade to 8GB of > ram and you can almost always add more hard drives. > > > So is it possible to do some level of load balancing in postgres - I > > would love a situation where I could have multiple machines running > > multiple instances of the same db (that are all synced in real-time), > > where there is some smart level of load balancing happening. > > You may want to look at pgpool, slony or Mammoth Replicator. > > > So if it is possible is there some instruction info out there to help me > > along my way. Any help or insight would be greatly appreciated. You may also want to look at upgrading to PostgreSQL 7.4.6 or 8.0 since they both go a long way to improve performance, which might mittage your CPU bound problem as it is now. > > Sincerely, > > Joshua D. Drake > > > All the best > > > > Adile > > > > --- > > Outgoing mail is certified Virus Free. > > Checked by AVG anti-virus system (http://www.grisoft.com). > > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 -- Darcy Buskermolen Wavefire Technologies Corp. ph: 250.717.0200 fx: 250.763.1759 http://www.wavefire.com
Hi, I have been considering this in my spare time for a little while too. Joshua already mentioned slony and pgpool, and you may want to look at heartbeat too, though I don't know if it is strictly neccessary in a slony/pgpoos installation. My focus is more on replication for high availability and failover than performance. Replication will involve some overhead so to offset that your application must be set up to take good advantage of the load balancing possibilities. There is no guarantee that your application will perform faster even if you do implement replication. As such, I don't know of any viable true _synchronous_ replication system for postgres, the options listed above seem best suited for async replication (using slony) and load balancing of SELECTs (not UPDATE INSERT DELETE) using pg pool. If your application has a very heavy SELECT component from browsing users (perhaps from a web application) then you may be able to have a large portion of your SELECT SQL diverted to the slave database, freeing the master to handle all updating of data. Whether this will help you or not, I don't know. Analyse the options and your requirements and test it. If you come up with any good information, I'd be very intertested to hear it. I posted a summary of options as I understood them in the admin section. If your search on availabilty and failover you should find it. Regards Iain ----- Original Message ----- From: Adile Abbadi To: pgsql-admin@postgresql.org Sent: Tuesday, November 30, 2004 3:34 AM Subject: [ADMIN] Load Balancing/Multiple Postgres Machines Hi all, Not sure if this has been asked before - it probably has been - but I figure its probably just easier if I ask so my question is this: How easy is it - or rather is it possible to create multiple instances of the same db on multiple machines? We are currently running a very highly used postgres db (Ver 7.2) that eats up a lot of cpu time when its heavily used. We recently bought some very powerful equipment to accommodate this high use, (currently its on a single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI server) but my fear is over time this issue is going to haunt me again and hardware can only be upgraded so much. So is it possible to do some level of load balancing in postgres - I would love a situation where I could have multiple machines running multiple instances of the same db (that are all synced in real-time), where there is some smart level of load balancing happening. So if it is possible is there some instruction info out there to help me along my way. Any help or insight would be greatly appreciated. All the best Adile
Adile Abbadi wrote: > Hi all, > > Not sure if this has been asked before - it probably has been - but I > figure its probably just easier if I ask so my question is this: > > How easy is it - or rather is it possible to create multiple instances > of the same db on multiple machines? > > We are currently running a very highly used postgres db (Ver 7.2) that > eats up a lot of cpu time when its heavily used. We recently bought > some very powerful equipment to accommodate this high use, (currently > its on a single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, > 4GB of ram SCSI server) but my fear is over time this issue is going > to haunt me again and hardware can only be upgraded so much. > > So is it possible to do some level of load balancing in postgres - I > would love a situation where I could have multiple machines running > multiple instances of the same db (that are all synced in real-time), > where there is some smart level of load balancing happening. > > So if it is possible is there some instruction info out there to help > me along my way. Any help or insight would be greatly appreciated. > > All the best > > Adile > Hello, I read an interesting paper on this a while ago, you can find it here: http://www.cnds.jhu.edu/pub/papers/cnds-2003-3.pdf It's called "On the Performance of Consistent Wide-Area Database Replication", and presents an analysis of a Postgres-based replication system. Unfortunately, the implementation wasn't made available, but the ideas may be worth investigating for someone with more free time than myself. -Mark.
Adile Abbadi wrote: > Thank you very Joshua for your help - Glad to hear my hardware should be > able to handle my needs and there is something out there. Do you know of any > documentation or online help that I could refer on the apps listed below? > Also are they separate apps that are to be installed and configured outside > postgres or are they included with the postgres build? Slony: www.slony.info Mammoth Replicator: www.commandprompt.com PgPool: http://www2b.biglobe.ne.jp/~caco/pgpool/index-e.html > > Cheers > > Adile > > > -----Original Message----- > From: Joshua D. Drake [mailto:jd@commandprompt.com] > Sent: November 29, 2004 12:30 PM > To: Adile Abbadi > Cc: pgsql-admin@postgresql.org > Subject: Re: [ADMIN] Load Balancing/Multiple Postgres Machines > > > Adile Abbadi wrote: > >>Hi all, >> >>Not sure if this has been asked before - it probably has been - but I > > figure > >>its probably just easier if I ask so my question is this: >> >>How easy is it - or rather is it possible to create multiple instances of >>the same db on multiple machines? >> >>We are currently running a very highly used postgres db (Ver 7.2) that > > eats > >>up a lot of cpu time when its heavily used. We recently bought some very >>powerful equipment to accommodate this high use, (currently its on a > > single > >>CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI >>server) but my fear is over time this issue is going to haunt me again and >>hardware can only be upgraded so much. > > > Well you have a very long way to go before you get to that "too much" stage. > > Within your current platform it should be possible to upgrade to 8GB of > ram and you can almost always add more hard drives. > > >>So is it possible to do some level of load balancing in postgres - I would >>love a situation where I could have multiple machines running multiple >>instances of the same db (that are all synced in real-time), where there > > is > >>some smart level of load balancing happening. > > > You may want to look at pgpool, slony or Mammoth Replicator. > > > >>So if it is possible is there some instruction info out there to help me >>along my way. Any help or insight would be greatly appreciated. > > > Sincerely, > > Joshua D. Drake > > > > > >>All the best >> >>Adile >> >>--- >>Outgoing mail is certified Virus Free. >>Checked by AVG anti-virus system (http://www.grisoft.com). >>Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 >> > > > > -- > Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. > Postgresql support, programming shared hosting and dedicated hosting. > +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com > Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL > > --- > Incoming mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Attachment
> So what you are saying is that pg pool is used for load balancing and > slony > is used for replication? Yes, as I understand it. > I do have the option programmatically in my code to > distribute the load on the DB as you had mentioned (for example the SELECT > issue you mentioned), so if I was to go down that road would sloany be the > best way to go, or do I still need to use pgpool. Also how easy is sloany > to > setup? It sounds like the slony/pg pool combination might be worth considering. I havn't had a chance to play with slony yet, so I can't say.too much about whether it is easy to setup or not. From what I have seen, it looks reasonably easy considering what it it is trying to achieve. I might have a chance to test some of this out for myself in the new year, but that's not decided yet. Regards Iain > > Cheers > > Adile > > > -----Original Message----- > From: Iain [mailto:iain@mst.co.jp] > Sent: November 29, 2004 7:22 PM > To: Adile Abbadi; pgsql-admin@postgresql.org > Subject: Re: [ADMIN] Load Balancing/Multiple Postgres Machines > > > Hi, > > I have been considering this in my spare time for a little while too. > > Joshua already mentioned slony and pgpool, and you may want to look at > heartbeat too, though I don't know if it is strictly neccessary in a > slony/pgpoos installation. My focus is more on replication for high > availability and failover than performance. > > Replication will involve some overhead so to offset that your application > must be set up to take good advantage of the load balancing possibilities. > There is no guarantee that your application will perform faster even if > you > do implement replication. > > As such, I don't know of any viable true _synchronous_ replication system > for postgres, the options listed above seem best suited for async > replication (using slony) and load balancing of SELECTs (not UPDATE INSERT > DELETE) using pg pool. If your application has a very heavy SELECT > component > from browsing users (perhaps from a web application) then you may be able > to > have a large portion of your SELECT SQL diverted to the slave database, > freeing the master to handle all updating of data. > > Whether this will help you or not, I don't know. Analyse the options and > your requirements and test it. If you come up with any good information, > I'd > be very intertested to hear it. > > I posted a summary of options as I understood them in the admin section. > If > your search on availabilty and failover you should find it. > > Regards > Iain > > ----- Original Message ----- > From: Adile Abbadi > To: pgsql-admin@postgresql.org > Sent: Tuesday, November 30, 2004 3:34 AM > Subject: [ADMIN] Load Balancing/Multiple Postgres Machines > > > Hi all, > > Not sure if this has been asked before - it probably has been - but I > figure > its probably just easier if I ask so my question is this: > > How easy is it - or rather is it possible to create multiple instances of > the same db on multiple machines? > > We are currently running a very highly used postgres db (Ver 7.2) that > eats > up a lot of cpu time when its heavily used. We recently bought some very > powerful equipment to accommodate this high use, (currently its on a > single > CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI > server) but my fear is over time this issue is going to haunt me again and > hardware can only be upgraded so much. > > So is it possible to do some level of load balancing in postgres - I would > love a situation where I could have multiple machines running multiple > instances of the same db (that are all synced in real-time), where there > is > some smart level of load balancing happening. > > So if it is possible is there some instruction info out there to help me > along my way. Any help or insight would be greatly appreciated. > > All the best > > Adile > > --- > Incoming mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04
Thanx a ton - I will look into it when I get a moment. Cheers Adile -----Original Message----- From: m [mailto:postgres@markreid.org] Sent: November 30, 2004 12:04 PM To: Adile Abbadi Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Load Balancing/Multiple Postgres Machines Adile Abbadi wrote: > Hi all, > > Not sure if this has been asked before - it probably has been - but I > figure its probably just easier if I ask so my question is this: > > How easy is it - or rather is it possible to create multiple instances > of the same db on multiple machines? > > We are currently running a very highly used postgres db (Ver 7.2) that > eats up a lot of cpu time when its heavily used. We recently bought > some very powerful equipment to accommodate this high use, (currently > its on a single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, > 4GB of ram SCSI server) but my fear is over time this issue is going > to haunt me again and hardware can only be upgraded so much. > > So is it possible to do some level of load balancing in postgres - I > would love a situation where I could have multiple machines running > multiple instances of the same db (that are all synced in real-time), > where there is some smart level of load balancing happening. > > So if it is possible is there some instruction info out there to help > me along my way. Any help or insight would be greatly appreciated. > > All the best > > Adile > Hello, I read an interesting paper on this a while ago, you can find it here: http://www.cnds.jhu.edu/pub/papers/cnds-2003-3.pdf It's called "On the Performance of Consistent Wide-Area Database Replication", and presents an analysis of a Postgres-based replication system. Unfortunately, the implementation wasn't made available, but the ideas may be worth investigating for someone with more free time than myself. -Mark. --- Incoming mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04
Hi Iain, So what you are saying is that pg pool is used for load balancing and slony is used for replication? I do have the option programmatically in my code to distribute the load on the DB as you had mentioned (for example the SELECT issue you mentioned), so if I was to go down that road would sloany be the best way to go, or do I still need to use pgpool. Also how easy is sloany to setup? Cheers Adile -----Original Message----- From: Iain [mailto:iain@mst.co.jp] Sent: November 29, 2004 7:22 PM To: Adile Abbadi; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Load Balancing/Multiple Postgres Machines Hi, I have been considering this in my spare time for a little while too. Joshua already mentioned slony and pgpool, and you may want to look at heartbeat too, though I don't know if it is strictly neccessary in a slony/pgpoos installation. My focus is more on replication for high availability and failover than performance. Replication will involve some overhead so to offset that your application must be set up to take good advantage of the load balancing possibilities. There is no guarantee that your application will perform faster even if you do implement replication. As such, I don't know of any viable true _synchronous_ replication system for postgres, the options listed above seem best suited for async replication (using slony) and load balancing of SELECTs (not UPDATE INSERT DELETE) using pg pool. If your application has a very heavy SELECT component from browsing users (perhaps from a web application) then you may be able to have a large portion of your SELECT SQL diverted to the slave database, freeing the master to handle all updating of data. Whether this will help you or not, I don't know. Analyse the options and your requirements and test it. If you come up with any good information, I'd be very intertested to hear it. I posted a summary of options as I understood them in the admin section. If your search on availabilty and failover you should find it. Regards Iain ----- Original Message ----- From: Adile Abbadi To: pgsql-admin@postgresql.org Sent: Tuesday, November 30, 2004 3:34 AM Subject: [ADMIN] Load Balancing/Multiple Postgres Machines Hi all, Not sure if this has been asked before - it probably has been - but I figure its probably just easier if I ask so my question is this: How easy is it - or rather is it possible to create multiple instances of the same db on multiple machines? We are currently running a very highly used postgres db (Ver 7.2) that eats up a lot of cpu time when its heavily used. We recently bought some very powerful equipment to accommodate this high use, (currently its on a single CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI server) but my fear is over time this issue is going to haunt me again and hardware can only be upgraded so much. So is it possible to do some level of load balancing in postgres - I would love a situation where I could have multiple machines running multiple instances of the same db (that are all synced in real-time), where there is some smart level of load balancing happening. So if it is possible is there some instruction info out there to help me along my way. Any help or insight would be greatly appreciated. All the best Adile --- Incoming mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04
Thank you very Joshua for your help - Glad to hear my hardware should be able to handle my needs and there is something out there. Do you know of any documentation or online help that I could refer on the apps listed below? Also are they separate apps that are to be installed and configured outside postgres or are they included with the postgres build? Cheers Adile -----Original Message----- From: Joshua D. Drake [mailto:jd@commandprompt.com] Sent: November 29, 2004 12:30 PM To: Adile Abbadi Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Load Balancing/Multiple Postgres Machines Adile Abbadi wrote: > Hi all, > > Not sure if this has been asked before - it probably has been - but I figure > its probably just easier if I ask so my question is this: > > How easy is it - or rather is it possible to create multiple instances of > the same db on multiple machines? > > We are currently running a very highly used postgres db (Ver 7.2) that eats > up a lot of cpu time when its heavily used. We recently bought some very > powerful equipment to accommodate this high use, (currently its on a single > CPU, 1GB of Ram SCSI server - upgraded to a dual CPU, 4GB of ram SCSI > server) but my fear is over time this issue is going to haunt me again and > hardware can only be upgraded so much. Well you have a very long way to go before you get to that "too much" stage. Within your current platform it should be possible to upgrade to 8GB of ram and you can almost always add more hard drives. > So is it possible to do some level of load balancing in postgres - I would > love a situation where I could have multiple machines running multiple > instances of the same db (that are all synced in real-time), where there is > some smart level of load balancing happening. You may want to look at pgpool, slony or Mammoth Replicator. > > So if it is possible is there some instruction info out there to help me > along my way. Any help or insight would be greatly appreciated. Sincerely, Joshua D. Drake > > All the best > > Adile > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 > -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL --- Incoming mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04 --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.801 / Virus Database: 544 - Release Date: 11/24/04
On Mon, Nov 29, 2004 at 04:06:50PM -0800, Darcy Buskermolen wrote: > > You may also want to look at upgrading to PostgreSQL 7.4.6 or 8.0 since they > both go a long way to improve performance, which might mittage your CPU bound > problem as it is now. What Darcy said. We had an immense speed improvement by moving from 7.2.x to 7.4.x -- so much so that my QA guys didn't believe their tests the first time, and ran them all again. A -- Andrew Sullivan | ajs@crankycanuck.ca Information security isn't a technological problem. It's an economics problem. --Bruce Schneier