Thread: Clustering, mirroriing, or replication?
We're trying to figure out how we can implement a reasonably simple cluster of postgres servers on a private network at our store locations. The idea is to have a group of 2-6 PCs each be able to share data in a replicated manner. Having a separate database server is not an option due to the immense cost and logistics (we have over 800 stores), so we need to find a way to mirror or replicate data between all the nodes currently up and on the network. Some nodes might be not always be up, so we need a failover system. I'm testing to see if the dbmirror that comes with Postgres will work, but I haven't got it completely figured out yet. If anyone has experience setting up and maintaing clusters with replication/mirroring, I'd like to hear of your experience and maybe pick your brain a little. Any help you could give would be excellent! Thanks! Jon Brisbin ======================================= Development Programmer NPC International (620) 235-7726
Jon Brisbin wrote: > We're trying to figure out how we can implement a reasonably simple cluster > of postgres servers on a private network at our store locations. The idea > is to have a group of 2-6 PCs each be able to share data in a replicated > manner. Having a separate database server is not an option due to the > immense cost and logistics (we have over 800 stores), so we need to find a > way to mirror or replicate data between all the nodes currently up and on > the network. Some nodes might be not always be up, so we need a failover > system. I'm testing to see if the dbmirror that comes with Postgres will > work, but I haven't got it completely figured out yet. I'm talking well out of my experience here, having never tried to set up 800 sets of desktop PCs as replicated shared DB servers, but I can't believe it's more expensive to put servers in. In my experience, even a cheap white-box dedicated server is more reliable than a PC being used for desktop apps. Assume each user kicks the power cable out of their PC once a year, that means you'll be dealing with several failovers a day on average. Assuming you can't get a server for each store (that'd be a minimum of £400 * 800 = £320,000 in the UK), is there any networking between them? -- Richard Huxton Archonet Ltd
Jon Brisbin wrote: > At each store, there are a number of POS registers that are networked. If > they enter a ticket at one register, they should be able to go to any other > register to tender the transaction, etc... > > We already have mechanisms in place to remotely download and install > software on existing PCs, so no additional outlay of cash would be required > to set up these existing machines with more software. To install new PCs in > 800 stores at the same time would be a nightmare of logistics that we can't > currently contemplate. We also can't contemplate what nasty things will be > done to us if we go to the executives and say we're going to fix their > problems by forcing the stores to rely on only one machine. The restaurant > business is not kind to PC hardware. OK, fair enough. > One option is to have the POS software always write to a local database, > with that change getting replicated to all other active nodes. Another > option is to have a "master" database, with a failover as backup. Downside > here is that if both machines fail, then they are dead in the water. We > can't make money if the registers aren't operational. However, this is > similar to what happens now (which is why we want to change it.) > > Having mucked with postgres replication a little in the last couple of days, > I'm starting to wonder just how long it will take us to develop a good > replication mechanism. If one is already in place, I'd just like to use > that. Or at least learn from someone else trying to set up something > similar. Well, sounds like Slony would pretty much do what you want (one master, multiple backups) apart from the failover. I'm assuming your POS database is fairly simple and your transactions are at a slow rate which means the traffic generated by replication would be minimal. What constitutes a "dead" database is the tricky part. Do you have a "master" terminal at the moment or are they all strictly peers? -- Richard Huxton Archonet Ltd
At each store, there are a number of POS registers that are networked. If they enter a ticket at one register, they should be able to go to any other register to tender the transaction, etc... We already have mechanisms in place to remotely download and install software on existing PCs, so no additional outlay of cash would be required to set up these existing machines with more software. To install new PCs in 800 stores at the same time would be a nightmare of logistics that we can't currently contemplate. We also can't contemplate what nasty things will be done to us if we go to the executives and say we're going to fix their problems by forcing the stores to rely on only one machine. The restaurant business is not kind to PC hardware. One option is to have the POS software always write to a local database, with that change getting replicated to all other active nodes. Another option is to have a "master" database, with a failover as backup. Downside here is that if both machines fail, then they are dead in the water. We can't make money if the registers aren't operational. However, this is similar to what happens now (which is why we want to change it.) Having mucked with postgres replication a little in the last couple of days, I'm starting to wonder just how long it will take us to develop a good replication mechanism. If one is already in place, I'd just like to use that. Or at least learn from someone else trying to set up something similar. Thanks! Jon Brisbin ======================================= Development Programmer NPC International (620) 235-7726 -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Richard Huxton Sent: Thursday, August 12, 2004 3:32 AM To: Jon Brisbin Cc: pgSQL General Subject: Re: [GENERAL] Clustering, mirroriing, or replication? Jon Brisbin wrote: > We're trying to figure out how we can implement a reasonably simple cluster > of postgres servers on a private network at our store locations. The idea > is to have a group of 2-6 PCs each be able to share data in a replicated > manner. Having a separate database server is not an option due to the > immense cost and logistics (we have over 800 stores), so we need to find a > way to mirror or replicate data between all the nodes currently up and on > the network. Some nodes might be not always be up, so we need a failover > system. I'm testing to see if the dbmirror that comes with Postgres will > work, but I haven't got it completely figured out yet. I'm talking well out of my experience here, having never tried to set up 800 sets of desktop PCs as replicated shared DB servers, but I can't believe it's more expensive to put servers in. In my experience, even a cheap white-box dedicated server is more reliable than a PC being used for desktop apps. Assume each user kicks the power cable out of their PC once a year, that means you'll be dealing with several failovers a day on average. Assuming you can't get a server for each store (that'd be a minimum of £400 * 800 = £320,000 in the UK), is there any networking between them? -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Most replication systems add a fair amount of complexity. How reliable are current replication systems? Are they replication systems for performance or for reliability+availability? How much does it cost to make sure that the probability of both master and failover machines failing is lower or as low as the probability of the replication system failing? What would the impact be e.g. resulting downtime etc? I have seen some HA + load balancing firewall (not DB) systems which were hardly worth the hassle - seemed to have just about the same amount of downtime (if not more due to the added complexity and interactions with other stuff - e.g. buggy switches). Some people have made similar mutterings about some DB clustering solutions. Even if that's because they didn't do it right, it may be because it's difficult to do it right, and that's not good is it? Could just be a "feel good thing" that doesn't really add much to reliability for all the added effort and overhead. Does anyone know what's the most reliable platform postgresql can run on? With or without scheduled downtime? At 11:03 AM 8/12/2004 -0500, Jon Brisbin wrote: >One option is to have the POS software always write to a local database, >with that change getting replicated to all other active nodes. Another >option is to have a "master" database, with a failover as backup. Downside >here is that if both machines fail, then they are dead in the water. We >can't make money if the registers aren't operational. However, this is >similar to what happens now (which is why we want to change it.) > >Having mucked with postgres replication a little in the last couple of days, >I'm starting to wonder just how long it will take us to develop a good >replication mechanism. If one is already in place, I'd just like to use >that. Or at least learn from someone else trying to set up something >similar.
I find 7.3 and 7.4 to be rock solid on dual processor servers of Intel P4 and Xeon running RH9 and Xeon running Fedora Core 2. My systems run heavy load for about 6 hours straight each night when it is syncing data with legacy systems (disk io and cpu maxed out). I don't use replication, just nightly dump and restore to backup server. Indeed the only problems I have ever had turned out to be hardware, a bad motherboard. Terry Fielder Manager Software Development and Deployment Great Gulf Homes / Ashton Woods Homes terry@greatgulfhomes.com Fax: (416) 441-9085 > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Lincoln Yeoh > Sent: Friday, August 13, 2004 10:25 AM > To: Jon Brisbin; Richard Huxton > Cc: pgSQL General > Subject: Re: [GENERAL] Clustering, mirroriing, or replication? > > > Most replication systems add a fair amount of complexity. How > reliable are > current replication systems? Are they replication systems for > performance > or for reliability+availability? > > How much does it cost to make sure that the probability of > both master and > failover machines failing is lower or as low as the > probability of the > replication system failing? What would the impact be e.g. resulting > downtime etc? > > I have seen some HA + load balancing firewall (not DB) > systems which were > hardly worth the hassle - seemed to have just about the same > amount of > downtime (if not more due to the added complexity and > interactions with > other stuff - e.g. buggy switches). > > Some people have made similar mutterings about some DB clustering > solutions. Even if that's because they didn't do it right, it may be > because it's difficult to do it right, and that's not good is > it? Could > just be a "feel good thing" that doesn't really add much to > reliability for > all the added effort and overhead. > > Does anyone know what's the most reliable platform postgresql > can run on? > With or without scheduled downtime? > > At 11:03 AM 8/12/2004 -0500, Jon Brisbin wrote: > >One option is to have the POS software always write to a > local database, > >with that change getting replicated to all other active > nodes. Another > >option is to have a "master" database, with a failover as > backup. Downside > >here is that if both machines fail, then they are dead in > the water. We > >can't make money if the registers aren't operational. > However, this is > >similar to what happens now (which is why we want to change it.) > > > >Having mucked with postgres replication a little in the last > couple of days, > >I'm starting to wonder just how long it will take us to > develop a good > >replication mechanism. If one is already in place, I'd just > like to use > >that. Or at least learn from someone else trying to set up something > >similar. > > > ---------------------------(end of > broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index > scan if your > joining column's datatypes do not match >
> > Does anyone know what's the most reliable platform postgresql can run on? > With or without scheduled downtime? This reminded me... Not long ago I was looking at used sun servers. You can pick up refurbished sun enterprise systems for between $4,000 and $10,000 fairly easily. For instance an enterprise E3500 with 8 processors, 4gb ram, and 8 18gb fibre channel drives was only $5,000. I couldn't put together an equivalent intel system for the same price. Chris