Thread: VACUUM and 24/7 database operation
Hello,

For one of our customers, we are running a PostgreSQL database behind a
dynamic PHP-driven site. The site has a minimum of 40 visitors at any given
time and must be responsive 24 hours a day. One of the tables has 500,000
rows and is accessed very frequently (it is the table holding the basic
user information). We have no performance problem despite the large number
of updates done on this table.

The problem is with VACUUMing this table. It takes two long minutes every
day: two minutes during which no request can be served because of the lock
on the table (nearly every request involves this large table). Our customer
really dislikes this fact and is planning to replace PostgreSQL with
Oracle. Two minutes is seen by our customer as long enough for his own
customers to give up and leave his site.

Questions:
- Is 2 minutes a standard time for vacuuming a 500,000-row table?
- Can it be reduced?
- In the distant future, what problems can we run into by not vacuuming
  that table? We have already seen that after a month, some transactions
  involving "where id >= some_value" take forever, so we suppressed them.

Below are details on the table:

erp-# \d visiteurs
Table "visiteurs"
   Attribute   |     Type     |       Modifier
---------------+--------------+----------------------
 id            | integer      | not null
 login         | varchar(127) | not null
 password      | varchar(10)  | not null
 name          | varchar(10)  | not null
 datecrea      | timestamp    | not null
 payszoneid    | varchar(127) | not null
 ptzoneid      | varchar(127) | not null
 dialertitle   | varchar(15)  |
 referer       | varchar(255) |
 exported      | varchar(2)   | not null default 'N'
 earncentmin   | float8       |
 opearncentmin | float8       |
 ret           | float8       |
 paymentid     | integer      |
 entiteid      | varchar(127) | not null
 etat          | varchar(2)   | default 'E'
 devise        | smallint     |
 entitelogin   | varchar(20)  |
Indices: visiterus_etat,
         visiteurs_exported,
         visiteurs_id_btree,
         visiteurs_login

erp=# select relname,relpages,reltuples from pg_class where relname='visiteurs';
  relname  | relpages | reltuples
-----------+----------+-----------
 visiteurs |    14549 |    584489
(1 row)

Thank you.

Thomas FAVIER
thomas.favier@accelance.fr
______________________________________________________
ACCELANCE - www.accelance.fr
97, rue Racine - 69100 Villeurbanne
Tel: +33 (0)4 37 43 12 22 / Fax: +33 (0)4 37 43 12 20
______________________________________________________
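For reference, a minimal sketch of how a nightly vacuum of just this one
table is often scheduled; the time of day, user, and log path below are
assumptions, and only the database and table names come from the post
above:

    # crontab entry for the postgres user (04:15 is an assumed quiet hour)
    # vacuum only the hot table and keep a log of the output
    15 4 * * *  psql -d erp -c 'VACUUM ANALYZE visiteurs;' >> /var/log/pgsql/vacuum_visiteurs.log 2>&1
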
> For one of our customers, we are running a PostgreSQL database behind a
> dynamic PHP-driven site. The site has a minimum of 40 visitors at any
> given time and must be responsive 24 hours a day.

And from the bandwidth and hit logs, they cannot determine a time of day
when there are hardly any hits? Possible, but it might be worth a
double-check.

> The problem is with VACUUMing this table. It takes two long minutes every
> day: two minutes during which no request can be served because of the
> lock on the table (nearly every request involves this large table). Our
> customer really dislikes this fact and is planning to replace PostgreSQL
> with Oracle.

If they are sufficiently desperate over 2 minutes per day, which is, BTW,
less than 0.14% of the time, to want to replace it with Oracle at the full
cost of it (and last time I checked, Oracle's costs for a full server setup
were in five figures, just for the software), then I'd say let them. It's
their money they are throwing away. I am assuming here that they have
checked and confirmed that Oracle would not suffer from a similar problem?

> Two minutes is seen by our customer as long enough for his own customers
> to give up and leave his site.

That would be the 0.14%, would it? What sort of service are they running on
it?

> - Is 2 minutes a standard time for vacuuming a 500,000-row table?

I'd say that is pretty good. I have a 30M-row table with 30K-100K
inserts/updates per day (not including selects here), and I'm quite happy
if I can get it to vacuum in less than 15 minutes on a P2/400.

> - Can it be reduced?

I guess. If you run your database on an ultra-fast RAID stripe, and put
more memory and a faster processor in the server, you could probably
reduce it.

Have you considered running two servers? If you set up two replicated
servers, then you could run everything off one server. At a convenient time
of day, when there's minimum load, you could vacuum the backup one, wait
for the queued replicated queries to be executed, and then fail over to it.
Then you can vacuum the primary server and let it wait as the fail-over
server until the next time it's time to vacuum the database. The fail-over
should take a matter of seconds (fractions of seconds?), depending on how
you do it. A second server is likely to cost them less than a full Oracle
licence...

Incidentally, how do they handle backups? I am not sure how well a backup
of the database will work out with it still running, and records being
inserted during the backup. Running two servers will also get you around
this problem, in the same way, because you can shut down a database while
you back it up without losing any uptime or data.

> - In the distant future, what problems can we run into by not vacuuming
> that table? We have already seen that after a month, some transactions
> involving "where id >= some_value" take forever, so we suppressed them.

Performance on inserts and updates will degrade, especially if there are
lots of deletes as well. Basically, the tables will get fragmented, because
deletes, AFAIK, are done lazily, so the tables aren't cleaned out until you
vacuum them. This also tends to mess up index timings, because there are a
lot of dead records in them. Eventually, it will grind to a halt. Depending
on what your performance requirements are, you could do a vacuum once per
week, perhaps?

Regards.

Gordon
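On the backup question raised above: as far as I know, pg_dump can be run
against the live server and reads its data within a single transaction, so
the dump comes out internally consistent without taking the database down.
A minimal sketch, where only the database name comes from the original post
and the schedule and paths are assumptions:

    # nightly logical backup of the running database (time and paths are assumed)
    30 3 * * *  pg_dump erp | gzip > /backup/erp-`date +\%Y\%m\%d`.sql.gz
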
Thomas.Favier@accelance.fr writes:
> - Is 2 minutes a standard time for vacuuming a 500,000-row table?
> - Can it be reduced?
> - In the distant future, what problems can we run into by not vacuuming
> that table? We have already seen that after a month, some transactions
> involving "where id >= some_value" take forever, so we suppressed them.

If it takes a month before query performance gets bad, then perhaps you
could vacuum the table only once a month. However, that vacuum would
probably take longer than two minutes, so it's a tradeoff...

We have plans for 7.2 to reduce the need for periodic vacuums, but that
won't help you much now.

There are patches available for a "lazy vacuum" process on 7.0.3, which can
be a win if vacuum only needs to get rid of a few rows. But they're not
very thoroughly tested IMHO. See http://people.freebsd.org/~alfred/vacfix/

			regards, tom lane
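To put numbers on that trade-off before changing anything, one can simply
time the vacuum and compare the table's page count before and after; a
rough sketch, where only the database and table names come from the
original post:

    #!/bin/sh
    # How long does the vacuum take, and how much space does it give back?
    Q="select relname, relpages, reltuples from pg_class where relname='visiteurs';"

    psql -d erp -c "$Q"                                      # size before
    time psql -d erp -c 'VACUUM VERBOSE ANALYZE visiteurs;'  # the vacuum itself
    psql -d erp -c "$Q"                                      # size after
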
Tom,

Shouldn't it be possible to build vacuum as an ongoing internal PG process,
instead of a separate operation? How does Oracle bypass this? There must be
some way that can be implemented.

Any pointers to further reading to brush up my theory in this regard,
please?

IAC, regarding the actual inquiry, wouldn't a replicated database on a
second server be cheaper than Oracle, if the party is satisfied with PG
performance? I browsed a PG commercial organization's site that mentioned a
Replication Server being available for PG. I am about to look into that
next month. Is it as good as PG itself? It would provide failover too,
rather than using Oracle.

With best regards.
Sanjay.

At 05:53 PM 1/23/01, Tom Lane wrote:
>Thomas.Favier@accelance.fr writes:
>> - Is 2 minutes a standard time for vacuuming a 500,000-row table?
>> - Can it be reduced?
>> - In the distant future, what problems can we run into by not vacuuming
>> that table? We have already seen that after a month, some transactions
>> involving "where id >= some_value" take forever, so we suppressed them.
>
>If it takes a month before query performance gets bad, then perhaps you
>could vacuum the table only once a month. However, that vacuum would
>probably take longer than two minutes, so it's a tradeoff...
>
>We have plans for 7.2 to reduce the need for periodic vacuums, but that
>won't help you much now.
>
>There are patches available for a "lazy vacuum" process on 7.0.3,
>which can be a win if vacuum only needs to get rid of a few rows.
>But they're not very thoroughly tested IMHO. See
>http://people.freebsd.org/~alfred/vacfix/
>
>			regards, tom lane
> With best regards.
> Sanjay.
>
> At 05:53 PM 1/23/01, Tom Lane wrote:
> >Thomas.Favier@accelance.fr writes:
> >> - Is 2 minutes a standard time for vacuuming a 500,000-row table?
> >> - Can it be reduced?
> >> - In the distant future, what problems can we run into by not vacuuming
> >> that table? We have already seen that after a month, some transactions
> >> involving "where id >= some_value" take forever, so we suppressed them.
> >
> >If it takes a month before query performance gets bad, then perhaps you
> >could vacuum the table only once a month. However, that vacuum would
> >probably take longer than two minutes, so it's a tradeoff...
> >
> >We have plans for 7.2 to reduce the need for periodic vacuums, but that
> >won't help you much now.
> >
> >There are patches available for a "lazy vacuum" process on 7.0.3,
> >which can be a win if vacuum only needs to get rid of a few rows.
> >But they're not very thoroughly tested IMHO. See
> >http://people.freebsd.org/~alfred/vacfix/

We've been running them since I released them, with only a single problem
that has never resurfaced. I would say they are pretty stable.

It's not "just a few rows", by the way: for vacuums clearing several
thousand and up to probably 50,000 rows, we get about a 20-40x speedup in
the time taken to vacuum (from 10-15 minutes down to 13-40 seconds). This
is on tables that are over 300 megabytes, with indexes that are even larger
(multiple-column indexes). It's a shame this still hasn't made it into
7.1 :(

* Sanjay Arora <sk@pobox.com> [010123 12:10] wrote:
> Tom,
>
> Shouldn't it be possible to build vacuum as an ongoing internal PG
> process, instead of a separate operation? How does Oracle bypass this?
> There must be some way that can be implemented.
>
> Any pointers to further reading to brush up my theory in this regard,
> please?

Follow the long trail of my messages on the lists about it; I'd say about
1/3 of my posts have to do with the problems we were facing before
contracting Vadim to do the patches available at:
http://people.freebsd.org/~alfred/vacfix/

> IAC, regarding the actual inquiry, wouldn't a replicated database on a
> second server be cheaper than Oracle, if the party is satisfied with PG
> performance? I browsed a PG commercial organization's site that mentioned
> a Replication Server being available for PG. I am about to look into that
> next month. Is it as good as PG itself? It would provide failover too,
> rather than using Oracle.

It should, but I haven't read up on it much.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
> Shouldn't it be possible to build vacuum as an ongoing internal PG
> process, instead of a separate operation? How does Oracle bypass this?
> There must be some way that can be implemented.

Well, here's what it comes down to: Do you want updates to happen quickly,
and vacuum when load is low, or do you want updates to be slow all the
time? I suppose that there are some sites that can't find two minutes per
day when updates will block (not selects), but I imagine they're very few.

steve
* Steve Wolfe <steve@iboats.com> [010123 13:11] wrote:
> > Shouldn't it be possible to build vacuum as an ongoing internal PG
> > process, instead of a separate operation? How does Oracle bypass this?
> > There must be some way that can be implemented.
>
> Well, here's what it comes down to: Do you want updates to happen
> quickly, and vacuum when load is low, or do you want updates to be slow
> all the time? I suppose that there are some sites that can't find two
> minutes per day when updates will block (not selects), but I imagine
> they're very few.

There are some sites where going for more than an hour without a VACUUM
makes response times unacceptable, and each vacuum can take 20 minutes a
run. Yes, I'm _an_ exception, but I know I'm not the only one out there.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
At 1/23/01 4:20 PM, Alfred Perlstein wrote:
>* Steve Wolfe <steve@iboats.com> [010123 13:11] wrote:
>> Well, here's what it comes down to: Do you want updates to happen
>> quickly, and vacuum when load is low, or do you want updates to be slow
>> all the time? I suppose that there are some sites that can't find two
>> minutes per day when updates will block (not selects), but I imagine
>> they're very few.
>
>There are some sites where going for more than an hour without a
>VACUUM makes response times unacceptable, and each vacuum can take
>20 minutes a run.

In addition, the suggestion that vacuum isn't a problem because it only
takes two minutes is misleading.

Sure, it's only 2 minutes out of 24 hours. However, any given visitor isn't
at my site for 24 hours. If she arrives just as I'm starting the vacuum,
and it takes her less than two minutes to give up and go somewhere else,
the site was unavailable 100% of the time as far as she's concerned.

If your site is down two minutes a day, and you have 14,000 unique visitors
a week, each of whom requires a database update, that's 20 people a week
for whom the site isn't working when they arrive.

-- 
Robert L Mathews, Tiger Technologies
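For what it's worth, the arithmetic behind that "20 people a week" figure,
assuming visitors arrive uniformly around the clock:

    2 minutes/day x 7 days            = 14 minutes of blocking per week
    14,000 visitors / 10,080 min/week = ~1.4 visitors arriving per minute
    14 minutes x 1.4 visitors/minute  = ~20 affected visitors per week
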
I remember the days way back when America Online (that's AOL to you
youngsters) ran under GEOWORKS and had just come out for Windows 3.1. We
used to pay for every minute we were online over a certain number of hours.
There were different pricing packages, but I always took the one that made
the individual minutes the least expensive, and that was 6 cents per
minute.

Even way back then, when every minute cost AOL tons and tons and tons of
money, they would go down about once every two weeks for either three,
four, or five hours for maintenance. I remember it quite well. A message
would come on the screen at either 2am, 3am, or 4am (it varied from time to
time) saying that they were going down until 7am. It was always until 7am
(that was the only thing that was predictable about every instance).

They were stupid for doing business that way. They probably lost customers.
Maybe even 20 a week. But wait. Even though they operated that way, AOL is
now a formidable media and communications powerhouse.

The lesson is that you gotta do what you gotta do. Even those with the word
"Online" right in their name would go offline every so often, regardless of
their destiny to become a formidable media and communications powerhouse.
If you have to vacuum for 2 minutes a day, then vacuum for 2 minutes a day.
When you're a formidable media and communications powerhouse, then you can
hire someone to program PostgreSQL to do what you want it to do. Of course,
if you're a formidable media and communications powerhouse, you'll probably
switch to Oracle and make LE richer instead of giving some poor sap a job,
but that's business, too.

Doug

At 05:13 PM 1/23/01 -0800, Tiger Technologies wrote:
>At 1/23/01 4:20 PM, Alfred Perlstein wrote:
>
>>* Steve Wolfe <steve@iboats.com> [010123 13:11] wrote:
>>> Well, here's what it comes down to: Do you want updates to happen
>>> quickly, and vacuum when load is low, or do you want updates to be slow
>>> all the time? I suppose that there are some sites that can't find two
>>> minutes per day when updates will block (not selects), but I imagine
>>> they're very few.
>>
>>There are some sites where going for more than an hour without a
>>VACUUM makes response times unacceptable, and each vacuum can take
>>20 minutes a run.
>
>In addition, the suggestion that vacuum isn't a problem because it only
>takes two minutes is misleading.
>
>Sure, it's only 2 minutes out of 24 hours. However, any given visitor
>isn't at my site for 24 hours. If she arrives just as I'm starting the
>vacuum, and it takes her less than two minutes to give up and go
>somewhere else, the site was unavailable 100% of the time as far as she's
>concerned.
>
>If your site is down two minutes a day, and you have 14,000 unique
>visitors a week, each of whom requires a database update, that's 20
>people a week for whom the site isn't working when they arrive.
>
>--
>Robert L Mathews, Tiger Technologies
Tiger Technologies wrote:
>If your site is down two minutes a day, and you have 14,000 unique
>visitors a week, each of whom requires a database update, that's 20
>people a week for whom the site isn't working when they arrive.

Since vacuuming is necessary, one solution is to tell the user exactly what
is happening. People get very fed up with unresponsive servers, but are
much more ready to accept things if they know what is happening.

Just before you start the vacuum, redirect queries to a page that says:

"Started database maintenance at <time>; this process normally takes about
two minutes; time taken so far <mm:ss>. While you are waiting, here are
some nice photos to look at."

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                                 http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47 6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
                 ========================================
     "If anyone has material possessions and sees his brother
      in need but has no pity on him, how can the love of God
      be in him?"                              I John 3:17
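A minimal sketch of how that hand-off could be wired up, assuming the PHP
pages check for a flag file and show the maintenance page (reading the
start time from the file) while it exists; the flag path is made up here,
and only the database and table names come from the original post:

    #!/bin/sh
    # Signal the front end, run the vacuum, then resume normal service.
    FLAG=/var/www/maintenance.flag

    date > "$FLAG"                              # front end reads the start time from this file
    psql -d erp -c 'VACUUM ANALYZE visiteurs;'  # the two-minute vacuum
    rm -f "$FLAG"                               # maintenance page disappears again
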
Any type of commerce site, or site that tracks user profiles... that's a
LOT of sites that would have a need for an enterprise-level DB. Remember,
the goal is five nines of uptime. :)

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com

----- Original Message -----
From: "Steve Wolfe" <steve@iboats.com>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, January 23, 2001 3:52 PM
Subject: Re: [GENERAL] Re: VACUUM and 24/7 database operation

> > Shouldn't it be possible to build vacuum as an ongoing internal PG
> > process, instead of a separate operation? How does Oracle bypass this?
> > There must be some way that can be implemented.
>
> Well, here's what it comes down to: Do you want updates to happen
> quickly, and vacuum when load is low, or do you want updates to be slow
> all the time? I suppose that there are some sites that can't find two
> minutes per day when updates will block (not selects), but I imagine
> they're very few.
>
> steve
> Well, here's what it comes down to: Do you want updates to happen
> quickly, and vacuum when load is low, or do you want updates to be slow
> all the time? I suppose that there are some sites that can't find two
> minutes per day when updates will block (not selects), but I imagine
> they're very few.

I think the problem is that there are log and audit tables that are
generally heavy on inserts, with periodic cleanups after backing the data
up to some offline or other storage. These are always being updated and are
only 'select'ed when researching a customer complaint, a break-in, general
troubleshooting, stats reports, etc.

David
> In addition, the suggestion that vacuum isn't a problem because it only
> takes two minutes is misleading.
>
> Sure, it's only 2 minutes out of 24 hours. However, any given visitor
> isn't at my site for 24 hours. If she arrives just as I'm starting the
> vacuum, and it takes her less than two minutes to give up and go
> somewhere else, the site was unavailable 100% of the time as far as she's
> concerned.

Well, from what I've seen, vacuums only block updates/inserts, not selects.
So the reality of it will be different depending on your traffic. If you
run the kind of site where people are routinely signing up for your
services at 4 in the morning, then it's a big deal for you.

> If your site is down two minutes a day, and you have 14,000 unique
> visitors a week, each of whom requires a database update, that's 20
> people a week for whom the site isn't working when they arrive.

Only if, by some fluke, you have people arriving on an evenly spaced
schedule every minute of the day and night. In reality, you'll do 5 to 10
times that many (or more) during peak periods, and then in the off-hours
you'll have far, far fewer than that; sometimes 1/100th or less (again,
depending on your customers). That's when you schedule the vacuum.

Vacuuming isn't perfect. It would be nice if we could avoid vacuuming
altogether without sacrificing update speed, but for the time being, I
don't think that's going to happen. So we schedule the vacuums when they do
the least damage, and (for us, at least) that damage is negligible.

steve