Re: VACUUM and 24/7 database operation - Mailing list pgsql-general
From | Gordan Bobic |
---|---|
Subject | Re: VACUUM and 24/7 database operation |
Date | |
Msg-id | 001901c08562$0296cb60$8000000a@localdomain Whole thread Raw |
In response to | VACUUM and 24/7 database operation (Thomas.Favier@accelance.fr) |
List | pgsql-general |
> For one of our customer, we are running a PostgreSQL database on a > dynamic PHP-driven site. This site has a minimum of 40 visitors at a > time and must be responsive 24h a day. And from the bandwidth and hit logs, they cannot determine a time of day when there are hardly any hits? Possible, but it might be worth a double-check. > The problem is with VACUUMing this table. It takes 2 long minutes > everyday. Two minutes during wich no request can be done because of the > lock on the table... (nearly every request is involving this large > table). Our customer really dislike this fact and is planning to > replace PostgreSQL with Oracle. If they are sufficiently desperate over 2 minutes per day, which is, BTW, less than 0.14% of the time, to want to replace it with Oracle at the full cost of it (and last time I checked, Oracle in full server setup costs were in 5 figures, just for the software), then I'd say let them. It's their money they are throwing away. I am assuming here that they have checked and confirmed that Oracle would no suffer a similar problem? > 2 minutes is seen by our customer as sufficent for his customer to > get away from his site. That would be the 0.14%, would it? What sort of service are they running on it? > - Is 2 minutes a standard time for vacuuming a 500.000 rows table ? I'd say that is pretty good. I have a 30M row table with 30K-100K inserts/updates per day (not including selects here), and I'm quite happy if I can get it to vacuum in less than 15 minutes on a P2/400. > - Can it be reduced ? I guess. If you run your database on am ultra-fast RAID stripe, put more memory and a faster processor in the server, you could probably reduce it. Have you considered running two servers? If you set up two replicated servers, then you could run everything off one server. At a convenient time of day, when there's minimum load, you could swap vacuum the backup one, wait for the queued replicated queries to be executed, and then fail them over. Then, you can vacuum the primary server, and make it wait as the fail-over server until next time it's time to vacuum the database. The fail-over should take a matter of seconds (fractions of seconds?), depending on how you do it. A second server is likely to cost them less than a full Oracle licence... Incidentally, how do they handle backups? I am not sure how well a backup of the database will work out with it still running, and records being inserted during the backup. Running two servers will also get you around this problem, in the same way, because you can shut down a database while you back it up without loosing any uptime or data. > - In a far future, what are the problems we can run into not vacuuming > that table ? We have already seen that after a month, some transactions > involving where id >= some_value take forever, so we supressed them. Performance on inserts and updates will degrade, especially if there are lots of deletes as well. Basically, the tables will get fragmented, because deletes, AFAIK, do lazy deletion, so the tables aren't cleaned out until you vacuum them. This also tends to mess up the index timings because there is a lot of dead records in them. Eventually, it will grind to a halt. Depending on what your performance and requirements are, you could do a vacuum once per week, perhaps? Regards. Gordon
pgsql-general by date: