Thread: Autovacuum / full vacuum
hi, I'm curious as to why autovacuum is not designed to do full vacuum. I know that the necessity of doing full vacuums can be reduced by increasing the FSM, but in my opinion that is the wrong decision for many applications. My application does not continuously insert/update/delete tuples at a constant rate. Basically there are long periods of relatively few modifications and short burst of high activity. Increasing the FSM so that even during these bursts most space would be reused would mean to reduce the available memory for all other database tasks. So my question is: What's the use of an autovacuum daemon if I still have to use a cron job to do full vacuums? wouldn't it just be a minor job to enhance autovacuum to be able to perform full vacuums, if one really wants it to do that - even if some developers think that it's the wrong approach? Mike
> So my question is: What's the use of an autovacuum daemon if I still > have to use a cron job to do full vacuums? wouldn't it just be a minor > job to enhance autovacuum to be able to perform full vacuums, if one > really wants it to do that - even if some developers think that it's the > wrong approach? You should never have to do full vacuums... Chris
Hi, did you read my post? In the first part I explained why I don't want to increase the FSM that much. Mike >> So my question is: What's the use of an autovacuum daemon if I still >> have to use a cron job to do full vacuums? wouldn't it just be a minor >> job to enhance autovacuum to be able to perform full vacuums, if one >> really wants it to do that - even if some developers think that it's >> the wrong approach? > > You should never have to do full vacuums... > > Chris > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
>> You should never have to do full vacuums... I would rather say, You should never have to do full vacuums by any periodic means. It may be done on a adhoc basis, when you have figured out that your table is never going to grow that big again. On 1/17/06, Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote: > > So my question is: What's the use of an autovacuum daemon if I still > > have to use a cron job to do full vacuums? wouldn't it just be a minor > > job to enhance autovacuum to be able to perform full vacuums, if one > > really wants it to do that - even if some developers think that it's the > > wrong approach? > > You should never have to do full vacuums... > > Chris > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
On Tue, Jan 17, 2006 at 11:33:02AM +0100, Michael Riess wrote: >did you read my post? In the first part I explained why I don't want to >increase the FSM that much. Since you didn't quantify it, that wasn't much of a data point. (IOW, you'd generally have to be seriously resource constrained before the FSM would be a significant source of memory consumption--in which case more RAM would probably be a much better solution than screwing with autovacuum.) Mike Stone
Michael Riess wrote: > hi, > > I'm curious as to why autovacuum is not designed to do full vacuum. Because a VACUUM FULL is too invasive. Lazy vacuum is so light on the system w.r.t. locks that it's generally not a problem to start one at any time. On the contrary, vacuum full could be a disaster on some situations. What's more, in general a lazy vacuum is enough to keep the dead space within manageability, given a good autovacuum configuration and good FSM configuration, so there's mostly no need for full vacuum. (This is the theory at least.) For the situations where there is a need, we tell you to issue it manually. > So my question is: What's the use of an autovacuum daemon if I still > have to use a cron job to do full vacuums? wouldn't it just be a minor > job to enhance autovacuum to be able to perform full vacuums, if one > really wants it to do that - even if some developers think that it's the > wrong approach? Yes, it is a minor job to "enhance" it to perform vacuum full. The problem is having a good approach to determining _when_ to issue a full vacuum, and having a way to completely disallow it. If you want to do the development work, be my guest (but let us know your design first). If you don't, I guess you would have to wait until it comes high enough on someone's to-do list, maybe because you convinced him (or her, but we don't have Postgres-ladies at the moment AFAIK) monetarily or something. You can, of course, produce a patch and use it internally. This is free software, remember. -- Alvaro Herrera Developer, http://www.PostgreSQL.org "God is real, unless declared as int"
> I'm curious as to why autovacuum is not designed to do full vacuum. Because that's terribly invasive due to the locks it takes out. Lazy vacuum may chew some I/O, but it does *not* block your application for the duration. VACUUM FULL blocks the application. That is NOT something that anyone wants to throw into the "activity mix" randomly. -- let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];; http://linuxdatabases.info/info/slony.html Signs of a Klingon Programmer #11: "This machine is a piece of GAGH! I need dual Pentium processors if I am to do battle with this code!"
> VACUUM FULL blocks the application. That is NOT something that anyone > wants to throw into the "activity mix" randomly. There must be a way to implement a daemon which frees up space of a relation without blocking it too long. It could abort after a certain number of blocks have been freed and then move to the next relation.
Michael Riess wrote: > did you read my post? In the first part I explained why I don't want to > increase the FSM that much. I'm sure he did, but just because you don't have enough FSM space to capture all everything from your "burst", that doesn't mean that space can't be reclaimed. The next time a regular vacuum is run, it will once again try to fill the FSM with any remaining free space it finds in the table. What normally happens is that your table will never bee 100% free of dead space, normally it will settle at some steady state size that is small percentage bigger than the table will be after a full vacuum. As long as that percentage is small enough, the effect on performance is negligible. Have you measured to see if things are truly faster after a VACUUM FULL? Matt
Hi, Matthew, Matthew T. O'Connor wrote: > I'm sure he did, but just because you don't have enough FSM space to > capture all everything from your "burst", that doesn't mean that space > can't be reclaimed. The next time a regular vacuum is run, it will once > again try to fill the FSM with any remaining free space it finds in the > table. What normally happens is that your table will never bee 100% > free of dead space, normally it will settle at some steady state size > that is small percentage bigger than the table will be after a full > vacuum. As long as that percentage is small enough, the effect on > performance is negligible. This will work if you've a steady stream of inserts / updates, but not if you happen to have update bulks that exhaust the FSM capacity. The update first fills up all the FSM, and then allocates new pages for the rest. Then VACUUM comes and refills the FSM, however, the FSM does not contain enough free space for the next large bulk update. The same is for deletes and large bulk inserts, btw. So your table keeps growing steadily, until VACUUM FULL or CLUSTER comes along to clean up the mess. Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
Hi, yes, some heavily used tables contain approx. 90% free space after a week. I'll try to increase FSM even more, but I think that I will still have to run a full vacuum every week. Prior to 8.1 I was using 7.4 and ran a full vacuum every day, so the autovacuum has helped a lot. But actually I never understood why the database system slows down at all when there is much unused space in the files. Are the unused pages cached by the system, or is there another reason for the impact on the performance? Mike > Have you measured to see if things are truly > faster after a VACUUM FULL? > > Matt > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Michael Riess wrote: > Hi, > > yes, some heavily used tables contain approx. 90% free space after a > week. I'll try to increase FSM even more, but I think that I will > still have to run a full vacuum every week. Prior to 8.1 I was using > 7.4 and ran a full vacuum every day, so the autovacuum has helped a > lot. > > But actually I never understood why the database system slows down at > all when there is much unused space in the files. Are the unused pages > cached by the system, or is there another reason for the impact on the > performance? The reason is that the system needs to LOOK at the pages/tuples to see if the tuples are dead or not. So, the number of dead tuples impacts the scans. LER > > Mike -- Larry Rosenman Database Support Engineer PERVASIVE SOFTWARE. INC. 12365B RIATA TRACE PKWY 3015 AUSTIN TX 78727-6531 Tel: 512.231.6173 Fax: 512.231.6597 Email: Larry.Rosenman@pervasive.com Web: www.pervasive.com
> did you read my post? In the first part I explained why I don't want > to increase the FSM that much. No, you didn't. You explained *that* you thought you didn't want to increase the FSM. You didn't explain why. FSM expansion comes fairly cheap, and tends to be an effective way of eliminating the need for VACUUM FULL. That is generally considered to be a good tradeoff. In future versions, there is likely to be more of this sort of thing; for instance, on the ToDo list is a "Vacuum Space Map" that would collect page IDs that need vacuuming so that PostgreSQL could do "quicker" vacuums... -- output = reverse("moc.liamg" "@" "enworbbc") http://cbbrowne.com/info/internet.html Given recent events in Florida, the tourism board in Texas has developed a new advertising campaign based on the slogan "Ya'll come to Texas, where we ain't shot a tourist in a car since November 1963."
Well, I think that the documentation is not exactly easy to understand. I always wondered why there are no examples for common postgresql configurations. All I know is that the default configuration seems to be too low for production use. And while running postgres I get no hints as to which setting needs to be increased to improve performance. I have no chance to see if my FSM settings are too low other than to run vacuum full verbose in psql, pipe the result to a text file and grep for some words to get a somewhat comprehensive idea of how much unused space there is in my system. Don't get me wrong - I really like PostgreSQL and it works well in my application. But somehow I feel that it might run much better ... about the FSM: You say that increasing the FSM is fairly cheap - how should I know that? >> did you read my post? In the first part I explained why I don't want >> to increase the FSM that much. > > No, you didn't. You explained *that* you thought you didn't want to > increase the FSM. You didn't explain why. > > FSM expansion comes fairly cheap ...
On Tue, Jan 17, 2006 at 11:18:59AM +0100, Michael Riess wrote: > hi, > > I'm curious as to why autovacuum is not designed to do full vacuum. I Because nothing that runs automatically should ever take an exclusive lock on the entire database, which is what VACUUM FULL does. > activity. Increasing the FSM so that even during these bursts most space > would be reused would mean to reduce the available memory for all > other database tasks. I don't believe the hit is enough that you should even notice it. You'd have to post some pretty incredible use cases to show that the tiny loss of memory to FSM is worth (a) an exclusive lock and (b) the loss of efficiency you get from having some preallocated pages in tables. A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Hi, >> hi, >> >> I'm curious as to why autovacuum is not designed to do full vacuum. I > > Because nothing that runs automatically should ever take an exclusive > lock on the entire database, which is what VACUUM FULL does. I thought that vacuum full only locks the table which it currently operates on? I'm pretty sure that once a table has been vacuumed, it can be accessed without any restrictions while the vacuum process works on the next table. > >> activity. Increasing the FSM so that even during these bursts most space >> would be reused would mean to reduce the available memory for all >> other database tasks. > > I don't believe the hit is enough that you should even notice it. > You'd have to post some pretty incredible use cases to show that the > tiny loss of memory to FSM is worth (a) an exclusive lock and (b) the > loss of efficiency you get from having some preallocated pages in > tables. I have 5000 tables and a workstation with 1 GB RAM which hosts an Apache Web Server, Tomcat Servlet Container and PostgreSQL. RAM is not something that I have plenty of ... and the hardware is fixed and cannot be changed.
On Tue, Jan 17, 2006 at 03:50:38PM +0100, Michael Riess wrote: >about the FSM: You say that increasing the FSM is fairly cheap - how >should I know that? Why would you assume otherwise, to the point of not considering changing the setting? The documentation explains how much memory is used for FSM entries. If you look at vacuum verbose output it will tell you how much memory you're currently using for the FSM. Mike Stone
On Tue, Jan 17, 2006 at 03:50:38PM +0100, Michael Riess wrote: > always wondered why there are no examples for common postgresql > configurations. You mean like this one? (for 8.0): <http://www.powerpostgresql.com/Downloads/annotated_conf_80.html> > All I know is that the default configuration seems to be > too low for production use. Define "production use". It may be too low for you. > chance to see if my FSM settings are too low other than to run vacuum > full verbose in psql, pipe the result to a text file and grep for some Not true. You don't need a FULL on there to figure this out. > about the FSM: You say that increasing the FSM is fairly cheap - how > should I know that? Do the math. The docs say this: --snip--- max_fsm_pages (integer) Sets the maximum number of disk pages for which free space will be tracked in the shared free-space map. Six bytes of shared memory are consumed for each page slot. This setting must be more than 16 * max_fsm_relations. The default is 20000. This option can only be set at server start. max_fsm_relations (integer) Sets the maximum number of relations (tables and indexes) for which free space will be tracked in the shared free-space map. Roughly seventy bytes of shared memory are consumed for each slot. The default is 1000. This option can only be set at server start. ---snip--- So by default, you have 6 B * 20,000 = 120,000 bytes for the FSM pages. By default, you have 70 B * 1,000 = 70,000 bytes for the FSM relations. Now, there are two knobs. One of them tracks the number of relations. How many relations do you have? Count the number of indexes and tables you have, and give yourself some headroom in case you add some more, and poof, you have your number for the relations. Now all you need to do is figure out what your churn rate is on tables, and count up how many disk pages that's likely to be. Give yourself a little headroom, and the number of FSM pages is done, too. This churn rate is often tough to estimate, though, so you may have to fiddle with it from time to time. A -- Andrew Sullivan | ajs@crankycanuck.ca I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin
On Tue, Jan 17, 2006 at 09:09:02AM -0500, Matthew T. O'Connor wrote: > vacuum. As long as that percentage is small enough, the effect on > performance is negligible. Have you measured to see if things are truly Actually, as long as the percentage is small enough and the pages are really empty, the performance effect is positive. If you have VACUUM FULLed table, inserts have to extend the table before inserting, whereas in a table with some space reclaimed, the I/O effect of having to allocate another disk page is already done. A -- Andrew Sullivan | ajs@crankycanuck.ca When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes
On 1/17/06, Michael Riess <mlriess@gmx.de> wrote:
comment from original postgresql.conf file seems pretty obvious:
#max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000 # min 100, ~70 bytes each
basically setting max_fsm_pages to 1000000 consumes 6 megabytes. and i definitelly doubt you will ever hit that high.
depesz
about the FSM: You say that increasing the FSM is fairly cheap - how
should I know that?
comment from original postgresql.conf file seems pretty obvious:
#max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000 # min 100, ~70 bytes each
basically setting max_fsm_pages to 1000000 consumes 6 megabytes. and i definitelly doubt you will ever hit that high.
depesz
Michael Riess <mlriess@gmx.de> writes: > I'm curious as to why autovacuum is not designed to do full vacuum. Locking considerations. VACUUM FULL takes an exclusive lock, which blocks any foreground transactions that want to touch the table --- so it's really not the sort of thing you want being launched at unpredictable times. regards, tom lane
On Tue, Jan 17, 2006 at 03:05:29PM +0100, Michael Riess wrote: > There must be a way to implement a daemon which frees up space of a > relation without blocking it too long. Define "too long". If I have a table that needs to respond to a SELECT in 50ms, I don't have time for you to lock my table. If this were such an easy thing to do, don't you think the folks who came up wit the ingenious lazy vacuum system would have done it? Remember, a vacuum full must completely lock the table, because it is physically moving bits around on the disk. So a SELECT can't happen at the same time, because the bits might move out from under the SELECT while it's running. Concurrency is hard, and race conditions are easy, to implement. A -- Andrew Sullivan | ajs@crankycanuck.ca A certain description of men are for getting out of debt, yet are against all taxes for raising money to pay it off. --Alexander Hamilton
On Tue, Jan 17, 2006 at 04:04:41PM +0100, Michael Riess wrote: > > I thought that vacuum full only locks the table which it currently > operates on? I'm pretty sure that once a table has been vacuumed, it can > be accessed without any restrictions while the vacuum process works on > the next table. Yes, I think the way I phrased it was unfortunate. But if you issue VACUUM FULL you'll get an exclusive lock on everything, although not all at the same time. But of course, if your query load is like this BEGIN; SELECT from t1, t2 where t1.col1 = t2.col2; [application logic] UPDATE t3 . . . COMMIT; you'll find yourself blocked in the first statement on both t1 and t2; and then on t3 as well. You sure don't want that to happen automagically, in the middle of your business day. > I have 5000 tables and a workstation with 1 GB RAM which hosts an Apache > Web Server, Tomcat Servlet Container and PostgreSQL. RAM is not > something that I have plenty of ... and the hardware is fixed and cannot > be changed. I see. Well, I humbly submit that your problem is not the design of the PostgreSQL server, then. "The hardware is fixed and cannot be changed," is the first optimisation I'd make. Heck, I gave away a box to charity only two weeks ago that would solve your problem better than automatically issuing VACUUM FULL. A -- Andrew Sullivan | ajs@crankycanuck.ca Information security isn't a technological problem. It's an economics problem. --Bruce Schneier
Michael Riess <mlriess@gmx.de> writes: > But actually I never understood why the database system slows down at > all when there is much unused space in the files. Perhaps some of your common queries are doing sequential scans? Those would visit the empty pages as well as the full ones. regards, tom lane
On Tue, 2006-01-17 at 09:08, Andrew Sullivan wrote: > On Tue, Jan 17, 2006 at 03:50:38PM +0100, Michael Riess wrote: > > always wondered why there are no examples for common postgresql > > configurations. > > You mean like this one? (for 8.0): > > <http://www.powerpostgresql.com/Downloads/annotated_conf_80.html> I have to admit, looking at the documentation, that we really don't explain this all that well in the administration section, and I can see how easily led astray beginners are. I think it's time I joined the pgsql-docs mailing list...
ajs@crankycanuck.ca (Andrew Sullivan) writes: > On Tue, Jan 17, 2006 at 11:18:59AM +0100, Michael Riess wrote: >> hi, >> >> I'm curious as to why autovacuum is not designed to do full vacuum. I > > Because nothing that runs automatically should ever take an exclusive > lock on the entire database, which is what VACUUM FULL does. That's a bit more than what autovacuum would probably do... autovacuum does things table by table, so that what would be locked should just be one table. Even so, I'd not be keen on having anything that runs automatically take an exclusive lock on even as much as a table. >> activity. Increasing the FSM so that even during these bursts most >> space would be reused would mean to reduce the available memory for >> all other database tasks. > > I don't believe the hit is enough that you should even notice > it. You'd have to post some pretty incredible use cases to show that > the tiny loss of memory to FSM is worth (a) an exclusive lock and > (b) the loss of efficiency you get from having some preallocated > pages in tables. There is *a* case for setting up full vacuums of *some* objects. If you have a table whose tuples all get modified in the course of some common query, that will lead to a pretty conspicuous bloating of *that table.* Even with a big FSM, the pattern of how updates take place will lead to that table having ~50% of its space being "dead/free," which is way higher than the desirable "stable proportion" of 10-15%. For that sort of table, it may be attractive to run VACUUM FULL on a regular basis. Of course, it may also be attractive to try to come up with an update process that won't kill the whole table's contents at once ;-). -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://cbbrowne.com/info/x.html "As long as each individual is facing the TV tube alone, formal freedom poses no threat to privilege." --Noam Chomsky
On Tue, Jan 17, 2006 at 09:59:25AM -0600, Scott Marlowe wrote: > I have to admit, looking at the documentation, that we really don't > explain this all that well in the administration section, and I can see > how easily led astray beginners are. I understand what you mean, but I suppose my reaction would be that what we really need is a place to keep these things, with a note in the docs that the "best practice" settings for these are documented at <some url>, and evolve over time as people gain expertise with the new features. I suspect, for instance, that nobody knows exactly the right settings for any generic workload yet under 8.1 (although probably people know them well enough for particular workloads). A -- Andrew Sullivan | ajs@crankycanuck.ca This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well. --Dennis Ritchie
Hi, Michael, Michael Riess wrote: > But actually I never understood why the database system slows down at > all when there is much unused space in the files. Are the unused pages > cached by the system, or is there another reason for the impact on the > performance? No, they are not cached as such, but PostgreSQL caches whole pages, and you don't have only empty pages, but also lots of partially empty pages, so the signal/noise ratio is worse (means PostgreSQL has to fetch more pages to get the same data). Sequential scans etc. are also slower. And some file systems get slower when files get bigger or there are more files, but this effect should not really be noticeable here. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
Chris Browne wrote: > ajs@crankycanuck.ca (Andrew Sullivan) writes: > > On Tue, Jan 17, 2006 at 11:18:59AM +0100, Michael Riess wrote: > >> hi, > >> > >> I'm curious as to why autovacuum is not designed to do full vacuum. I > > > > Because nothing that runs automatically should ever take an exclusive > > lock on the entire database, which is what VACUUM FULL does. > > That's a bit more than what autovacuum would probably do... > autovacuum does things table by table, so that what would be locked > should just be one table. Even a database-wide vacuum does not take locks on more than one table. The table locks are acquired and released one by one, as the operation proceeds. And as you know, autovacuum (both 8.1's and contrib) does issue database-wide vacuums, if it finds a database close to an xid wraparound. -- Alvaro Herrera http://www.advogato.org/person/alvherre "Las mujeres son como hondas: mientras más resistencia tienen, más lejos puedes llegar con ellas" (Jonas Nightingale, Leap of Faith)
On Tue, 2006-01-17 at 11:16, Andrew Sullivan wrote: > On Tue, Jan 17, 2006 at 09:59:25AM -0600, Scott Marlowe wrote: > > I have to admit, looking at the documentation, that we really don't > > explain this all that well in the administration section, and I can see > > how easily led astray beginners are. > > I understand what you mean, but I suppose my reaction would be that > what we really need is a place to keep these things, with a note in > the docs that the "best practice" settings for these are documented > at <some url>, and evolve over time as people gain expertise with the > new features. > > I suspect, for instance, that nobody knows exactly the right settings > for any generic workload yet under 8.1 (although probably people know > them well enough for particular workloads). But the problem is bigger than that. The administrative docs were obviously evolved over time, and now they kind of jump around and around covering the same subject from different angles and at different depths. Even I find it hard to find what I need, and I know PostgreSQL administration well enough to be pretty darned good at it. For the beginner, it must seem much more confusing. The more I look at the administration section of the docs, the more I want to reorganize the whole thing, and rewrite large sections of it as well.
On Tue, Jan 17, 2006 at 11:43:14AM -0500, Chris Browne wrote: > ajs@crankycanuck.ca (Andrew Sullivan) writes: > > Because nothing that runs automatically should ever take an exclusive > > lock on the entire database, > That's a bit more than what autovacuum would probably do... Or even VACUUM FULL, as I tried to make clearer in another message: the way I phrased it suggests that it's a simultaneous lock on the entire database (when it is most certainly not). I didn't intend to mislead; my apologies. Note, though, that the actual effect for a user might look worse than a lock on the entire database, though, if you conider statement_timeout and certain use patterns. Suppose you want to issue occasional VACCUM FULLs, but your application is prepared for this, and depends on statement_timeout to tell it "sorry, too long, try again". Now, if the exclusive lock on any given table takes less than statement_timeout, so that each statement is able to continue in its time, the application looks like it's having an outage _even though_ it is actually blocked on vacuums. (Yes, it's poor application design. There's plenty of that in the world, and you can't always fix it.) > There is *a* case for setting up full vacuums of *some* objects. If > you have a table whose tuples all get modified in the course of some > common query, that will lead to a pretty conspicuous bloating of *that > table.* Sure. And depending on your use model, that might be good. In many cases, though, a "rotor table + view + truncate" approach would be better, and would allow improved uptime. If you don't care about uptime, and can take long outages every day, then the discussion is sort of moot anyway. And _all_ of this is moot, as near as I can tell, given the OP's claim that the hardware is adequate and immutable, even though the former claim is demonstrably false. A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
alvherre@alvh.no-ip.org (Alvaro Herrera) writes: > Chris Browne wrote: >> ajs@crankycanuck.ca (Andrew Sullivan) writes: >> > On Tue, Jan 17, 2006 at 11:18:59AM +0100, Michael Riess wrote: >> >> hi, >> >> >> >> I'm curious as to why autovacuum is not designed to do full vacuum. I >> > >> > Because nothing that runs automatically should ever take an exclusive >> > lock on the entire database, which is what VACUUM FULL does. >> >> That's a bit more than what autovacuum would probably do... >> autovacuum does things table by table, so that what would be locked >> should just be one table. > > Even a database-wide vacuum does not take locks on more than one table. > The table locks are acquired and released one by one, as the operation > proceeds. And as you know, autovacuum (both 8.1's and contrib) does > issue database-wide vacuums, if it finds a database close to an xid > wraparound. Has that changed recently? I have always seen "vacuumdb" or SQL "VACUUM" (without table specifications) running as one long transaction which doesn't release the locks that it is granted until the end of the transaction. -- "cbbrowne","@","acm.org" http://cbbrowne.com/info/spiritual.html "My nostalgia for Icon makes me forget about any of the bad things. I don't have much nostalgia for Perl, so its faults I remember." -- Scott Gilbert comp.lang.python
Chris Browne <cbbrowne@acm.org> writes: > alvherre@alvh.no-ip.org (Alvaro Herrera) writes: >> Even a database-wide vacuum does not take locks on more than one table. >> The table locks are acquired and released one by one, as the operation >> proceeds. > Has that changed recently? I have always seen "vacuumdb" or SQL > "VACUUM" (without table specifications) running as one long > transaction which doesn't release the locks that it is granted until > the end of the transaction. You sure? It's not supposed to, and watching a database-wide vacuum with "select * from pg_locks" doesn't look to me like it ever has locks on more than one table (plus the table's indexes and toast table). regards, tom lane
On Tue, Jan 17, 2006 at 03:50:38PM +0100, Michael Riess wrote: > Well, > > I think that the documentation is not exactly easy to understand. I > always wondered why there are no examples for common postgresql > configurations. All I know is that the default configuration seems to be > too low for production use. And while running postgres I get no hints as > to which setting needs to be increased to improve performance. I have no There's a number of sites that have lots of info on postgresql.conf tuning. Google for 'postgresql.conf tuning' or 'annotated postgresql.conf'. > chance to see if my FSM settings are too low other than to run vacuum > full verbose in psql, pipe the result to a text file and grep for some > words to get a somewhat comprehensive idea of how much unused space > there is in my system. > > Don't get me wrong - I really like PostgreSQL and it works well in my > application. But somehow I feel that it might run much better ... > > about the FSM: You say that increasing the FSM is fairly cheap - how > should I know that? decibel@phonebook.1[16:26]/opt/local/share/postgresql8:3%grep fsm \ postgresql.conf.sample #max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~70 bytes each decibel@phonebook.1[16:26]/opt/local/share/postgresql8:4% -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> There's a number of sites that have lots of info on postgresql.conf > tuning. Google for 'postgresql.conf tuning' or 'annotated > postgresql.conf'. I know some of these sites, but who should I know if the information on those pages is correct? The information on those pages should be published as part of the postgres documentation. Doesn't have to be too much, maybe like this page: http://www.powerpostgresql.com/Downloads/annotated_conf_80.html But it should be part of the documentation to show newbies that not only the information is correct, but also approved of and recommended by the postgres team.
> >> Even a database-wide vacuum does not take locks on more than one table. > >> The table locks are acquired and released one by one, as the operation > >> proceeds. > > > Has that changed recently? I have always seen "vacuumdb" or SQL > > "VACUUM" (without table specifications) running as one long > > transaction which doesn't release the locks that it is granted until > > the end of the transaction. > > You sure? It's not supposed to, and watching a database-wide vacuum > with "select * from pg_locks" doesn't look to me like it ever has locks > on more than one table (plus the table's indexes and toast table). Are there some plans to remove vacuum altogether? Mindaugas
Mindaugas wrote: > > >> Even a database-wide vacuum does not take locks on more than one table. > > >> The table locks are acquired and released one by one, as the operation > > >> proceeds. > > > > > Has that changed recently? I have always seen "vacuumdb" or SQL > > > "VACUUM" (without table specifications) running as one long > > > transaction which doesn't release the locks that it is granted until > > > the end of the transaction. > > > > You sure? It's not supposed to, and watching a database-wide vacuum > > with "select * from pg_locks" doesn't look to me like it ever has locks > > on more than one table (plus the table's indexes and toast table). > > Are there some plans to remove vacuum altogether? No, but there are plans to make it as automatic and unintrusive as possible. (User configuration will probably always be needed.) -- Alvaro Herrera Developer, http://www.PostgreSQL.org FOO MANE PADME HUM
mind@bi.lt ("Mindaugas") writes: >> >> Even a database-wide vacuum does not take locks on more than one >> >> table. The table locks are acquired and released one by one, as >> >> the operation proceeds. >> >> > Has that changed recently? I have always seen "vacuumdb" or SQL >> > "VACUUM" (without table specifications) running as one long >> > transaction which doesn't release the locks that it is granted >> > until the end of the transaction. >> >> You sure? It's not supposed to, and watching a database-wide >> vacuum with "select * from pg_locks" doesn't look to me like it >> ever has locks on more than one table (plus the table's indexes and >> toast table). > > Are there some plans to remove vacuum altogether? I don't see that on the TODO list... http://www.postgresql.org/docs/faqs.TODO.html To the contrary, there is a whole section on what functionality to *ADD* to VACUUM. -- let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];; http://www.ntlug.org/~cbbrowne/finances.html "There are two types of hackers working on Linux: those who can spell, and those who can't. There is a constant, pitched battle between the two camps." --Russ Nelson (Linux Kernel Summary, Ver. 1.1.75 -> 1.1.76)
On Wednesday 18 January 2006 08:54 am, Chris Browne wrote: > To the contrary, there is a whole section on what functionality to > *ADD* to VACUUM. Near but not quite off the topic of VACUUM and new features... I've been thinking about parsing the vacuum output and storing it in Postgresql. All the tuple, page, cpu time, etc... information would be inserted into a reasonably flat set of tables. The benefits I would expect from this are: * monitoring ability - I could routinely monitor the values in the table to warn when vacuum's are failing or reclaimed space has risen dramatically. I find it easier to write and maintain monitoring agents that perform SQL queries than ones that need to routinely parse log files and coordinate with cron. * historical perspective on tuple use - which a relatively small amount of storage, I could use the vacuum output to get an idea of usage levels over time, which is beneficial for planning additional capacity * historical information could theoretically inform the autovacuum, though I assume there are better alternatives planned. * it could cut down on traffic on this list if admin could see routine maintenance in a historical context. Assuming this isn't a fundamentally horrible idea, it would be nice if there were ways to do this without parsing the pretty-printed vacuum text (ie, callbacks, triggers, guc variable). I'd like to know if anybody does this already, thinks its a bad idea, or can knock me on the noggin with the pg manual and say, "it's already there!". Regards, Michael
On Wed, Jan 18, 2006 at 03:09:42PM +0100, Michael Riess wrote: > >There's a number of sites that have lots of info on postgresql.conf > >tuning. Google for 'postgresql.conf tuning' or 'annotated > >postgresql.conf'. > > I know some of these sites, but who should I know if the information on > those pages is correct? The information on those pages should be > published as part of the postgres documentation. Doesn't have to be too > much, maybe like this page: > > http://www.powerpostgresql.com/Downloads/annotated_conf_80.html > > But it should be part of the documentation to show newbies that not only > the information is correct, but also approved of and recommended by the > postgres team. Actually, most of what you find there is probably also found in techdocs. But of course it would be better if the docs did a better job of explaining things... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wed, Jan 18, 2006 at 11:15:51AM -0800, Michael Crozier wrote: > I've been thinking about parsing the vacuum output and storing it in > Postgresql. All the tuple, page, cpu time, etc... information would be > inserted into a reasonably flat set of tables. <snip> > Assuming this isn't a fundamentally horrible idea, it would be nice if there > were ways to do this without parsing the pretty-printed vacuum text (ie, > callbacks, triggers, guc variable). The best way to do this would be to modify the vacuum code itself, but the issue is that vacuum (or at least lazyvacuum) doesn't handle transactions like the rest of the backend does, so I suspect that there would be some issues with trying to log the information from the same backend that was running the vacuum. > I'd like to know if anybody does this already, thinks its a bad idea, or can > knock me on the noggin with the pg manual and say, "it's already there!". I think it's a good idea, but you should take a look at the recently added functionality that allows you to investigate the contests of the FSM via a user function (this is either in 8.1 or in HEAD; I can't remember which). -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
crozierm@conducivetech.com (Michael Crozier) writes: > On Wednesday 18 January 2006 08:54 am, Chris Browne wrote: >> To the contrary, there is a whole section on what functionality to >> *ADD* to VACUUM. > > Near but not quite off the topic of VACUUM and new features... > > I've been thinking about parsing the vacuum output and storing it in > Postgresql. All the tuple, page, cpu time, etc... information would > be inserted into a reasonably flat set of tables. > > The benefits I would expect from this are: > > * monitoring ability - I could routinely monitor the values in the > table to warn when vacuum's are failing or reclaimed space has risen > dramatically. I find it easier to write and maintain monitoring > agents that perform SQL queries than ones that need to routinely > parse log files and coordinate with cron. > > * historical perspective on tuple use - which a relatively small > amount of storage, I could use the vacuum output to get an idea of > usage levels over time, which is beneficial for planning additional > capacity > > * historical information could theoretically inform the autovacuum, > though I assume there are better alternatives planned. > > * it could cut down on traffic on this list if admin could see > routine maintenance in a historical context. > > Assuming this isn't a fundamentally horrible idea, it would be nice > if there were ways to do this without parsing the pretty-printed > vacuum text (ie, callbacks, triggers, guc variable). > > I'd like to know if anybody does this already, thinks its a bad > idea, or can knock me on the noggin with the pg manual and say, > "it's already there!". We had someone working on that for a while; I don't think it got to the point of being something ready to unleash on the world. I certainly agree that it would be plenty useful to have this sort of information available. Having a body of historical information could lead to having some "more informed" suggestions for heuristics. -- (reverse (concatenate 'string "gro.mca" "@" "enworbbc")) http://cbbrowne.com/info/unix.html Bad command. Bad, bad command! Sit! Stay! Staaay...
On Wednesday 18 January 2006 14:52 pm, Jim C. Nasby wrote: > I think it's a good idea, but you should take a look at the recently > added functionality that allows you to investigate the contests of the > FSM via a user function (this is either in 8.1 or in HEAD; I can't > remember which). I will look at this when time allows. Perhaps there is a combination of triggers on stat tables and asynchronous notifications that would provide this functionality without getting too deep into the vacuum's transaction logic? Were it too integrated with the vacuum, it would likely be too much for contrib/, I assume. thanks, Michael
> We had someone working on that for a while; I don't think it got to > the point of being something ready to unleash on the world. Interesting. I will dig around the mailing list archives too see how they went about it... for my own curiosity if nothing else. If you happen to know offhand, I'd appreciate a link. Regards, Michael
On Wed, Jan 18, 2006 at 03:36:04PM -0800, Michael Crozier wrote: > > On Wednesday 18 January 2006 14:52 pm, Jim C. Nasby wrote: > > I think it's a good idea, but you should take a look at the recently > > added functionality that allows you to investigate the contests of the > > FSM via a user function (this is either in 8.1 or in HEAD; I can't > > remember which). > > I will look at this when time allows. Perhaps there is a combination of > triggers on stat tables and asynchronous notifications that would provide > this functionality without getting too deep into the vacuum's transaction > logic? You can't put triggers on system tables, at least not ones that will be triggered by system operations themselves, because the backend bypasses normal access methods. Also, most of the really interesting info isn't logged anywhere in a system table; stuff like the amount of dead space, tuples removed, etc. > Were it too integrated with the vacuum, it would likely be too much for > contrib/, I assume. Probably. A good alternative might be allowing vacuum to output some machine-friendly information (maybe into a backend-writable file?) and then have code that could load that into a table (though presumably that could should be as simple as just a COPY). -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Jim C. Nasby wrote: > > I think it's a good idea, but you should take a look at the recently > added functionality that allows you to investigate the contests of the > FSM via a user function (this is either in 8.1 or in HEAD; I can't > remember which). AFAICS it is still in the patch queue for 8.2. It's called 'pg_freespacemap' and is available for 8.1/8.0 from the Pgfoundry 'backports' project: http://pgfoundry.org/projects/backports Cheers Mark
Verified. I am working toward getting all those patches applied. --------------------------------------------------------------------------- Mark Kirkwood wrote: > Jim C. Nasby wrote: > > > > > I think it's a good idea, but you should take a look at the recently > > added functionality that allows you to investigate the contests of the > > FSM via a user function (this is either in 8.1 or in HEAD; I can't > > remember which). > > AFAICS it is still in the patch queue for 8.2. > > It's called 'pg_freespacemap' and is available for 8.1/8.0 from the > Pgfoundry 'backports' project: > > http://pgfoundry.org/projects/backports > > Cheers > > Mark > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
BTW, given all the recent discussion about vacuuming and our MVCC, http://www.pervasive-postgres.com/lp/newsletters/2006/Insights_Postgres_Jan.asp#3 should prove interesting. :) -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Jim C. Nasby wrote: > BTW, given all the recent discussion about vacuuming and our MVCC, > http://www.pervasive-postgres.com/lp/newsletters/2006/Insights_Postgres_Jan.asp#3 > should prove interesting. :) > Please explain... what is the .asp extension. I have yet to see it reliable in production ;) -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl - http://www.commandprompt.com/
On Fri, Jan 20, 2006 at 09:31:14AM -0800, Joshua D. Drake wrote: > Jim C. Nasby wrote: > >BTW, given all the recent discussion about vacuuming and our MVCC, > >http://www.pervasive-postgres.com/lp/newsletters/2006/Insights_Postgres_Jan.asp#3 > >should prove interesting. :) > > > Please explain... what is the .asp extension. I have yet to see it > reliable in production ;) I lay no claim to our infrastructure. :) -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> I lay no claim to our infrastructure. :) > Can I quote the: Pervasive Senior Engineering Consultant on that? -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl - http://www.commandprompt.com/
On Fri, Jan 20, 2006 at 09:37:50AM -0800, Joshua D. Drake wrote: > > >I lay no claim to our infrastructure. :) > > > Can I quote the: Pervasive Senior Engineering Consultant on that? Sure... I've never been asked to consult on our stuff, and in any case, I don't do web front-ends (one of the nice things about working with a team of other consultants). AFAIK IIS will happily talk to PostgreSQL (though maybe I'm wrong there...) I *have* asked what database is being used on the backend though, and depending on the answer to that some folks might have some explaining to do. :) *grabs big can of dog food* -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> Sure... I've never been asked to consult on our stuff, and in any case, > I don't do web front-ends (one of the nice things about working with a > team of other consultants). AFAIK IIS will happily talk to PostgreSQL > (though maybe I'm wrong there...) iis (yeah, asp in a successfull productive environement hehe) & postgresql works even better for us than iis & mssql :-) - thomas
On Fri, Jan 20, 2006 at 06:46:45PM +0100, me@alternize.com wrote: > >Sure... I've never been asked to consult on our stuff, and in any case, > >I don't do web front-ends (one of the nice things about working with a > >team of other consultants). AFAIK IIS will happily talk to PostgreSQL > >(though maybe I'm wrong there...) > > iis (yeah, asp in a successfull productive environement hehe) & postgresql > works even better for us than iis & mssql :-) Just last night I was talking to someone about different databases and what-not (he's stuck in a windows shop using MSSQL and I mentioned I'd heard some bad things about it's stability). I realized at some point that asking about what large installs of something exist is pretty pointless... given enough effort you can make almost anything scale. As an example, there's a cable company with a MySQL database that's nearly 1TB... if that's not proof you can make anything scale, I don't know what is. ;) What people really need to ask about is how hard it is to make something work, and how many problems you're likely to keep encountering. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461