Thread: Enhancement request
Could you please add to your to do list a schema parameter for vacuum?
Example:
VACUUM SCHEMA xyz;
PostgreSQL would get a list of all of the tables found in the schema. It would then loop through vacuuming each table in the schema.
I found today that I did not have max_fsm_pages set high enough in order to vacuum the entire database. I vacuum daily to avoid problems. After realizing that I cannot vacuum the entire database now, I then had to do it at the table level. I had to go through all 13 of my schemas and list each table I found into an SQL vacuum script. This was a real waist of time. I will eventually restart PostgreSQL with an increased value for max_fsm_pages so this will not be an issue.
But until then I will have to run a new script. I created a script with 420 SQL vacuum statements at the table level. I would have preferred to create 13 vacuum SQL statements at the schema level.
Thanks for considering this enhancement,
Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
On Nov 30, 2007 2:46 PM, Campbell, Lance <lance@uiuc.edu> wrote: > Could you please add to your to do list a schema parameter for vacuum? Schema-based analyze would also be useful. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
On Fri, 30 Nov 2007 13:46:53 -0600 "Campbell, Lance" <lance@uiuc.edu> wrote: > Could you please add to your to do list a schema parameter for vacuum? > > > > Example: > > > > VACUUM SCHEMA xyz; > > +1 Joshua D. Drake
Attachment
On Fri, Nov 30, 2007 at 12:33:41PM -0800, Joshua D. Drake wrote: > On Fri, 30 Nov 2007 13:46:53 -0600 > "Campbell, Lance" <lance@uiuc.edu> wrote: > > > Could you please add to your to do list a schema parameter for vacuum? > > > > > > > > Example: > > > > > > > > VACUUM SCHEMA xyz; > > > > > > +1 I'd also find it useful. I have many projects in a single database in different schemas. Sincerely, Gergely Czuczy mailto: gergely.czuczy@harmless.hu -- Weenies test. Geniuses solve problems that arise.
Attachment
If I understand how PG works, I don't believe this is a problem. Just run vacuum verbose analyze to determine the required value, set max_fsm_pages to that value, restart and vacuum. I believe the issue with too-small max_fsm_pages is that as vacuum locates reusable space, it simply runs out of room to save that information thus PG bloats the table instead of making use of available space located by vacuum.Could you please add to your to do list a schema parameter for vacuum?
Example:
VACUUM SCHEMA xyz;
PostgreSQL would get a list of all of the tables found in the schema. It would then loop through vacuuming each table in the schema.
I found today that I did not have max_fsm_pages set high enough in order to vacuum the entire database. I vacuum daily to avoid problems. After realizing that I cannot vacuum the entire database now, I then had to do it at the table level. I had to go through all 13 of my schemas and list each table I found into an SQL vacuum script. This was a real waist of time. I will eventually restart PostgreSQL with an increased value for max_fsm_pages so this will not be an issue.
Setting max_fsm_pages to a proper value and running vacuum will address _future_ bloat - and you may even eventually fill all the bloat back in. But the only certain way to remove the bloat that has occurred due to insufficient max_fsm_pages is to vacuum full (or cluster as appropriate - cluster can be many times faster than vacuum full).
Having said that, I would also find this feature occasionally useful but would think something akin to pg_dump's options would be more useful with both -n and -t allowing wildcards.
Cheers,
Steve
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > On Nov 30, 2007 2:46 PM, Campbell, Lance <lance@uiuc.edu> wrote: >> Could you please add to your to do list a schema parameter for vacuum? > Schema-based analyze would also be useful. For what? AFAICS we are moving as fast as we can in the direction of auto vacuum and analyze. Adding more frammishes to the manual commands seems like gilding the buggy whip. regards, tom lane
On Nov 30, 2007 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > For what? Convenience. > AFAICS we are moving as fast as we can in the direction of auto vacuum > and analyze. Adding more frammishes to the manual commands seems like > gilding the buggy whip. Autovacuum will never be the be all end all. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
Could you please add to your to do list a schema parameter for vacuum?
Example:
VACUUM SCHEMA xyz;
PostgreSQL would get a list of all of the tables found in the schema. It would then loop through vacuuming each table in the schema.
I found today that I did not have max_fsm_pages set high enough in order to vacuum the entire database. I vacuum daily to avoid problems. After realizing that I cannot vacuum the entire database now, I then had to do it at the table level. I had to go through all 13 of my schemas and list each table I found into an SQL vacuum script. This was a real waist of time. I will eventually restart PostgreSQL with an increased value for max_fsm_pages so this will not be an issue.
But until then I will have to run a new script. I created a script with 420 SQL vacuum statements at the table level. I would have preferred to create 13 vacuum SQL statements at the schema level.
i generally use an sql to generate vaccum statement for all tables in a schema and then simple copy paste the output in an sql file and then just run it. something like following
postgres=# select 'vacuum ' || relname || ';'
from pg_class c , pg_namespace n
where c.relnamespace = n.oid
and n.nspname='public' and relkind='r';
?column?
--------------
vacuum b;
vacuum a;
vacuum test;
vacuum xyz;
vacuum foo;
(5 rows)
You can redirect this output to a file and then strip off the heading and the feedback of (5 rows) and just run that file
Thanks for considering this enhancement,
Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
--
Usama Munir Dar http://linkedin.com/in/usamadar
Consultant Architect
Cell:+92 321 5020666
Skype: usamadar
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > On Nov 30, 2007 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> AFAICS we are moving as fast as we can in the direction of auto vacuum >> and analyze. Adding more frammishes to the manual commands seems like >> gilding the buggy whip. > Autovacuum will never be the be all end all. No doubt, which is why no one has proposed removing the manual commands. (Yet, anyway.) But adding complication to them is not going to be an easy sale. We have limited manpower for development and we cannot afford to get bogged down maintaining a codebase with enormous bloat from useless legacy "features". So: show me a use case for this that will still make sense in a mostly-autovacuum world. I can see a need for manual vacuuming of individual special-case tables, but I don't see why schema-wide vacuuming is so useful as to justify diverting development effort to it. regards, tom lane
On Fri, 30 Nov 2007 16:48:24 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Jonah H. Harris" <jonah.harris@gmail.com> writes: > > On Nov 30, 2007 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> AFAICS we are moving as fast as we can in the direction of auto > >> vacuum and analyze. Adding more frammishes to the manual commands > >> seems like gilding the buggy whip. > > So: show me a use case for this that will still make sense in a > mostly-autovacuum world. I think you are living in a different world than I am if you think it is a mostly-autovacuum world. Yes autovacuum is great for general low use scenarios. Throw it at a database doing hundreds of thousands (or even millions) of transactions an hour that has relations that in the multiple hundred gig range and autovacuum is useless for a good portion of that database. Autovacuum is a great utility for many workloads but even with the upcoming changes I will continually find myself turning off autovacuum for specific relations just so I can turn around and turn on vacuum within cron for others. The multi-worker autovacuum is a great new addition to help part of that problem (starvation) but it is not help against the other (resource consumption, specifically IO). > I can see a need for manual vacuuming of > individual special-case tables, but I don't see why schema-wide > vacuuming is so useful as to justify diverting development effort to > it. The thing is, it isn't nearly as special case for my environments. I have many customers, with many tables where autovacuum just doesn't cut it. We turn it on for say 80% of the relations but guess what... the important relations are still on some type of schedule through something like cron. I get your argument but surely adding SCHEMA isn't that much of a code bloat scenario. We don't even have to add another reserved word... Sincerely, Joshua D. Drake
Attachment
Joshua D. Drake escribió: > Autovacuum is a great utility for many workloads but even with the > upcoming changes I will continually find myself turning off autovacuum > for specific relations just so I can turn around and turn on vacuum > within cron for others. > > The multi-worker autovacuum is a great new addition to help part of > that problem (starvation) but it is not help against the other > (resource consumption, specifically IO). Huh, autovac will consume exactly the same amount of I/O as a user-induced vacuum, so I don't see your point. Multiworker autovac hasn't been tried on your customer's servers yet. It may well turn out to be that it is possible to let it do its work even on the high-update tables or large tables that you currently use cron for. It might be just a matter of tuning the knobs. Remember, we're not adding VACUUM SCHEMA in 8.2 so it doesn't make any sense to try to compare it against the old autovac. And you can bet that in 8.4 autovac will have even more goodies. -- Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4" "This is a foot just waiting to be shot" (Andrew Dunstan)
On Nov 30, 2007 5:00 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > > So: show me a use case for this that will still make sense in a > > mostly-autovacuum world. > > I think you are living in a different world than I am if you think it > is a mostly-autovacuum world. Same here. > Yes autovacuum is great for general low use scenarios. Throw it at a > database doing hundreds of thousands (or even millions) of transactions > an hour that has relations that in the multiple hundred gig range and > autovacuum is useless for a good portion of that database. Yes, this is precisely the case I'm talking about. Every single high-volume client we have or have consulted for is using custom vacuuming. Autovacuum works fine for the common case, but it doesn't handle high-volume databases very well yet. > The thing is, it isn't nearly as special case for my environments. I > have many customers, with many tables where autovacuum just doesn't cut > it. We turn it on for say 80% of the relations but guess what... the > important relations are still on some type of schedule through > something like cron. > > I get your argument but surely adding SCHEMA isn't that much of a code > bloat scenario. We don't even have to add another reserved word... Agreed. It's very simple and won't add much code at all. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
jonah.harris@gmail.com ("Jonah H. Harris") writes: > On Nov 30, 2007 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> For what? > > Convenience. > >> AFAICS we are moving as fast as we can in the direction of auto vacuum >> and analyze. Adding more frammishes to the manual commands seems like >> gilding the buggy whip. > > Autovacuum will never be the be all end all. And why is does it not suffice to do the following? SCHEMA=billing for table in `psql -qt -d my_database -c "select table_name from information_schema.tables where table_schema = '${SCHEMA}'and table_type = 'BASE TABLE';"`; do psql -d my_database -c "vacuum analyze ${SCHEMA}.${table};" done I don't see a need to add more to the SQL grammar when the above can be done in 4 lines of shell script. It seems to me that if you actually *NEED* to do 'sophisticated logic-driven' VACUUMing, then you are already headed down a road where you will need to have: a) A script b) Some query criteria, whether in the DBMS, or purely within the shell, to handle the "logic" bit. Once you're there, you have *AT LEAST* the 4 lines of script that I suggested, if not considerably more. Interestingly, the .sig chosen below actually seems somewhat germaine to this... What you're asking for, whether it's "gilding the buggy whip" or "adding frammishes to manual commands" [1], is, in fact, MORE that you're suggesting it is. You're not merely looking for a "frammish," you're proposing that it is meaningful for us to encourage a policy of vacuuming on a per-schema basis. That's not merely a mechanism to let the user do what they want - that's policy. In contrast, while what is in my little script represents policy, as a whole, none of the components represent policies in and of themselves. Footnotes: [1] I *love* the way Tom phrased that; that sentence is going into my personal "fortunes" file... -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://www3.sympatico.ca/cbbrowne/languages.html People consistently decry X for doing precisely what it was designed to do: provide a mechanism to allow *OTHERS* to build GUI systems. -- John Stevens <jstevens@samoyed.ftc.nrcs.usda.gov>
On Nov 30, 2007 5:23 PM, Chris Browne <cbbrowne@acm.org> wrote: > And why is does it not suffice to do the following? I'm not saying it doesn't. I'm fine with writing it manually. I'm talking about making it easy for most users. Remember, we're not UNIX-only; like it or not, most Windows people wouldn't easily be able to script it. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
On Nov 30, 2007 4:53 PM, Jonah H. Harris <jonah.harris@gmail.com> wrote: > On Nov 30, 2007 5:00 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > > > So: show me a use case for this that will still make sense in a > > > mostly-autovacuum world. > > > > I think you are living in a different world than I am if you think it > > is a mostly-autovacuum world. > > Same here. > > > Yes autovacuum is great for general low use scenarios. Throw it at a > > database doing hundreds of thousands (or even millions) of transactions > > an hour that has relations that in the multiple hundred gig range and > > autovacuum is useless for a good portion of that database. > > Yes, this is precisely the case I'm talking about. Every single > high-volume client we have or have consulted for is using custom > vacuuming. Autovacuum works fine for the common case, but it doesn't > handle high-volume databases very well yet. That's the case today, because autovacuum is single threaded and can't hit >1 table at once, so a single very large table vacuum could allow other, smaller tables to bloat inordinately. Which is why 8.3 can vacuum > 1 table at a time. I'm not against having a schema keyword mind you, I'm just pointing out that the ultimate goal of the hackers seems to be an autovacuum daemon that can keep the database vacuumed without the need for user initiated vacuums at all. Not sure 8.3 will get us there. But the multi-threaded autovac is a darn good start.
"Joshua D. Drake" <jd@commandprompt.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So: show me a use case for this that will still make sense in a >> mostly-autovacuum world. > I think you are living in a different world than I am if you think it > is a mostly-autovacuum world. Certainly it isn't a mostly-autovacuum world in 8.2 or earlier releases, but that's irrelevant to whether it makes sense to expend effort on a feature that would appear (at the earliest) in 8.4. Autovac in 8.3 is very significantly ahead of where it was in 8.2 --- to the point that we've turned it on by default --- and I predict that the pressure of being on by default will really light the afterburners behind its development. I think it's entirely likely that by the time 8.4 is ready, it will be perfectly fair to characterize manual vacuuming as a buggy-whip technology, at least for all but the three-sigmas-above-normal users. And I'd *much* rather see development effort going into making that vision come true, than into adding questionably-useful complexity in the support for manual vacuuming. regards, tom lane
On Fri, 30 Nov 2007 19:34:02 -0300 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Joshua D. Drake escribió: > > > Autovacuum is a great utility for many workloads but even with the > > upcoming changes I will continually find myself turning off > > autovacuum for specific relations just so I can turn around and > > turn on vacuum within cron for others. > > > > The multi-worker autovacuum is a great new addition to help part of > > that problem (starvation) but it is not help against the other > > (resource consumption, specifically IO). > > Huh, autovac will consume exactly the same amount of I/O as a > user-induced vacuum, so I don't see your point. It can be determined "when" the I/O is used. > Remember, we're not adding VACUUM SCHEMA in 8.2 so it doesn't make any > sense to try to compare it against the old autovac. And you can bet > that in 8.4 autovac will have even more goodies. > I assume you mean 8.3, but that is certainly a valid point. Sincerely, Joshua D. Drake
Attachment
Joshua D. Drake escribió: > On Fri, 30 Nov 2007 19:34:02 -0300 > Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > Joshua D. Drake escribió: > > > > > The multi-worker autovacuum is a great new addition to help part of > > > that problem (starvation) but it is not help against the other > > > (resource consumption, specifically IO). > > > > Huh, autovac will consume exactly the same amount of I/O as a > > user-induced vacuum, so I don't see your point. > > It can be determined "when" the I/O is used. Future plans call for being able to change autovac parameters depending on time of day, so you will be able to decide that using autovacuum too. > > Remember, we're not adding VACUUM SCHEMA in 8.2 so it doesn't make any > > sense to try to compare it against the old autovac. And you can bet > > that in 8.4 autovac will have even more goodies. > > I assume you mean 8.3, but that is certainly a valid point. No, I meant 8.4 --- the feature set of 8.3 is already set on stone. So VACUUM SCHEMA has to compete with whatever we're able to do for the release following that one. -- Alvaro Herrera http://www.advogato.org/person/alvherre "Entristecido, Wutra (canción de Las Barreras) echa a Freyr a rodar y a nosotros al mar"
Joshua D. Drake escribió:> On Fri, 30 Nov 2007 19:34:02 -0300
> Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > Joshua D. Drake escribió:
> >> > > The multi-worker autovacuum is a great new addition to help part ofFuture plans call for being able to change autovac parameters depending
> > > that problem (starvation) but it is not help against the other
> > > (resource consumption, specifically IO).
> >
> > Huh, autovac will consume exactly the same amount of I/O as a
> > user-induced vacuum, so I don't see your point.
>
> It can be determined "when" the I/O is used.
on time of day, so you will be able to decide that using autovacuum too.No, I meant 8.4 --- the feature set of 8.3 is already set on stone. So
> > Remember, we're not adding VACUUM SCHEMA in 8.2 so it doesn't make any
> > sense to try to compare it against the old autovac. And you can bet
> > that in 8.4 autovac will have even more goodies.
>
> I assume you mean 8.3, but that is certainly a valid point.
VACUUM SCHEMA has to compete with whatever we're able to do for the
release following that one.
i think even when autovacuum is uber-perfect , the manual overriding commands like VACUUM and utilities like VACUUMDB will still exist for a long time to come, so there is a case of improving them if required. Maybe it would be just easier to add a schema switch to VACUUMDB, which can just query the catalogs to find which tables to vacuum, this way no parser changes are involved or any tinkering with the backend code.
--
Alvaro Herrera http://www.advogato.org/person/alvherre
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
--
Usama Munir Dar http://linkedin.com/in/usamadar
Consultant Architect
Cell:+92 321 5020666
Skype: usamadar
On Fri, Nov 30, 2007 at 02:00:05PM -0800, Joshua D. Drake wrote: > Yes autovacuum is great for general low use scenarios. Throw it at a > database doing hundreds of thousands (or even millions) of transactions > an hour that has relations that in the multiple hundred gig range and > autovacuum is useless for a good portion of that database. This isn't a good argument for adding new knobs, though. It's an argument for using the limited resources to make autovacuum better. Compare with the never-ending arguments for index hints. A
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 Tom Lane wrote: > Certainly it isn't a mostly-autovacuum world in 8.2 or earlier releases, > but that's irrelevant to whether it makes sense to expend effort on a > feature that would appear (at the earliest) in 8.4. Autovac in 8.3 is > very significantly ahead of where it was in 8.2 --- to the point that > we've turned it on by default --- and I predict that the pressure of > being on by default will really light the afterburners behind its > development. I think it's entirely likely that by the time 8.4 is > ready, it will be perfectly fair to characterize manual vacuuming > as a buggy-whip technology, at least for all but the > three-sigmas-above-normal users. I think this is being extremely over-optimistic. 8.4 is now the next release. Autovacuum still has a long ways to go, and it's still fairly useless out of the box for large relations. I also truly cannot imagine that 8.4 is going to magically solve the vacuum-later vs. performance-now issue. > And I'd *much* rather see development effort going into making that > vision come true, than into adding questionably-useful complexity in > the support for manual vacuuming. It doesn't sound questionable, it sounds like a real-world feature request from someone who would immediately benefit from it. Also, development is not a zero-sum game - someone writing this patch would not necessarily have the skills (or time, or interest) in improving auto-vacuum. I am curious however, as to what the future vision of autovacuum is. Will it be so efficient that it won't impact all but the busiest tables? Will it be able to figure out the best time to vacuum on its own? Will it stop itself mid-run if the need arises? - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation PGP Key: 0x14964AC8 200712050906 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFHVrEMvJuQZxSWSsgRAwK7AKDUv5mV74knvupqfshb77CruKsZLQCfTDD8 w/36OiC7XRu+3kqHP3vi8Og= =pcpb -----END PGP SIGNATURE-----
Greg Sabino Mullane wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: RIPEMD160 > > I am curious however, as to what the future vision of autovacuum is. > Will it be so efficient that it won't impact all but the busiest tables? > Will it be able to figure out the best time to vacuum on its own? I believe maintenance windows are on the horizon. Of course it seems that instead of focusing on that, we should just put in a generic job manager. > Will it stop itself mid-run if the need arises? Man I would love this one... coupled with a just enough vacuum. E.g; it calculates a percentage it needs to vacuum, and only vacuums that much. Leaving dead tuples behind. Joshua D. Drake
Joshua D. Drake wrote: > Greg Sabino Mullane wrote: >> I am curious however, as to what the future vision of autovacuum is. >> Will it be so efficient that it won't impact all but the busiest tables? >> Will it be able to figure out the best time to vacuum on its own? > > I believe maintenance windows are on the horizon. Of course it seems that > instead of focusing on that, we should just put in a generic job manager. When I proposed my idea of maintenance window, everybody seemed happy, so that's what got implemented. If you want to propose a generic job manager, be my guest. >> Will it stop itself mid-run if the need arises? > > Man I would love this one... coupled with a just enough vacuum. E.g; it > calculates a percentage it needs to vacuum, and only vacuums that much. > Leaving dead tuples behind. Stop dreaming and start assigning resources for it to get done ;-) The NTT guys have done so and they are ahead of us on that camp. -- Alvaro Herrera Developer, http://www.PostgreSQL.org/ "In Europe they call me Niklaus Wirth; in the US they call me Nickel's worth. That's because in Europe they call me by name, and in the US by value!"
Greg Sabino Mullane wrote: > I am curious however, as to what the future vision of autovacuum is. > Will it be so efficient that it won't impact all but the busiest tables? I think so; the guys with the dead space map ideas would know. > Will it be able to figure out the best time to vacuum on its own? No -- the admin will still need to configure it, of course. > Will it stop itself mid-run if the need arises? That's the idea if we get cancellable vacuum, yes. -- Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC Thou shalt study thy libraries and strive not to reinvent them without cause, that thy code may be short and readable and thy days pleasant and productive. (7th Commandment for C Programmers)