Thread: pg_autovacuum next steps
Lately I have been thinking about the next steps for the pg_autovacuum daemon. I have written up a document that describes what I'm planning to do next. Please read the attached and respond, as I would really like some feedback. Thanks, Matthew O'Connor
Attachment
"Matthew T. O'Connor" <matthew@zeut.net> writes: > Lately I have been thinking about the next steps for the pg_autovacuum > daemon. I have written up a document that describes what I'm planning > to do next. Please read the attached and response as I would really > like some feedback. > [ rtf document ] Please repost in some less proprietary format. Plain text is generally considered the thing to use on this list. regards, tom lane
On Mon, 2004-03-22 at 00:12, Tom Lane wrote:
> Please repost in some less proprietary format. Plain text is generally
> considered the thing to use on this list.

-- 
Markus Bertheau <twanger@bluetwanger.de>
Attachment
Matthew T. O'Connor wrote: > Lately I have been thinking about the next steps for the > pg_autovacuum daemon. I have written up a document that describes > what I'm planning to do next. Please read the attached and response > as I would really like some feedback. I think these configuration issues will become a lot easier if you make the autovacuum daemon a subprocess of the postmaster (like, say, the checkpoint process). Then you have access to a host of methods for storing state, handling configuration, etc.
> I think these configuration issues will become a lot easier if you make > the autovacuum daemon a subprocess of the postmaster (like, say, the > checkpoint process). Then you have access to a host of methods for > storing state, handling configuration, etc. Yeah - why delay making it a backend process? :) Chris
On Sun, 2004-03-21 at 20:31, Christopher Kings-Lynne wrote:
> > I think these configuration issues will become a lot easier if you make
> > the autovacuum daemon a subprocess of the postmaster (like, say, the
> > checkpoint process). Then you have access to a host of methods for
> > storing state, handling configuration, etc.
>
> Yeah - why delay making it a backend process? :)

Ok, well this was part of the reason to have this conversation.

My reasons:
A) I wasn't sure if people really thought this was ready to be integrated. Tom had said a while ago that it was good to keep it as a contrib module while it's still actively being developed.
B) Perhaps people like the idea of it being a client app (I don't think so.)
C) Most importantly, I'm not a backend hacker. If someone wants to do the initial work of getting it running as a backend process, I can take it from there. A while ago, Bruce offered to help me with any backend issues I might have, so perhaps with a little help I can take a run at it.

So the first big question is: Do we want to make it a backend subprocess now? Secondly, are there any other features that people are interested in that were not mentioned in my document?

Matthew O'Connor
On Sun, 2004-03-21 at 18:12, Tom Lane wrote:
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
> > [ rtf document ]
>
> Please repost in some less proprietary format. Plain text is generally
> considered the thing to use on this list.

I don't think RTF is proprietary, but I should have just posted inline anyway, so here is a copy:

pg_autovacuum Version 2 Design Document:

Exec Summary:

pg_autovacuum was initially released as a contrib module in PostgreSQL v7.4. The version in 7.4 is by design very simple: no configuration is required, and very little configuration is possible. Despite these limitations it was voted the most popular new feature of PostgreSQL v7.4 in the survey held on postgresql.org (http://www.postgresql.org/survey.php?View=1&SurveyID=23). Despite its popularity there is much room for improvement. This document sets out to define the most important improvements that would help pg_autovacuum become a truly powerful asset in the suite of tools that come with PostgreSQL.

Current Problems & Limitations:

Based on user feedback from people using pg_autovacuum in the field, and my own observations, there are a number of problems and limitations with pg_autovacuum. They are:

* Inability to customize thresholds on a per-table basis
* Inability to set default thresholds on a per-database basis
* Inability to exclude specific databases / tables from pg_autovacuum monitoring
* Inability to schedule vacuums during off-peak times
* Lack of integration related to startup and shutdown
* Ignorance of VACUUM and ANALYZE operations performed outside pg_autovacuum (requires backend integration? or can LISTEN / NOTIFY be used?)
* Lack of logging options / syslog integration / log rotation options
* Create table fails because template1 is busy

I'm not sure how to address all of these concerns, or that they all should be addressed right now. One of my big questions is backend integration.
I am leaning towards leaving pg_autovacuum as a client application in contrib for one more release. During this time, I can continue to tweak and improve pg_autovacuum so that we will have a very good idea of what the final product should be before we make it a standard backend process. For PostgreSQL 7.5, I plan to implement these new features:

1. Per-database defaults and per-table thresholds (including total exclusion)
2. Persistent data
3. Single-pass mode (external scheduling from cron etc...)
4. Off-peak scheduling

1. Per-Database Defaults and Per-Table Thresholds:

There are differing opinions as to the best way to provide this feature. The primary debate is where to save the configuration data. I see three options:

1. Store config data inside a special pg_autovacuum table inside each existing database that wants custom settings.
2. Use a config file. This would require some additional coding to add the required parsing, but is possible.
3. Create a pg_autovacuum database inside any cluster that wants to customize its settings.

Since many people do not like tools that clutter their databases by adding tables, I think option 1 (adding a pg_autovacuum table to existing databases) is right out. Using a config file would be OK, but would require additional parsing code. My preference is option 3. Since pg_autovacuum will (hopefully) eventually become an integrated part of the backend, it will eventually be able to add the required data to the system catalogs. Given these two premises, as long as pg_autovacuum remains a contrib module it could use its own database to mimic having system tables. If this database exists, it will be used; if it does not exist, then pg_autovacuum will work just as it did in the 7.4 release, with very limited options available to it. The user will be able to specify a non-default database.
Table structure for database-specific defaults and table-specific thresholds:

databases_defaults: (will reference the pg_database system table)
    id                              serial primary key
    exclude_database                boolean
    default_vacuum_scaling_factor   float
    default_vacuum_base_value       int
    default_analyze_scaling_factor  float
    default_analyze_base_value      int
    dboid                           oid references pg_database.oid

table_thresholds:
    id                      serial primary key
    exclude_table           boolean (exclude this table)
    vacuum_scaling_factor   float (equivalent to -v)
    vacuum_base_value       int (equivalent to -V)
    vacuum_threshold        float (if > 0, use this threshold)
    analyze_scaling_factor  float (equivalent to -a)
    analyze_base_value      int (equivalent to -A)
    analyze_threshold       float (if > 0, use this threshold)
    relid                   oid references pg_class.oid

2. Persistent pg_autovacuum Data:

Right now pg_autovacuum has no memory of what was going on the last time it was run. So if significant changes have happened while pg_autovacuum is not running, they will not be counted in the analysis of when to perform a vacuum or analyze operation, which can result in under-vacuuming. So pg_autovacuum should occasionally write down its numbers to the database. The data will be stored in an additional table called table_data:

table_data:
    id                      serial primary key
    CountAtLastAnalyze      long
    CountAtLastVacuum       long
    table_thresholds_id     int references table_thresholds

3. Single-Pass Mode (External Scheduling):

I have received requests to be able to run pg_autovacuum only on request (not as a daemon), making only one pass over all the tables (not looping indefinitely). The advantage is that it will operate more like the current vacuum command, except that it will only vacuum tables that need to be vacuumed. This feature could be useful as long as pg_autovacuum exists outside the backend. If pg_autovacuum gets integrated into the backend and gets automatically started as a daemon during startup, then this option will no longer make sense.
Once we have persistent data (Step 2), we can easily operate in single-pass mode.

4. Off-Peak Scheduling:

A fundamental advantage of our vacuum system is that the work required to reclaim table space is taken out of the critical path and can be moved to an off-peak time when cycles are less precious. One of the drawbacks of the current pg_autovacuum is that it doesn't have any way to factor this in. In its simplest form (which I will implement first), I would add the ability to specify a second set of thresholds that will be active only during an "off-peak" time that can be specified in the pg_autovacuum database, perhaps in a general_settings table.

A few notes about things I'm not planning on working on at the moment:

Better logging options: An additional logging table would be added to the pg_autovacuum database and would log all activity (vacuums and analyzes) along with their corresponding durations.

Syslog support: I'm not sure this is really needed, but a simple patch was submitted by one user, and perhaps that can be reviewed / improved and applied.
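For readers unfamiliar with how the scaling factors and base values above combine, here is a minimal sketch of the decision rule. The names are illustrative only, not the actual pg_autovacuum identifiers; the real daemon gets its tuple counts from the stats collector.

```python
# Sketch of the threshold logic implied by the table design above.
# All names are illustrative; the real pg_autovacuum uses the -v/-V
# (vacuum) and -a/-A (analyze) settings and the stats collector's
# per-table tuple counts.

def effective_threshold(reltuples, base_value, scaling_factor,
                        fixed_threshold=0.0):
    """If a fixed per-table threshold (> 0) is set, use it; otherwise
    compute base + factor * reltuples, as in the tables above."""
    if fixed_threshold > 0:
        return fixed_threshold
    return base_value + scaling_factor * reltuples

def needs_vacuum(n_dead_tuples, reltuples, base_value=1000,
                 scaling_factor=2.0):
    """Vacuum once the dead-tuple count exceeds the threshold."""
    return n_dead_tuples > effective_threshold(
        reltuples, base_value, scaling_factor)

# With these example defaults, a 100,000-row table needs roughly
# 1000 + 2.0 * 100000 = 201,000 dead tuples before a VACUUM fires.
```

The per-table row overrides the per-database default; a fixed vacuum_threshold > 0 overrides both.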
On Sun, 21 Mar 2004, Matthew T. O'Connor wrote:
> On Sun, 2004-03-21 at 20:31, Christopher Kings-Lynne wrote:
> > > I think these configuration issues will become a lot easier if you make
> > > the autovacuum daemon a subprocess of the postmaster (like, say, the
> > > checkpoint process). Then you have access to a host of methods for
> > > storing state, handling configuration, etc.
> >
> > Yeah - why delay making it a backend process? :)
>
> Ok, well this was part of the reason to have this conversation.
>
> My reasons:
> A) I wasn't sure if people really thought this was ready to be
> integrated. Tom had said a while ago that it was good to keep it as
> a contrib module while it's still actively being developed.

I was talking to Jan about some other work on VACUUM related to more intelligent vacuuming: namely, maintaining a map (outside of shared memory) of blocks which have been pushed out of the free space map for VACUUM to visit (which requires a backend process), being aware of load restrictions (ie, allowing the user to only vacuum when the load average is less than X, for example), and some other leveling stuff to ensure that availability is consistent. Whilst this doesn't relate to pg_autovacuum specifically, it'd be great if they could be released at the same time, I think.

> B) Perhaps people like the idea of it being a client app (I don't think
> so.)

I'd like to see it as part of the backend.

> C) Most importantly, I'm not a backend hacker. If someone wants to do the
> initial work of getting it running as a backend process, I can take it
> from there. A while ago, Bruce offered to help me with any backend
> issues I might have, so perhaps with a little help I can take a run at
> it.

I'd be happy to help you out.

Gavin
> > B) Perhaps people like the idea of it being a client app (I don't think
> > so.)
>
> I'd like to see it as part of the backend.
>
> > C) Most importantly, I'm not a backend hacker. If someone wants to do the
> > initial work of getting it running as a backend process, I can take it
> > from there. A while ago, Bruce offered to help me with any backend
> > issues I might have, so perhaps with a little help I can take a run at
> > it.
>
> I'd be happy to help you out.

Agreed.

-- 
Bruce Momjian                    | http://candle.pha.pa.us
pgman@candle.pha.pa.us           | (610) 359-1001
+ If your life is a hard drive,  | 13 Roberts Road
+ Christ can be your backup.     | Newtown Square, Pennsylvania 19073
On Sun, 2004-03-21 at 23:00, Bruce Momjian wrote:
> > > C) Most importantly, I'm not a backend hacker. If someone wants to do the
> > > initial work of getting it running as a backend process, I can take it
> > > from there. A while ago, Bruce offered to help me with any backend
> > > issues I might have, so perhaps with a little help I can take a run at
> > > it.
> >
> > I'd be happy to help you out.
>
> Agreed.

Ok, thanks for the offer to help, but I think I understated things above when I said I'll need a "little" help :-)

I have a few big picture questions. Once pg_autovacuum is launched as a postmaster sub-process, what changes? That is, currently pg_autovacuum uses libpq to connect to a database and issue queries, including a vacuum / analyze command when needed. After becoming a subprocess, will (should) it still use libpq to connect to the databases? I don't think so; is it even possible to do that? If not, how will it check out the stats of all the different databases? I guess it should fork() a new backend, connect to it somehow, and use it to query the database, but I'm really not sure how this works.

I'm looking through the backend startup code to see how the stats collector and the bgwriter work, since they are probably two semi-close examples of what I'll have to do. I think the checkpoint process does something similar in that it issues a checkpoint command.

Thanks again... Matthew
On Mon, 22 Mar 2004, Matthew T. O'Connor wrote:
> Ok, thanks for the offer to help, but I think I understated things above
> when I said I'll need a "little" help :-)

I haven't looked at the code but...

> I have a few big picture questions. Once pg_autovacuum is launched as a
> postmaster sub-process, what changes? That is, currently pg_autovacuum
> uses libpq to connect to a database and issue queries including a vacuum
> / analyze command when needed. After becoming a subprocess will
> (should) it still use libpq to connect to the databases, I don't think

It could use libpq but most definitely shouldn't.

> so, is it even possible to do that? If not, how will it check out the
> stats of all the different databases? I guess it should fork() a new
> backend, connect to it somehow, and use it to query the database, but
> I'm really not sure how this works.

It can interact with the stats collector (a separate backend) in the same way that existing backends interact: through a domain socket.

> I'm looking through the backend startup code to see how the stats
> collector and the bgwriter work since they are probably two semi-close
> examples of what I'll have to do. I think the checkpoint process does
> something similar in that it issues a checkpoint command.

The vacuum backend will call vacuum() (or something very like it) directly. I imagine that when it gets called, and on which table, will be based upon the existing algorithm.

Thanks, Gavin
> > Ok, thanks for the offer to help, but I think I understated things above
> > when I said I'll need a "little" help :-)
>
> I haven't looked at the code but...
>
> > I have a few big picture questions. Once pg_autovacuum is launched as a
> > postmaster sub-process, what changes? That is, currently pg_autovacuum
> > uses libpq to connect to a database and issue queries including a vacuum
> > / analyze command when needed. After becoming a subprocess will
> > (should) it still use libpq to connect to the databases, I don't think
>
> It could use libpq but most definitely shouldn't.
>
> > so, is it even possible to do that? If not, how will it check out the
> > stats of all the different databases? I guess it should fork() a new
> > backend, connect to it somehow, and use it to query the database, but
> > I'm really not sure how this works.
>
> It can interact with the stats collector (a separate backend) in the same
> way that existing backends interact: through a domain socket.
>
> > I'm looking through the backend startup code to see how the stats
> > collector and the bgwriter work since they are probably two semi-close
> > examples of what I'll have to do. I think the checkpoint process does
> > something similar in that it issues a checkpoint command.
>
> The vacuum backend will call vacuum() (or something very like it)
> directly. I imagine that when it gets called and on which table will be
> based upon the existing algorithm.

One point is this: vacuum() assumes that you are running in a fully fledged backend. There'd be a fair bit of work involved in allowing a single process to call vacuum() against multiple databases. As such, I think that a vacuum backend for a specific database should be forked upon the first connect. Also, the backend might like to try and work out if there are any active backends for its database every so often and, if not, perform a final vacuum (if necessary) and exit, so that we don't have lots of idle processes sitting around.
Is there a better approach than this? Gavin
On Mon, Mar 22, 2004 at 02:35:37AM -0500, Matthew T. O'Connor wrote:
> On Sun, 2004-03-21 at 23:00, Bruce Momjian wrote:
> > > > C) Most importantly, I'm not a backend hacker. If someone wants to do the
> > > > initial work of getting it running as a backend process, I can take it
> > > > from there. A while ago, Bruce offered to help me with any backend
> > > > issues I might have, so perhaps with a little help I can take a run at
> > > > it.
> > >
> > > I'd be happy to help you out.
> >
> > Agreed.
>
> Ok, thanks for the offer to help, but I think I understated things above
> when I said I'll need a "little" help :-)
>
> I have a few big picture questions. Once pg_autovacuum is launched as a
> postmaster sub-process, what changes?

All. It's important to do it as a backend process, because libpq has very, very limited and slow resources for working with backend stuff. The base should be the standard backend with a different "main loop" that, instead of checking a socket, checks some shared information about tables and calls the vacuum code directly. In this case you can omit the work with connections, the parser, etc.

I thought about it in the last few days, and I found Tom's excellent idea about FSM table usage:

    What I had in the back of my mind was: each backend counts attempted
    insertions and deletions in its relcache entries (an update adds to
    both counts). At transaction commit or abort, we know which of these
    two counts represents the number of dead tuples added to each
    relation, so while we scan the relcache for post-xact cleanup (which
    we will be doing anyway) we can transfer the correct count into the
    shared FSM entry for the relation. This gives us a reasonably
    accurate count in shared memory of all the tuple obsoletions since
    bootup, at least for heavily-used tables. (The FSM might choose to
    forget about lightly-used tables.) The auto vacuumer could look at
    the FSM numbers to decide which tables are highest priority to
    vacuum.
(2002-09-03 08:10:32)

I looked at the code and I think extending the FSM tables will be pretty simple, but I'm unsure how Tom thought the relcache counters would work. Tom?

Karel

-- 
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
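The counter-transfer scheme Tom describes in the quoted passage could be sketched roughly as follows. Everything here is illustrative: the real mechanism would live in the relcache and the shared FSM, not in a Python dict, and the names are made up.

```python
# Toy model of Tom's idea: each backend keeps per-relation insert and
# delete counts, and at transaction end the count that represents
# newly dead tuples is folded into a shared per-table total (standing
# in for the shared FSM entry). Purely illustrative.
from collections import defaultdict

shared_dead_tuples = defaultdict(int)   # stand-in for the shared FSM entries

class Backend:
    def __init__(self):
        self.inserted = defaultdict(int)
        self.deleted = defaultdict(int)

    def insert(self, rel):
        self.inserted[rel] += 1

    def delete(self, rel):
        self.deleted[rel] += 1

    def update(self, rel):
        # An update adds to both counts, as in the quoted description.
        self.insert(rel)
        self.delete(rel)

    def end_xact(self, committed):
        # On commit the deleted tuples become dead; on abort the
        # inserted ones do. Transfer during post-xact cleanup.
        source = self.deleted if committed else self.inserted
        for rel, n in source.items():
            shared_dead_tuples[rel] += n
        self.inserted.clear()
        self.deleted.clear()
```

The auto vacuumer would then scan `shared_dead_tuples` (really the FSM) to pick the highest-priority tables.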
On Monday 22 March 2004 03:36, Matthew T. O'Connor wrote:
> 1. Per-Database Defaults and Per-Table Thresholds:
>
> There are differing opinions as to the best way to provide this
> feature. The primary debate is where to save the configuration data. I
> see three options:
>
> 1. Store config data inside a special pg_autovacuum table inside
> existing databases that want custom settings.
>
> 2. Use a config file. This would require some additional coding to add
> the required parsing, but is possible.
>
> 3. Create a pg_autovacuum database inside any cluster that wants to
> customize its settings.
>
> Since many people do not like tools that clutter their databases by
> adding tables, I think option 1 (adding a pg_autovacuum table to
> existing databases) is right out. Using a config file would be OK, but
> would require additional parsing code. My preference is option 3.

I've nothing against #3 as a default, but can I put in a suggestion for 1 & 3, or rather some setting definable at runtime/build-time that lets you select the database + schema for autovacuum to find its config data?

I might be wrong, but it strikes me as the sort of thing people running shared environments will want to choose for themselves.

-- 
Richard Huxton
Archonet Ltd
Gavin Sherry wrote:
> On Sun, 21 Mar 2004, Matthew T. O'Connor wrote:
> > On Sun, 2004-03-21 at 20:31, Christopher Kings-Lynne wrote:
> > > > I think these configuration issues will become a lot easier if you make
> > > > the autovacuum daemon a subprocess of the postmaster (like, say, the
> > > > checkpoint process). Then you have access to a host of methods for
> > > > storing state, handling configuration, etc.
> > >
> > > Yeah - why delay making it a backend process? :)
> >
> > Ok, well this was part of the reason to have this conversation.
> >
> > My reasons:
> > A) I wasn't sure if people really thought this was ready to be
> > integrated. Tom had said a while ago that it was good to keep it as
> > a contrib module while it's still actively being developed.
>
> I was talking to Jan about some other work on VACUUM related to more
> intelligent vacuuming. Namely, maintaining a map (outside of shared
> memory) of blocks which have been pushed out of the free space map for
> VACUUM to visit (which requires a backend process) and being aware of load
> restrictions (ie, allowing the user to only vacuum when the load average is
> less than X, for example) and some other leveling stuff to ensure that
> availability is consistent. Whilst this doesn't relate to pg_autovacuum
> specifically, it'd be great if they could be released at the same time, I
> think.

I don't recall the "outside of shared memory" part. Anyhow, the whole story goes like this:

Maintain 2 bits per block that tell if the block has been vacuumed of all dead tuples since the last time it was dirtied, and if all its tuples are completely frozen. If those two conditions are true, there is no need to vacuum that block at all. (Red flag!!! On further thinking I realized that this assumes that the FSM is loss free.) With a default 8K block size, this means 32K per 1GB segment, making 4 additional blocks.
I actually think that these extra blocks should somehow be part of the heap files, so that they are subject to the regular buffer management. To keep the lock contention on them low, vacuum and backends will set/clear new flags in the bufhdr flags member. That way, the bgwriter and checkpointer will be the usual suspects to set/clear these flags in the shared bitmap array stored in the extra blocks. As to where to store these blocks, some block-number arithmetic magic comes to mind. That way a block's relnode and blockno automatically lead to the bits, even in the case of blind writes.

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
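The space arithmetic behind the 2-bits-per-block proposal can be checked with a small sketch. The constants are the defaults mentioned above (8K blocks, 1GB segments); the bit-addressing helper is illustrative, not actual backend code.

```python
# Verifying Jan's numbers: 2 status bits per heap block ("vacuumed
# since last dirtied" and "all tuples frozen") with 8K blocks and
# 1GB segments works out to 32K of bitmap, i.e. 4 extra blocks per
# segment. The bit_position() helper is an illustrative sketch of the
# "block number arithmetic magic".
BLCKSZ = 8192                 # default PostgreSQL block size
SEGMENT = 1 << 30             # 1 GB heap segment
BITS_PER_BLOCK = 2

blocks_per_segment = SEGMENT // BLCKSZ                     # 131072 blocks
bitmap_bytes = blocks_per_segment * BITS_PER_BLOCK // 8    # 32768 bytes = 32K
bitmap_blocks = -(-bitmap_bytes // BLCKSZ)                 # ceil -> 4 blocks

def bit_position(blockno):
    """Byte offset and bit shift of the 2-bit entry for a heap block
    within the bitmap area (illustrative only)."""
    byte = (blockno * BITS_PER_BLOCK) // 8
    shift = (blockno * BITS_PER_BLOCK) % 8
    return byte, shift
```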
On Mon, 2004-03-22 at 03:36, Gavin Sherry wrote:
> One point is this: vacuum() assumes that you are running in a fully
> fledged backend. There'd be a fair bit of work involved in allowing a
> single process to call vacuum() against multiple databases.

I can't imagine we want to do that.

> As such, I think that a vacuum backend for a specific database should be
> forked upon the first connect. Also, the backend might like to try and
> work out if there are any active backends for its database every so often
> and, if not, perform a final vacuum (if necessary) and exit, so that we
> don't have lots of idle processes sitting around.

I don't understand; are you saying that upon postmaster startup a vacuum backend should be forked for every database in the cluster?

> Is there a better approach than this?

One thought I had is that there wouldn't be a persistent daemon-type process. Rather, the auto_vacuum process, like the checkpoint process, would be kicked off periodically. It would initially be attached to template1; when it's done there, it would figure out the next database to take a look at and fork, the existing template1 process would exit, and the new process would be connected to the next database. Just repeat this forking loop until done. Thoughts?
"Matthew T. O'Connor" <matthew@zeut.net> writes: > There are differing opinions as to the best way to providing these this > feature. The primary debate is where to save the configuration data. I > see three options: > 1.Store config data inside a special pg_autovacuum table inside > existing databases that wants custom settings. > 2.Use a config file. This would require some additional coding to add > the required parsing, but is possible. > 3.Create a pg_autovacuum database inside any cluster that wants to > customize their settings. > Since many people do not like tools that clutter their databases by > adding tables, I think option 1 (adding a pg_autovacuum table to > existing databases) is right out. Using a config file would be Ok, but > would require additional parsing code. My preference is option 3. Personally I like #2. The claim that this requires extra code seems bogus to me --- when you are working at the C code level, reading a table takes code too. I think both 1 and 3 represent clutter. Also, they will introduce entirely new sets of problems that we already know how to deal with in the case of config files, but have no experience with for config tables. Some examples: how do you get the daemon to notice that you changed the config (SIGHUP for config files, ?? otherwise)? Can you easily get at the config data from a daemon process regardless of which database it's attached to (if any)? If the config data is in a table, what sorts of interesting problems will arise from trying to vacuum that table itself? Basically I see config tables as cutting against the grain of everything we've done to date in Postgres DBA-level configuration, and I do not see a reason to justify that sort of reinvention. regards, tom lane
Gavin Sherry <swm@linuxworld.com.au> writes: > One point is this: vacuum() assumes that you are running in a fully > fledged backend. There'd be a fair bit of work involved in allowing a > single process to call vacuum() against multiple databases. Make that "it isn't going to happen". > As such, I > think that a vacuum backend for a specific database should be forked upon > the first connect. Also, the backend might like to try and workout if > there are any active backends for its database every so often and if not, > perform a final vacuum (if necessary) and exit, so that we don't have lots > of idle processes sitting around. Lots of idle processes sitting around is right out, too. Remember that each one would eat a backend connection slot. I think we are going to have to limit this to *one* process at a time. What that probably means is that we successively launch an autovacuum process against each database, it does whatever seems appropriate in that database and then quits. We could manage this just like checkpoints are presently managed --- the only thing the postmaster has to know is the desired idle period between end of one autovacuum and start of the next. regards, tom lane
On Mon, 2004-03-22 at 10:58, Tom Lane wrote:
> Lots of idle processes sitting around is right out, too. Remember that
> each one would eat a backend connection slot. I think we are going to
> have to limit this to *one* process at a time. What that probably means
> is that we successively launch an autovacuum process against each
> database, it does whatever seems appropriate in that database and then
> quits. We could manage this just like checkpoints are presently managed
> --- the only thing the postmaster has to know is the desired idle period
> between end of one autovacuum and start of the next.

Ok, I was thinking a similar thing (see my response to Gavin). So we could have autovacuum fired off by the postmaster, and it would connect to databases as needed in a serial fashion.

Matthew
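The serialized scheme being agreed on here can be sketched as a toy model: one worker at a time, each database visited in turn, with only the idle period between passes known to the scheduler. `do_database` stands in for "launch a worker, let it vacuum what needs vacuuming in that database, and exit"; all names are illustrative.

```python
# Toy model of the serialized autovacuum scheme: exactly one worker
# active at a time, databases visited in turn, and a single idle
# period between the end of one full pass and the start of the next.
import time

def autovacuum_loop(databases, do_database, idle_seconds, passes=1):
    for _ in range(passes):
        for db in databases:
            do_database(db)          # one worker at a time, serially
        time.sleep(idle_seconds)     # the only knob the scheduler needs

# Example run: record the visit order over two passes.
visited = []
autovacuum_loop(["template1", "app_db"], visited.append,
                idle_seconds=0, passes=2)
# visited is now ["template1", "app_db", "template1", "app_db"]
```

This mirrors how checkpoints are managed: the postmaster tracks only the delay, and each worker is short-lived, so no connection slots are held by idle daemons.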
On Mon, 2004-03-22 at 07:25, Richard Huxton wrote:
> On Monday 22 March 2004 03:36, Matthew T. O'Connor wrote:
> > 1. Store config data inside a special pg_autovacuum table inside
> > existing databases that want custom settings.
> >
> > 2. Use a config file. This would require some additional coding to add
> > the required parsing, but is possible.
> >
> > 3. Create a pg_autovacuum database inside any cluster that wants to
> > customize its settings. [...] My preference is option 3.
>
> I've nothing against #3 as a default, but can I put in a suggestion for 1 & 3,
> or rather some setting definable at runtime/build-time that lets you select
> the database + schema for autovacuum to find its config data?
>
> I might be wrong, but it strikes me as the sort of thing people running shared
> environments will want to choose for themselves.

If pg_autovacuum were being designed to live forever as a client app, then I agree that giving admins a choice would be good. But since we are going to eventually move any auto_vacuum data and settings into the system tables (when autovacuum is part of the system), I don't see the need to expend the extra cycles, especially since people seem to be pushing hard for autovacuum to be a backend function sooner rather than later.
On Mon, 2004-03-22 at 04:23, Karel Zak wrote:
> All. It's important to do it as a backend process, because libpq has very,
> very limited and slow resources for working with backend stuff.

Agreed.

> The base should be the standard backend with a different "main loop" that,
> instead of checking a socket, checks some shared information about tables
> and calls the vacuum code directly. In this case you can omit the work
> with connections, the parser, etc.

So am I to understand that I can start up a postmaster subprocess and then be able to monitor the activity of all the databases? I guess that makes sense, since I would be talking to the stats collector directly via a socket. But that doesn't solve the problem of issuing the vacuum to different databases; I would still create a new backend for every database that needs a vacuum or analyze issued. Also, we don't want to launch multiple simultaneous vacuums, so we want the commands to be serialized (I know some people want to be able to do this if databases are located on different disks, but for now I'm keeping things simple). I have an idea for this that I just mentioned in another message to the list.

> I thought about it in the last few days, and I found Tom's excellent idea
> about FSM table usage:

There has been lots of discussion of incorporating FSM data into the auto_vacuum decision process. I am interested in exploring this, but since I'm already biting off more than I can easily chew, I am going to try and leave the decision-making process the same for now. BTW, I think we need to use both tools (stats and FSM), since not all tables will be in the FSM; an insert-only table still needs to be analyzed periodically, and a lightly updated table will eventually need to be vacuumed.
On Mon, 2004-03-22 at 10:51, Tom Lane wrote:
> > 2. Use a config file. This would require some additional coding to add
> > the required parsing, but is possible.
>
> Personally I like #2. The claim that this requires extra code seems
> bogus to me --- when you are working at the C code level, reading a
> table takes code too. I think both 1 and 3 represent clutter. Also,
> they will introduce entirely new sets of problems that we already know
> how to deal with in the case of config files, but have no experience
> with for config tables. Some examples: how do you get the daemon to
> notice that you changed the config (SIGHUP for config files, ??
> otherwise)? Can you easily get at the config data from a daemon process
> regardless of which database it's attached to (if any)? If the config
> data is in a table, what sorts of interesting problems will arise from
> trying to vacuum that table itself?

I guess what I meant by extra code is extra work, since I am comfortable storing data in tables and would need to do some work to put it in a config file.

Some of this conversation should change if we really are going to integrate autovacuum into the backend. My suggestion for storing config data was really assuming that we would continue to be a client app for at least one more release cycle, but I think I'm hearing that people want it in the backend now.

Assuming that we are being integrated into the backend, and assuming that an autovacuum process is fired off periodically by the postmaster, I think some config items should become GUC options and others should become part of the system tables. For example: enable_autovacuum should be a GUC option, but the vacuum threshold for a given table should probably just be a new column in the pg_class table. Any data stored in the system tables will be checked every time the autovacuum process is run, so there won't be any problems with noticing that the admin changed the threshold for a given table.
> Basically I see config tables as cutting against the grain of everything
> we've done to date in Postgres DBA-level configuration, and I do not see
> a reason to justify that sort of reinvention.

I agree if we are talking about a client app, but I think the scheme laid out above is consistent for something that is part of the backend. Do you disagree?

Thanks,

Matthew O'Connor
On Sun, Mar 21, 2004 at 05:32:59PM -0500, Matthew T. O'Connor wrote:
> Lately I have been thinking about the next steps for the pg_autovacuum
> daemon. I have written up a document that describes what I'm planning
> to do next. Please read the attached and response as I would really
> like some feedback.

Hi, Matthew. For our uses, we found that pg_autovacuum did not behave as expected with vacuum_threshold set to 0. For our purposes, we have a very good idea of how many tuples need to be slurped up over a given interval, and would like the autovacuum daemon to simply go through and vacuum when it hits a, er, threshold.

Obligatory explanation: We were doing this with cron, but when something gets really ugly, cron is not as intuitive as autovacuum.

Having this in the postmaster would be fine by me (and in fact, I would even suggest that it belongs there), but I really need to have more flexibility or we are going to (read: I will be tasked with) be writing our own.

Sorry for the terseness -- and I haven't had a chance to go over the whole thread; we're moving from 7.3.2 to 7.4.2 on "the big database" this week.

Alex

--
alex@posixnap.net Alex J. Avriette, Systems Programmer
"You have a spell checker. It is called a brain. Please use it." - Chris Lyth
> > As such, I
> > think that a vacuum backend for a specific database should be forked upon
> > the first connect. Also, the backend might like to try and work out if
> > there are any active backends for its database every so often and if not,
> > perform a final vacuum (if necessary) and exit, so that we don't have lots
> > of idle processes sitting around.
>
> Lots of idle processes sitting around is right out, too. Remember that
> each one would eat a backend connection slot. I think we are going to
> have to limit this to *one* process at a time. What that probably means
> is that we successively launch an autovacuum process against each
> database, it does whatever seems appropriate in that database and then
> quits. We could manage this just like checkpoints are presently managed
> --- the only thing the postmaster has to know is the desired idle period
> between end of one autovacuum and start of the next.

*slaps hand on forehead* Yes, this is the best approach.

So, do we want a static time, a GUC-controlled time, or a time which is modified by pg_autovacuum's/the stats collector's knowledge of the amount of work which goes on in any given database?

Gavin
Gavin Sherry <swm@linuxworld.com.au> writes:
> So, do we want a static time, a GUC controlled time or some time which is
> modified by pg_autovacuum's/stat's collector's knowledge of the amount of
> work which goes on in any given database?

From the point of view of the postmaster a GUC-controlled delay would seem like the best thing. We could discuss having the autovacuum code try to feed back adjustments in the delay, but remember that one of the golden virtues for the postmaster proper is simplicity; that translates directly to robustness. We don't want the postmaster engaging in anything complicated that could potentially lock it up or crash it due to a bug.

Possibly this point could be finessed by a two-level structure, that is, the postmaster launches an autovacuum daemon (which is not itself a backend) and that in turn launches backends to do the real per-database work. The postmaster's only subsequent involvement is restarting the autovac daemon if it dies. The autovac daemon can be as complex as you want.

This nice-sounding arrangement is probably not directly workable because of the fact that the postmaster has no good way to know about or control backends if they aren't its direct children. Perhaps the autovac daemon *should* use libpq, that is, not fork backends but connect via the postmaster each time it wants to run a backend. Then the backends are ordinary children of the postmaster and everything acts normally. (This could amount to using the existing autovac code, and simply adding a frammish to the postmaster to autospawn the autovac daemon as a non-backend child process.)

regards, tom lane
Tom Lane wrote:
> From the point of view of the postmaster a GUC-controlled delay would
> seem like the best thing. We could discuss having the autovacuum code
> try to feed back adjustments in the delay, but remember that one of the
> golden virtues for the postmaster proper is simplicity; that translates
> directly to robustness. We don't want the postmaster engaging in
> anything complicated that could potentially lock it up or crash it due
> to a bug.

Agreed.

[snip...]

> This nice-sounding arrangement is probably not directly workable because
> of the fact that the postmaster has no good way to know about or control
> backends if they aren't its direct children. Perhaps the autovac daemon
> *should* use libpq, that is, not fork backends but connect via the
> postmaster each time it wants to run a backend. Then the backends are
> ordinary children of the postmaster and everything acts normally.
> (This could amount to using the existing autovac code, and simply adding
> a frammish to the postmaster to autospawn the autovac daemon as a
> non-backend child process.)

Well, this certainly sounds like something that would be easy to do, which appeals to me, at least as a first cut. Question: does this mean that I lose many of the advantages of being "in the backend"? That is, would pg_autovacuum still be able to use facilities such as GUC, error reporting, access to the FSM, a direct pipe connection to the stats collector, etc.?
Alex J. Avriette wrote:
> Hi, Matthew. For our uses, we found that pg_autovacuum did not behave
> as expected with vacuum_threshold set to 0. For our purposes, we have
> a very good idea of how many tuples need to be slurped up over a given
> interval, and would like the autovacuum daemon to simply go through and
> vacuum when it hits a, er, threshold.

Could you please explain this better? I don't really understand what the problem is. If you want pg_autovacuum to perform a vacuum on a table that has had exactly X updates no matter what, you can just run it with -V0 -vX (where X is the vacuum threshold); the same thing can be done for analyze. Are you saying that you did this and it did not work as expected?

> Having this in the postmaster would be fine by me (and in fact, I would
> even suggest that it belongs there), but I really need to have more
> flexibility or we are going to (read: I will be tasked with) be writing
> our own.

Everyone agrees it should be in the backend eventually; the questions were just whether it was ready, and how or if it should change when it goes in.

> Sorry for the terseness -- and I haven't had a chance to go over the
> whole thread; we're moving from 7.3.2 to 7.4.2 on "the big database"
> this week.

No problem, but I need more information to diagnose your problem.

Matthew
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> Well this certainly sounds like something that would be easy to do,
> which appeals to me at least as a first cut. Question: Does this mean
> that I lose many of the advantages of being "in the backend"? That is,
> would pg_autovacuum still be able to use facilities such as GUC, error
> reporting, access to the FSM, direct pipe connection to the stats
> collector etc?

If you aren't a backend then you couldn't safely access shared memory, including FSM in particular. I see no reason you couldn't use GUC though. There is no "direct pipe connection" to the stats collector, except in the output direction, which is not what you want, so I'm not seeing your point there.

I am not sure that lack of FSM access is a showstopper, though. We could easily imagine inventing backend commands to read out whatever info you want from FSM, so you could request the info from your connected backend.

The more I think about this the more I like it --- it keeps the autovac control code still at arm's length from the backend, which will surely ease development and experimentation. I suppose there is some overhead in pushing data back and forth over the FE/BE protocol, but surely that would be negligible next to the "real work" of vacuuming.

regards, tom lane
Tom Lane wrote:
> If you aren't a backend then you couldn't safely access shared memory,
> including FSM in particular. I see no reason you couldn't use GUC
> though. There is no "direct pipe connection" to the stats collector,
> except in the output direction which is not what you want, so I'm not
> seeing your point there.

I probably said that wrong, but how do backends get their stats data? Meaning, when I do a "select * from pg_stat_all_tables", how is the backend getting that data? The reason I'm interested is that if I don't have to fire up a backend just to check the stats, that would reduce the load associated with the autovacuum daemon. Someone earlier in the thread seemed to imply there was a way to do this.

> I am not sure that lack of FSM access is a showstopper, though. We
> could easily imagine inventing backend commands to read out whatever
> info you want from FSM, so you could request the info from your
> connected backend.

Yeah, I agree, and for phase 1 we can just continue working only on stats data.

> The more I think about this the more I like it --- it keeps the autovac
> control code still at arms length from the backend which will surely
> ease development and experimentation. I suppose there is some overhead
> in pushing data back and forth over the FE/BE protocol, but surely that
> would be negligible next to the "real work" of vacuuming.

Right, I think the overhead would be negligible. Since you seem to think this is (or at least might be) a good idea, I will go ahead and try to get the postmaster to fire up the autovacuum daemon, so the first cut will basically be pg_autovacuum exactly as it stands now, just launched by the postmaster.

Also, you didn't mention whether I will be able to use the backend logging functions; I would guess that I can, but I'm not totally sure.

Thanks again,

Matthew
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> Tom Lane wrote:
>> There is no "direct pipe connection" to the stats collector,

> I probably said that wrong, but how do backends get their stats data?

They read it out of a flat file that the stats collector rewrites every so often.

> Meaning, when I do a "select * from pg_stat_all_tables" how is the
> backend getting that data. The reason I'm interested is that if I don't
> have to fire up a backend just to check the stats that would reduce the
> load associated with the autovacuum daemon.

Is that really worth the loss of independence? I'm not sure one way or the other myself. I suppose the autovac daemon could still be run by hand for testing purposes, but it would have to run as user postgres and on the same machine as the postmaster.

> Right, I think the overhead would be negligible. Since you seem to
> think this is (or at least might be) a good idea, I will go ahead and
> try to get the postmaster to fire-up the autovacuum daemon. So that the
> 1st cut, will basically be pg_autovacuum exactly as it stands now, just
> launched by the postmaster.

> Also, you didn't mention if I will be able to use the backend logging
> functions, I would guess that I can, but I'm not totally sure.

Yes, if you are launched by the postmaster then you'd have access to its stderr etc., so you could just log via elog/ereport. Again, though, this puts you in a situation where the daemon *must* be launched by the postmaster or it won't work. How much arm's-length separation are you willing to give up?

regards, tom lane
On Mon, 22 Mar 2004, Tom Lane wrote:
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
> > Well this certainly sounds like something that would be easy to do,
> > which appeals to me at least as a first cut. Question: Does this mean
> > that I lose many of the advantages of being "in the backend"? That is,
> > would pg_autovacuum still be able to use facilities such as GUC, error
> > reporting, access to the FSM, direct pipe connection to the stats
> > collector etc?
> [snip]
> The more I think about this the more I like it --- it keeps the autovac
> control code still at arms length from the backend which will surely
> ease development and experimentation. I suppose there is some overhead
> in pushing data back and forth over the FE/BE protocol, but surely that
> would be negligible next to the "real work" of vacuuming.

I was initially against the idea of using libpq, but it's growing on me too.

I think it would be good if the core functions of pg_autovacuum -- the threshold algorithms, connection handling, and issuing of commands -- could be (re?)designed such that not only the backend can link against them but also a stripped-down pg_autovacuum binary which can be used manually. That way users have a choice and a workaround if there are any issues with the backend model. Also useful for debugging.

Thanks,

Gavin
Matthew T. O'Connor wrote:
> * Inability to customize thresholds on a per table basis

I ran headlong into this one. IMHO fixing this is critical.

> * Inability to set default thresholds on a per database basis
> * Inability to exclude specific databases / tables from pg_autovacuum
>   monitoring

These would be nice to have, but less critical than #1, I think.

> * Inability to schedule vacuums during off-peak times

This would be *really* nice to have. In my recent case, if pg_autovacuum could work for say 3 minutes, and then back off for 2 minutes or so while the batch transactions hit, it would be ideal.

> I'm not sure how to address all of these concerns, or that they all
> should be addressed right now. One of my big questions is backend
> integration. I am leaning towards leaving pg_autovacuum as a client
> application in contrib for one more release. During this time, I can
> continue to tweak and improve pg_autovacuum so that we will have a very
> good idea what the final product should be before we make it a standard
> backend process.

I really think pg_autovacuum ought to get folded into the backend now, for 7.5. I haven't had time yet to read the entire thread, but I saw others making the same comment. It would make some of the listed problems go away, or at least become far easier to deal with.

> For PostgreSQL 7.5, I plan to implement these new features:
>
> 1. Per database defaults and per table thresholds (including total
>    exclusion)

Great!

> 2. Persistent data
> 3. Single-Pass Mode (external scheduling from cron etc...)
> 4. Off peak scheduling

Great again!

> 1. Per Database defaults and Per table Thresholds:
>
> 1. Store config data inside a special pg_autovacuum table inside
>    existing databases that wants custom settings.

A natural if folded into the backend.
> 3. Single-Pass Mode (External Scheduling):
>
> I have received requests to be able to run pg_autovacuum only on request
> (not as a daemon), making only one pass over all the tables (not looping
> indefinitely). The advantage being that it will operate more like the
> current vacuum command except that it will only vacuum tables that need
> to be vacuumed. This feature could be useful as long as pg_autovacuum
> exists outside the backend. If pg_autovacuum gets integrated into the
> backend and gets automatically started as a daemon during startup, then
> this option will no longer make sense.

It still might make sense. You could have a mode where the daemon essentially sleeps forever, until explicitly woken up by a signal. When woken, it makes one pass, and goes back to infinite sleep. Then provide a simple way to signal the autovacuum process -- maybe an extension of the current VACUUM syntax.

> 4. Off-Peak Scheduling:
>
> A fundamental advantage of our vacuum system is that the work required
> to reclaim table space is taken out of the critical path and can be
> moved to an off-peak time when cycles are less precious. One of the
> drawbacks of the current pg_autovacuum is that it doesn't have any way
> to factor this in.
>
> In its simplest form (which I will implement first), I would add the
> ability to add a second set of thresholds that will be active only
> during an “off-peak” time that can be specified in the pg_autovacuum
> database, perhaps in a general_settings table.

I don't know how this would work, but it is for sure important. In the recent testing I found that pg_autovacuum (well, lazy vacuum in general, but I was using pg_autovacuum to control it) made a huge difference in performance of batch transactions. They range from 4-5 seconds without vacuum running, to as high as 15 minutes with vacuum running. With the vacuum delay patch, delay = 1, pagecount = 8, I still saw times go as high as 10 minutes.
Backing vacuum off any more than that caused it to fall behind the transaction rate unrecoverably. But as I said above, if the transactions could complete in 4-5 seconds without vacuum running, and vacuuming then resumed for the 3-to-4 minutes between batches, all would be well.

Joe
Joe Conway wrote:
> Matthew T. O'Connor wrote:
>> * Inability to schedule vacuums during off-peak times
>
> This would be *really* nice to have. In my recent case, if
> pg_autovacuum could work for say 3 minutes, and then back off for 2
> minutes or so while the batch transactions hit, it would be ideal.

I'm not sure what you are suggesting here. As it stands right now, pg_autovacuum just issues a standard vacuum command, so there isn't anything pg_autovacuum can do until that command completes. There has been a lot of work going on trying to reduce the performance impact associated with a vacuum (vacuum delay, ARC, etc.); hopefully that will make a big difference.

> I really think pg_autovacuum ought to get folded into the backend now,
> for 7.5. I haven't had time yet to read the entire thread, but I saw
> others making the same comment. It would make some of the listed
> problems go away, or at least become far easier to deal with.

Yeah, that seems to be the consensus; I am going to work on that next.

>> 3. Single-Pass Mode (External Scheduling):
>
> It still might make sense. You could have a mode where the daemon
> essentially sleeps forever, until explicitly woken up by a signal.
> When woken, it makes one pass, and goes back to infinite sleep. Then
> provide a simple way to signal the autovacuum process -- maybe an
> extension of the current VACUUM syntax.

Well, one thing we talked about was having the autovacuum process not be a daemon that is running all the time but rather a process that is spawned periodically by the postmaster, so you wouldn't have to worry about the signal / wake-up issues. I could see this working much like checkpoints, where the postmaster fires them off on a schedule but there is nothing stopping you from issuing a checkpoint command to force one immediately. So perhaps there could be a new SQL command like "VACUUM AUTO" or something like that.
>> 4. Off-Peak Scheduling:
>>
>> In its simplest form (which I will implement first), I would add the
>> ability to add a second set of thresholds that will be active only
>> during an “off-peak” time that can be specified in the pg_autovacuum
>> database, perhaps in a general_settings table.
>
> I don't know how this would work, but it is for sure important. In the
> recent testing I found that pg_autovacuum (well, lazy vacuum in
> general, but I was using pg_autovacuum to control it) made a huge
> difference in performance of batch transactions. They range from 4-5
> seconds without vacuum running, to as high as 15 minutes with vacuum
> running. With the vacuum delay patch, delay = 1, pagecount = 8, I
> still saw times go as high as 10 minutes. Backing vacuum off any more
> than that caused it to fall behind the transaction rate unrecoverably.
> But as I said above, if the transactions could complete without vacuum
> running in 4-5 seconds, then vacuuming resumes for the 3-to-4 minutes
> between batches, all would be well.

Again, once the vacuum command is issued, it's out of pg_autovacuum's control. There has been some talk of pg_autovacuum looking at the system load to see if it should wait, or passing some different delay settings to vacuum based on system activity, so maybe some of that will help.

What I was talking about with off-peak scheduling is really just setting different thresholds for different times of the day (or perhaps for different system loads), so that if you know your app is typically busy from 8AM - 8PM, you can set more conservative thresholds during that time window and more aggressive thresholds during the off-peak time.

Thanks for the feedback,

Matthew
Gavin Sherry wrote:
> I was initially against the idea of using libpq but its growing on me too.
>
> I think it would be good if the core functions of pg_autovacuum: threshold
> algorithms, connection, issuing commands can be (re?)designed such that
> not only the backend can link against it but also a stripped down
> pg_autovacuum binary which can be used manually. That way users can have a
> choice and a workaround if there are any issues with the backend model.
> Also useful for debugging.

I agree. Initially, it appears that this won't be a problem, since the pg_autovacuum executable will be untouched (or as untouched as possible, anyway); it will just be launched by the backend. Going forward, if I use any of the functionality provided by the backend (error reporting, etc.), I will then have to design it to work in both standalone and postmaster sub-process modes, which I think is doable.

Matthew
Matthew T. O'Connor wrote:
> Could you please explain this better, I don't really understand what the
> problem is. If you want pg_autovacuum to perform a vacuum on a table
> that has had exactly X updates no matter what, you can just run it with
> -V0 -vX (where X is the vacuum threshold) same thing can be done for
> analyze. Are you saying that you did this and it did not work as expected?

I think the problem with that (I initially tried it myself) is that without a per-table setting, it is impractical. For instance, I want the one large table in my app vacuumed every 500,000 dead tuples, but now my smaller tables never reach the threshold.

Joe
Matthew T. O'Connor wrote:
> Joe Conway wrote:
>> This would be *really* nice to have. In my recent case, if
>> pg_autovacuum could work for say 3 minutes, and then back off for 2
>> minutes or so while the batch transactions hit, it would be ideal.
>
> I'm not sure what you are suggesting here. As it stands right now,
> pg_autovacuum just issues a standard vacuum command, so there isn't
> anything pg_autovacuum can do until that command completes. There has
> been a lot of work going on trying to reduce performance impact
> associated with a vacuum (vacuum delay, ARC etc), hopefully that will
> make a big difference.

Hopefully it will. I've not had a chance yet to test against cvs tip.

I guess what I was suggesting would have to be built in to the lazy vacuum process itself, similar to the vacuum delay stuff. I have no idea if it would be practical, but ideally every X number of pages there would be a way to detect "current load on the postgres cluster" or "current load on the server", and then either sleep or go full bore, depending on the result and some threshold setting. Kind of like the vacuum delay patch, but with "don't bother to sleep if the system is idle".

Joe
On Mon, Mar 22, 2004 at 04:50:57PM -0500, Matthew T. O'Connor wrote:
> Could you please explain this better, I don't really understand what the
> problem is. If you want pg_autovacuum to perform a vacuum on a table
> that has had exactly X updates no matter what, you can just run it with
> -V0 -vX (where X is the vacuum threshold) same thing can be done for
> analyze. Are you saying that you did this and it did not work as expected?

My experience was it did not vacuum at all.

> Everyone agrees it should be in the backend eventually the questions
> were just if it was ready, and how or if it should change when it goes in.

It is very simple code; I'm not sure what is "not ready" about it. If people don't want to use it, they don't have to. Many databases can go days to weeks without vacuuming.

alex

--
alex@posixnap.net Alex J. Avriette, Unix Systems Gladiator
"You cannot invade the mainland United States. There would be a rifle behind each blade of grass." - Admiral Isoroku Yamamoto
Tom Lane wrote:
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
>> I probably said that wrong, but how do backends get their stats data?
>
> They read it out of a flat file that the stats collector rewrites every
> so often.

Ok, so that would be easy to do (if we decide we want to).

> Is that really worth the loss of independence? I'm not sure one way or
> the other myself. I suppose the autovac daemon could still be run by
> hand for testing purposes, but it would have to run as user postgres and
> on the same machine as the postmaster.

I'm not sure; it would allow autovacuum to check the stats more regularly. I suppose it would be possible for pg_autovacuum to recognize whether it's been launched by the postmaster or not and look up stats as appropriate, but as you say, I'm not sure it's worth it, and my first cut will work exactly as the current pg_autovacuum does.

>> Also, you didn't mention if I will be able to use the backend logging
>> functions, I would guess that I can, but I'm not totally sure.
>
> Yes, if you are launched by the postmaster then you'd have access to its
> stderr etc, so you could just log via elog/ereport. Again though this
> puts you in a situation where the daemon *must* be launched by the
> postmaster or it won't work. How much arm's-length separation are you
> willing to give up?

Well, I think I would be more willing to give up the separation for logging purposes. I would think that an error message issued by a postmaster sub-process should wind up in the same location as an error issued by the postmaster proper. Also, people have requested lots of logging options, such as syslog etc., and I think it would be nice to get all this for free. I would think that pg_autovacuum could determine whether it's a postmaster sub-process and log appropriately; does that sound reasonable?