Thread: pg_autovacuum start-script
I'm about to try to implement a simple pg_autovacuum script that can be used in conjunction with or integrated entirely with the contrib start-scripts for postgres. I just want to check that what I'm doing has the appropriate sanity checks.

The behavior I'm considering is:

if pg_ctl status returns a good value then
    if pg_autovacuum is not running then
        start pg_autovacuum
    else
        error
else
    error

Based on what I (think I) know, this covers the cases where:

1. There is not a valid instance of postgres running.
2. There is already a valid instance of pg_autovacuum running (which can still run as a daemon even in the event that postgres is stopped, IIRC).
3. It is safe to start pg_autovacuum because neither of the above cases holds.

Is this logic sufficiently sane?

In a subsequent iteration, I could even warn, I suppose, if the script discovered any vacuums (whether initiated by pg_autovacuum or no) already running.

-tfo
Thomas F.O'Connell wrote:
> I'm about to try to implement a simple pg_autovacuum script that can be
> used in conjunction with or integrated entirely with the contrib
> start-scripts for postgres. I just want to check that what I'm doing has
> the appropriate sanity checks.

Getting the startup and shutdown of pg_autovacuum coordinated with the postmaster would address one of the big holes in the contrib (non-integrated) version of pg_autovacuum.

> The behavior I'm considering is:
>
> if pg_ctl status returns a good value then
>     if pg_autovacuum is not running then
>         start pg_autovacuum
>     else
>         error
> else
>     error
>
> Based on what I (think I) know, this covers the cases where:
>
> 1. There is not a valid instance of postgres running.
> 2. There is already a valid instance of pg_autovacuum running (which can
> still run as a daemon even in the event that postgres is stopped, IIRC).
> 3. It is safe to start pg_autovacuum because neither of the above cases
> holds.

pg_autovacuum will exit when it can no longer connect to a postmaster. The problem is that it might sleep for several minutes before it notices that the postmaster has shut down. So you can restart the postmaster and, as long as pg_autovacuum never noticed that it went away, it will keep chugging along as if nothing happened.

Is there any way pg_autovacuum can know if the postmaster has restarted? New PID? Or something better?

> Is this logic sufficiently sane?

Well, if the script also sends a kill signal to pg_autovacuum, that might solve the problem of pg_autovacuum still running.
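The "New PID?" idea can be tried from a wrapper script without changing pg_autovacuum itself. The sketch below assumes that the first line of postmaster.pid in the data directory holds the postmaster's PID; the stamp file and the function names are hypothetical and only serve the illustration.

```shell
#!/bin/bash
# Sketch only: detect a postmaster restart by comparing PIDs.
# Assumption: the first line of <data directory>/postmaster.pid is the
# postmaster PID. The stamp file location is hypothetical.

# Print the postmaster PID recorded in a data directory; fail if absent.
postmaster_pid() {
    local pidfile="$1/postmaster.pid"
    [ -r "$pidfile" ] || return 1
    head -n 1 "$pidfile"
}

# Return 0 if the postmaster PID differs from the one saved in the stamp
# file (or no stamp exists yet), 1 if it is unchanged, and 2 if there is
# no readable postmaster.pid. On a change, the stamp file is updated.
postmaster_restarted() {
    local pgdata="$1" stamp="$2" current
    current=$(postmaster_pid "$pgdata") || return 2
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
        return 1
    fi
    echo "$current" > "$stamp"
    return 0
}
```

A start script could call postmaster_restarted "$PGDATA" /some/stamp/file and, on a return value of 0, decide to stop and relaunch pg_autovacuum. Note that the very first call, with no stamp file yet, also returns 0.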
On Aug 27, 2004, at 3:37 PM, Matthew T. O'Connor wrote:

> Getting the startup and shutdown of pg_autovacuum coordinated with the
> postmaster would address one of the big holes in the contrib
> (non-integrated) version of pg_autovacuum.

Yeah, so what I'm planning to write will probably just be a small Perl/shell script containing roughly the logic below. To me, it seems a bit shy of a general hole plugger, but maybe people will find it useful. I guess I'll post my final draft and let the community have at improving it.

>> The behavior I'm considering is:
>> if pg_ctl status returns a good value then
>>     if pg_autovacuum is not running then
>>         start pg_autovacuum
>>     else
>>         error
>> else
>>     error
>> Based on what I (think I) know, this covers the cases where:
>> 1. There is not a valid instance of postgres running.
>> 2. There is already a valid instance of pg_autovacuum running (which
>> can still run as a daemon even in the event that postgres is stopped,
>> IIRC).
>> 3. It is safe to start pg_autovacuum because neither of the above
>> cases holds.
>
> pg_autovacuum will exit when it can no longer connect to a postmaster.
> The problem is that it might sleep for several minutes before it
> notices that the postmaster has shut down. So you can restart the
> postmaster and, as long as pg_autovacuum never noticed that it went
> away, it will keep chugging along as if nothing happened.
>
> Is there any way pg_autovacuum can know if the postmaster has
> restarted? New PID? Or something better?

Hmm. If the above situation is true, does it matter whether pg_autovacuum knows whether the postmaster restarted?

>> Is this logic sufficiently sane?
>
> Well, if the script also sends a kill signal to pg_autovacuum, that
> might solve the problem of pg_autovacuum still running.

Based on what you say above, though, is it even necessary to kill it? Why not just observe that it's running and fail to start a new one? Unless there's a need to restart pg_autovacuum if postmaster restarts.

-tfo
Thomas F.O'Connell wrote:
> On Aug 27, 2004, at 3:37 PM, Matthew T. O'Connor wrote:
>> pg_autovacuum will exit when it can no longer connect to a postmaster.
>> The problem is that it might sleep for several minutes before it
>> notices that the postmaster has shut down. So you can restart the
>> postmaster and, as long as pg_autovacuum never noticed that it went
>> away, it will keep chugging along as if nothing happened.
>>
>> Is there any way pg_autovacuum can know if the postmaster has
>> restarted? New PID? Or something better?
>
> Hmm. If the above situation is true, does it matter whether
> pg_autovacuum knows whether the postmaster restarted?

The issue is knowing whether you need to launch another pg_autovacuum process; you certainly don't want to have two pg_autovacuum processes running against the same server.

>>> Is this logic sufficiently sane?
>>
>> Well, if the script also sends a kill signal to pg_autovacuum, that
>> might solve the problem of pg_autovacuum still running.
>
> Based on what you say above, though, is it even necessary to kill it?
> Why not just observe that it's running and fail to start a new one?
> Unless there's a need to restart pg_autovacuum if postmaster restarts.

Perhaps not, as long as you can reliably observe that it's running against the newly started postmaster, and that what you found is not another pg_autovacuum process running against an entirely separate postmaster process.
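A plain process-name match cannot tell which postmaster a given pg_autovacuum process serves. One possible way around that, sketched here under stated assumptions, is for the wrapper script to keep its own pidfile per postmaster port. pg_autovacuum does not write such a file itself; the PIDDIR directory, the file naming scheme, and the function names are all hypothetical.

```shell
#!/bin/bash
# Sketch only: track one pg_autovacuum per postmaster port with a pidfile
# that the wrapper script maintains itself. pg_autovacuum does not write
# this file; the directory and naming scheme are hypothetical.

PIDDIR="${PIDDIR:-/tmp}"

# Print the path of the pidfile for a given port.
autovac_pidfile() {
    echo "$PIDDIR/pg_autovacuum.$1.pid"
}

# Return 0 if the process recorded for this port is still alive.
autovac_running() {
    local pidfile pid
    pidfile=$(autovac_pidfile "$1")
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Record the PID of a freshly started process for this port.
autovac_record() {
    echo "$2" > "$(autovac_pidfile "$1")"
}
```

Because pg_autovacuum forks when daemonized, the wrapper would need some way to learn the daemon's PID; one option is to background it from the shell instead and record $!. The check also has the usual pidfile caveats: a recycled PID can produce a false positive, and kill -0 fails for processes owned by another user.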
Hmm. Your last point in particular is one I hadn't considered, yet, largely because it's not relevant to my current problem. For a more generalized solution, though, it should definitely be considered.

Does pg_autovacuum currently store the pid of the postmaster against which it's being run? In fact, how does it know against which postmaster it's being run? It doesn't take a database as an argument, does it?

-tfo

On Aug 27, 2004, at 4:47 PM, Matthew T. O'Connor wrote:

> Thomas F.O'Connell wrote:
>> On Aug 27, 2004, at 3:37 PM, Matthew T. O'Connor wrote:
>>> Is there anyway pg_autovacuum can know if the postmaster has
>>> restarted? New PID? Or something better?
>> Hmm. If the above situation is true, does it matter whether
>> pg_autovacuum knows whether the postmaster restarted?
>
> The issue is knowing if you need to launch another pg_autovacuum
> process, you certainly don't want to have two pg_autovacuum processes
> running against the same server.
>
>>>> Is this logic sufficiently sane?
>>>
>>> Well if the script also sends a kill signal to pg_autovacuum that
>>> might solve the pg_autovacuum still running problem.
>> Based on what you say above, though, is it even necessary to kill it?
>> Why not just observe that it's running and fail to start a new one?
>> Unless there's a need to restart pg_autovacuum if postmaster
>> restarts.
>
> Perhaps not as long as you can reliably observe that it's running
> against the newly started postmaster and not another pg_autovacuum
> process running against an entirely separate postmaster process.
On Fri, 2004-08-27 at 18:09, Thomas F.O'Connell wrote:
> Hmm. Your last point in particular is one I hadn't considered, yet,
> largely because it's not relevant to my current problem. For a more
> generalized solution, though, it should definitely be considered.

Yeah, but as you say, for what you are doing, you probably don't need to worry about it.

> Does pg_autovacuum currently store the pid of the postmaster against
> which it's being run? In fact, how does it know against which
> postmaster it's being run? It doesn't take a database as an argument,
> does it?

No, it doesn't store the PID or anything like that, and it doesn't know which postmaster it's connecting to; it just connects to whatever postmaster is listening on the specified host and port.
Okay, here's a rough draft that seems to be working after some simple testing:

#!/bin/bash

# auto_pg_autovacuum is a utility that can be used in crontab or in init
# scripts to guarantee that pg_autovacuum is running when postgres is
# running.

# Original Author -- Thomas F. O'Connell <tfo@sitening.com>
# 2004-08-28

# Assumptions:
# 1. A sane PATH exists that includes pg_ctl and pg_autovacuum.
# 2. pgrep exists on the system and is in PATH.

# Accept an optional PGDATA override via -D and an optional logfile via -L.
# Even though -D means daemonize to pg_autovacuum, I thought this argument
# should be consistent with the PGDATA flags for other postgresql utilities.
# Eventually, it would be nice to allow a -o flag or something similar to
# allow any pg_autovacuum options to be passed through.
# Both options are parsed in one loop so that either may be given alone,
# and so that running with no options falls back to PGDATA from the
# environment.
LOGFILE=""
while getopts "D:L:" opt; do
    case "$opt" in
        D) PGDATA="$OPTARG" ;;
        L) LOGFILE="$OPTARG" ;;
        *) exit 1 ;;
    esac
done

# But if we don't know where to tell pg_ctl to look for status information,
# then we have to error out.
if [ -z "$PGDATA" ]; then
    echo "PGDATA must be set or specified as an argument to -D"
    exit 1
fi

# Now check to see whether we have a postmaster.
if ! pg_ctl status -D "$PGDATA" >/dev/null; then
    # If we don't, there's no point starting pg_autovacuum.
    echo "No postmaster running. Aborting."
    exit 1
fi

# Now pgrep for an exact match for pg_autovacuum.
if pgrep -x pg_autovacuum >/dev/null; then
    # If we find something, don't start another one.
    echo "pg_autovacuum is already running. Aborting."
    exit 1
fi

# Start pg_autovacuum as a daemon (-D), passing -L only if a logfile was
# specified.
if pg_autovacuum -D ${LOGFILE:+-L "$LOGFILE"}; then
    echo "pg_autovacuum successfully started."
fi

This is also available at:

http://www.sitening.com/auto_pg_autovacuum

Eventually, we'll probably create a PostgreSQL utilities section, since I've got some Slony scripts underway, too.

Feedback and comments welcome. I'm not an expert shell scripter, so best-practices tips are especially welcome.
My apologies if this was better posted to HACKERS or a different list. There's not a contrib list that I know of.

-tfo

On Aug 27, 2004, at 9:33 PM, Matthew T. O'Connor wrote:

> On Fri, 2004-08-27 at 18:09, Thomas F.O'Connell wrote:
>> Hmm. Your last point in particular is one I hadn't considered, yet,
>> largely because it's not relevant to my current problem. For a more
>> generalized solution, though, it should definitely be considered.
>
> Yeah, but as you say, for what you are doing, you probably don't need
> to worry about it.
>
>> Does pg_autovacuum currently store the pid of the postmaster against
>> which it's being run? In fact, how does it know against which
>> postmaster it's being run? It doesn't take a database as an argument,
>> does it?
>
> No, it doesn't store the PID or anything like that, and it doesn't know
> which postmaster it's connecting to; it just connects to whatever
> postmaster is listening on the specified host and port.
I haven't had a chance to try it yet, but it looks like it will do what you want.