Thread: Manual vacs 5x faster than autovacs?
Hi All, Running Pg 8.3RC2, Linux server, w/8GB RAM, OpenSuSE 10.2 OS (yes, I know that's old). I have seen *really* long-running autovacs eating up system resources. While the below is not an example of *really* long, it shows how I killed an autovac which had been running for more than 10 minutes, then ran a VAC FULL ANALYZE on same exact table in about ~2 min. Any wisdom here? Attributable to autovac_worker settings? Or Pg version? Other? Any insight appreciated. wb ++++++++++++++++++++++++++ $ psql template1 -c "SELECT procpid, current_query, to_char (now() - backend_start, 'HH24:MI:SS') AS connected_et, to_char (now() - query_start,'HH24:MI:SS') AS query_et FROM pg_stat_activity WHERE datname='mydb' ORDER BY query_et DESC LIMIT 1" procpid | current_query | connected_et | query_et ---------+--------------------------------------------+--------------+---------- 9064 | autovacuum: VACUUM ANALYZE myschema.mytable | 00:12:07 | 00:11:38 $ kill 9064 $ date; psql mydb -c "VACUUM FULL ANALYZE myschema.mytable"; date Wed Nov 11 17:25:41 UTC 2009 VACUUM Wed Nov 11 17:27:59 UTC 2009
Wayne Beaver <wayne@acedsl.com> writes: > Running Pg 8.3RC2, Linux server, w/8GB RAM, OpenSuSE 10.2 OS (yes, I > know that's old). I have seen *really* long-running autovacs eating up > system resources. While the below is not an example of *really* long, > it shows how I killed an autovac which had been running for more than > 10 minutes, then ran a VAC FULL ANALYZE on same exact table in about > ~2 min. Any wisdom here? Attributable to autovac_worker settings? autovacuum_vacuum_cost_delay. Is the slow autovac *really* eating a noticeable amount of system resources? I would think that a full speed manual vacuum would be a lot worse. regards, tom lane
On Thu, Nov 12, 2009 at 7:33 AM, Wayne Beaver <wayne@acedsl.com> wrote: > Hi All, > > Running Pg 8.3RC2, Linux server, w/8GB RAM, OpenSuSE 10.2 OS (yes, I know > that's old). I have seen *really* long-running autovacs eating up system > resources. While the below is not an example of *really* long, it shows how > I killed an autovac which had been running for more than 10 minutes, then > ran a VAC FULL ANALYZE on same exact table in about ~2 min. Any wisdom here? > Attributable to autovac_worker settings? Or Pg version? Other? > > Any insight appreciated. Autovac running slow is (generally) a good thing. It reduces the load on your IO subsystem so that other queries can still run fast. What resources are your long running autovacs eating up. If top shows 500Mres and 499Mshr, then don't worry, it's not actually eating up resources.
Hmm, looks like I've been myth-busted here. $ top | grep -E '29343|31924|29840|PID'; echo PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 29840 postgres 15 0 2150m 203m 194m S 0 2.5 0:00.59 postmaster 29343 postgres 15 0 2137m 360m 356m S 1 4.5 0:00.92 postmaster 31924 postgres 15 0 2135m 73m 70m S 1 0.9 0:00.44 postmaster So my claims of resource-usage appear incorrect. I'd seen autovacs running for hours and had mis-attributed this to growing query times on those tables - my thought was that "shrinking" the tables "more quickly" could make them "more-optimized", more often. Sounds like I could be chasing the wrong symptoms, though. wb > Quoting Scott Marlowe <scott.marlowe@gmail.com>: > > Autovac running slow is (generally) a good thing. It reduces the load > on your IO subsystem so that other queries can still run fast. What > resources are your long running autovacs eating up. If top shows > 500Mres and 499Mshr, then don't worry, it's not actually eating up > resources. > Quoting Tom Lane <tgl@sss.pgh.pa.us>: > > autovacuum_vacuum_cost_delay. Is the slow autovac *really* eating > a noticeable amount of system resources? I would think that a full > speed manual vacuum would be a lot worse. >> Wayne Beaver <wayne@acedsl.com> writes: >> >> Running Pg 8.3RC2, Linux server, w/8GB RAM, OpenSuSE 10.2 OS (yes, I >> know that's old). I have seen *really* long-running autovacs eating up >> system resources. While the below is not an example of *really* long, >> it shows how I killed an autovac which had been running for more than >> 10 minutes, then ran a VAC FULL ANALYZE on same exact table in about >> ~2 min. Any wisdom here? Attributable to autovac_worker settings?
On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> wrote: > Hmm, looks like I've been myth-busted here. > > $ top | grep -E '29343|31924|29840|PID'; echo > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 29840 postgres 15 0 2150m 203m 194m S 0 2.5 0:00.59 postmaster > 29343 postgres 15 0 2137m 360m 356m S 1 4.5 0:00.92 postmaster > 31924 postgres 15 0 2135m 73m 70m S 1 0.9 0:00.44 postmaster > > So my claims of resource-usage appear incorrect. > > I'd seen autovacs running for hours and had mis-attributed this to growing > query times on those tables - my thought was that "shrinking" the tables > "more quickly" could make them "more-optimized", more often. Sounds like I > could be chasing the wrong symptoms, though. Now it is quite possible that a slow autovac is causing your queries to run slower. And it's that autovac isn't keeping up. One of the verious serious shortcomings of autovac in 8.1 (or was it 8.0? I think it was 8.1 as well) was that it only had one worker thread. So, if it has a moderate to high cost delay, then it might be able to keep up with the job and your tables will become bloated. The problem isn't that autovac is stealing too many resources, it's that it's not stealing enough. The first quick fix is 8.3 which has more efficient vacuuming code and the ability to run > 1 thread (it defaults to 3) so you can still keep it "detuned" to stay out of the way, but with enough threads it can hopefully keep up. Of course, eventually you reach the point where as the work load rises the ability of autovac to keep up is lost, and then you need more IO period. Whether pgsql or any other database, running out of io bandwidth is only really solvable by more IO bandwidth. So, what does iostat -x 10 say about utilization?
On Thu, Nov 12, 2009 at 9:33 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> wrote: >> Hmm, looks like I've been myth-busted here. >> >> $ top | grep -E '29343|31924|29840|PID'; echo >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >> 29840 postgres 15 0 2150m 203m 194m S 0 2.5 0:00.59 postmaster >> 29343 postgres 15 0 2137m 360m 356m S 1 4.5 0:00.92 postmaster >> 31924 postgres 15 0 2135m 73m 70m S 1 0.9 0:00.44 postmaster >> >> So my claims of resource-usage appear incorrect. >> >> I'd seen autovacs running for hours and had mis-attributed this to growing >> query times on those tables - my thought was that "shrinking" the tables >> "more quickly" could make them "more-optimized", more often. Sounds like I >> could be chasing the wrong symptoms, though. > > Now it is quite possible that a slow autovac is causing your queries > to run slower. And it's that autovac isn't keeping up. One of the > verious serious shortcomings of autovac in 8.1 (or was it 8.0? I > think it was 8.1 as well) was that it only had one worker thread. So, > if it has a moderate to high cost delay, then it might be able to keep > up with the job and your tables will become bloated. might NOT be able to keep up > > The problem isn't that autovac is stealing too many resources, it's > that it's not stealing enough. > > The first quick fix is 8.3 which has more efficient vacuuming code and Whoops I see you're technically running 8.3, but you're running RC2 for some reason? I don't usually run 8.x.0 in production. Let alone RCs. You should really update before some nasty bug that's been squashed in later releases bites you.
> Quoting Scott Marlowe <scott.marlowe@gmail.com>: > >>> On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>> I'd seen autovacs running for hours and had mis-attributed this to growing >>> query times on those tables - my thought was that "shrinking" the tables >>> "more quickly" could make them "more-optimized", more often. Sounds like I >>> could be chasing the wrong symptoms, though. >> >> Now it is quite possible that a slow autovac is causing your queries >> to run slower. So, if it has a moderate to high cost delay, then >> it might not be able to keep >> up with the job and your tables will become bloated. >> >> The problem isn't that autovac is stealing too many resources, it's >> that it's not stealing enough. >> >> I see you're technically running 8.3, but you're running RC2 >> for some reason? I don't usually run 8.x.0 in production. Let alone >> RCs. You should really update before some nasty bug that's been >> squashed in later releases bites you. Hahaha. Yes, 8.3RC2 was latest version at time I implemented related client app. Install is "production-like", more so than production - non-mission-critical, but important to some "VIP-like" users at intervals which are not necessarily predictable. I'm long past my goal of migrating to 8.4, actually... My autovac settings are all at default values, so sounds like I can at least tinker with _workers and _cost_delay. I've not yet gotten to you iostat inquiry from your previous response... wb
On Thu, Nov 12, 2009 at 9:58 AM, Wayne Beaver <wayne@acedsl.com> wrote: >> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >> >>>> On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>>> I'd seen autovacs running for hours and had mis-attributed this to >>>> growing >>>> query times on those tables - my thought was that "shrinking" the >>>> tables >>>> "more quickly" could make them "more-optimized", more often. Sounds like >>>> I >>>> could be chasing the wrong symptoms, though. >>> >>> Now it is quite possible that a slow autovac is causing your queries >>> to run slower. So, if it has a moderate to high cost delay, then it >>> might not be able to keep >>> up with the job and your tables will become bloated. >>> >>> The problem isn't that autovac is stealing too many resources, it's >>> that it's not stealing enough. >>> >>> I see you're technically running 8.3, but you're running RC2 >>> for some reason? I don't usually run 8.x.0 in production. Let alone >>> RCs. You should really update before some nasty bug that's been >>> squashed in later releases bites you. > > > Hahaha. Yes, 8.3RC2 was latest version at time I implemented related client > app. Install is "production-like", more so than production - > non-mission-critical, but important to some "VIP-like" users at intervals > which are not necessarily predictable. I'm long past my goal of migrating to > 8.4, actually... Worry far more about being out of date on 8.3. Since you're on an rc release you'll likely need to dump and restore to safely migrate to 8.3.latest, but once there, simply shutting down, updating and starting up is all that's usually required. > > My autovac settings are all at default values, so sounds like I can at least > tinker with _workers and _cost_delay. I've not yet gotten to you iostat > inquiry from your previous response... Don't worry too much, just want to see if your IO system is maxed out.
The autovac may have done most of the work before you killed it ... I'm new to Postgres, but from limited subjective experience, it seems it's a lot faster to vaccum ranges of blocks that are were recently vacuumed (at minimum, a good chunk of table will have been brought into buffer cache by both Postgres and the OS during the prior pass). I've found that with very large data tables, the auto-vaccum on default settings isn't as aggressive as I'd like ... I find running a VACUUM ANALYZE isn't at all intrusive, though I prefer to do it once a day at 3am. Beware that VACUUM FULL locks an entire table at a time :-) Cheers Dave On Thu, Nov 12, 2009 at 8:33 AM, Wayne Beaver <wayne@acedsl.com> wrote: > Hi All, > > Running Pg 8.3RC2, Linux server, w/8GB RAM, OpenSuSE 10.2 OS (yes, I know > that's old). I have seen *really* long-running autovacs eating up system > resources. While the below is not an example of *really* long, it shows how > I killed an autovac which had been running for more than 10 minutes, then > ran a VAC FULL ANALYZE on same exact table in about ~2 min. Any wisdom here? > Attributable to autovac_worker settings? Or Pg version? Other? > > Any insight appreciated. > > wb > > ++++++++++++++++++++++++++ > > $ psql template1 -c "SELECT procpid, current_query, to_char (now() - > backend_start, 'HH24:MI:SS') AS connected_et, to_char (now() - > query_start,'HH24:MI:SS') AS query_et FROM pg_stat_activity WHERE > datname='mydb' ORDER BY query_et DESC LIMIT 1" > > procpid | current_query | connected_et | > query_et > ---------+--------------------------------------------+--------------+---------- > 9064 | autovacuum: VACUUM ANALYZE myschema.mytable | 00:12:07 | > 00:11:38 > > > > $ kill 9064 > > > $ date; psql mydb -c "VACUUM FULL ANALYZE myschema.mytable"; date > Wed Nov 11 17:25:41 UTC 2009 > VACUUM > Wed Nov 11 17:27:59 UTC 2009 > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >
On 13/11/2009 2:29 PM, Dave Crooke wrote: > Beware that VACUUM FULL locks an entire table at a time :-) ... and often bloats its indexes horribly. Use CLUSTER instead if you need to chop a table that's massively bloated down to size; it'll be much faster, and shouldn't leave the indexes in a mess. I increasingly wonder what the purpose of VACUUM FULL in its current form is. -- Craig Ringer
On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer <craig@postnewspapers.com.au> wrote: > On 13/11/2009 2:29 PM, Dave Crooke wrote: > >> Beware that VACUUM FULL locks an entire table at a time :-) > > ... and often bloats its indexes horribly. Use CLUSTER instead if you > need to chop a table that's massively bloated down to size; it'll be > much faster, and shouldn't leave the indexes in a mess. > > I increasingly wonder what the purpose of VACUUM FULL in its current > form is. There's been talk of removing it. It's almost historical in nature now, but there are apparently one or two situations, like when you're almost out of space, that vacuum full can handle that dumping reload or cluster or whatnot can't do without more extra space.
On 14/11/2009 11:55 AM, Scott Marlowe wrote: > On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer > <craig@postnewspapers.com.au> wrote: >> On 13/11/2009 2:29 PM, Dave Crooke wrote: >> >>> Beware that VACUUM FULL locks an entire table at a time :-) >> >> ... and often bloats its indexes horribly. Use CLUSTER instead if you >> need to chop a table that's massively bloated down to size; it'll be >> much faster, and shouldn't leave the indexes in a mess. >> >> I increasingly wonder what the purpose of VACUUM FULL in its current >> form is. > > There's been talk of removing it. It's almost historical in nature > now, but there are apparently one or two situations, like when you're > almost out of space, that vacuum full can handle that dumping reload > or cluster or whatnot can't do without more extra space. Perhaps it should drop and re-create indexes as well, then? (Or disable them so they become inconsistent, then REINDEX them - same deal). It'd run a LOT faster, and the index bloat issue would be gone. The current form of the command just invites misuse and misapplication. -- Craig Ringer
On Fri, Nov 13, 2009 at 9:45 PM, Craig Ringer <craig@postnewspapers.com.au> wrote: > On 14/11/2009 11:55 AM, Scott Marlowe wrote: >> On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer >> <craig@postnewspapers.com.au> wrote: >>> On 13/11/2009 2:29 PM, Dave Crooke wrote: >>> >>>> Beware that VACUUM FULL locks an entire table at a time :-) >>> >>> ... and often bloats its indexes horribly. Use CLUSTER instead if you >>> need to chop a table that's massively bloated down to size; it'll be >>> much faster, and shouldn't leave the indexes in a mess. >>> >>> I increasingly wonder what the purpose of VACUUM FULL in its current >>> form is. >> >> There's been talk of removing it. It's almost historical in nature >> now, but there are apparently one or two situations, like when you're >> almost out of space, that vacuum full can handle that dumping reload >> or cluster or whatnot can't do without more extra space. > > Perhaps it should drop and re-create indexes as well, then? (Or disable > them so they become inconsistent, then REINDEX them - same deal). It'd > run a LOT faster, and the index bloat issue would be gone. > > The current form of the command just invites misuse and misapplication. Yeah, it should be a name that when you're typing it you know you screwed up to get where you are. The opleasemayihavebackthespaceilostwhilelockingmytablesandbloatingmyindexes command. No chance you'll run it by mistake either!
> Quoting Scott Marlowe <scott.marlowe@gmail.com>: > >> On Thu, Nov 12, 2009 at 9:58 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >>> >>>>> On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>>>> I'd seen autovacs running for hours and had mis-attributed this to >>>>> growing query times on those tables - my thought was that >>>>> "shrinking" the tables >>>>> "more quickly" could make them "more-optimized", more often. Sounds like >>>>> could be chasing the wrong symptoms, though. >>>> >>>> Now it is quite possible that a slow autovac is causing your queries >>>> to run slower. So, if it has a moderate to high cost delay, then it >>>> might not be able to keep >>>> up with the job and your tables will become bloated. >>>> >>>> The problem isn't that autovac is stealing too many resources, it's >>>> that it's not stealing enough. >>>> >> I've not yet gotten to you iostat inquiry from your previous response... > > Don't worry too much, just want to see if your IO system is maxed out. $ iostat Linux 2.6.18.8-0.9-default (myserver) 11/16/2009 avg-cpu: %user %nice %system %iowait %steal %idle 28.11 3.13 6.50 8.71 0.00 53.56 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 153.08 7295.23 3675.59 123127895363 62036043656\
On Mon, Nov 16, 2009 at 9:13 AM, Wayne Beaver <wayne@acedsl.com> wrote: >> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >> >>> On Thu, Nov 12, 2009 at 9:58 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>>> >>>> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >>>> >>>>>> On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> >>>>>> wrote: >>>>>> I'd seen autovacs running for hours and had mis-attributed this to >>>>>> growing query times on those tables - my thought was that "shrinking" >>>>>> the tables >>>>>> "more quickly" could make them "more-optimized", more often. Sounds >>>>>> like >>>>>> could be chasing the wrong symptoms, though. >>>>> >>>>> Now it is quite possible that a slow autovac is causing your queries >>>>> to run slower. So, if it has a moderate to high cost delay, then it >>>>> might not be able to keep >>>>> up with the job and your tables will become bloated. >>>>> >>>>> The problem isn't that autovac is stealing too many resources, it's >>>>> that it's not stealing enough. >>>>> >>> I've not yet gotten to you iostat inquiry from your previous response... >> >> Don't worry too much, just want to see if your IO system is maxed out. > > > $ iostat > Linux 2.6.18.8-0.9-default (myserver) 11/16/2009 > > avg-cpu: %user %nice %system %iowait %steal %idle > 28.11 3.13 6.50 8.71 0.00 53.56 > > Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn > sda 153.08 7295.23 3675.59 123127895363 62036043656\ That's just since the machine was turned on. run it like: iostat -x 10 and see what comes out after the first one.
Quoting Scott Marlowe <scott.marlowe@gmail.com>: > On Mon, Nov 16, 2009 at 9:13 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >>> >>>> On Thu, Nov 12, 2009 at 9:58 AM, Wayne Beaver <wayne@acedsl.com> wrote: >>>>> >>>>> Quoting Scott Marlowe <scott.marlowe@gmail.com>: >>>>> >>>>>>> On Thu, Nov 12, 2009 at 9:14 AM, Wayne Beaver <wayne@acedsl.com> >>>>>>> wrote: >>>>>>> I'd seen autovacs running for hours and had mis-attributed this to >>>>>>> growing query times on those tables - my thought was that "shrinking" >>>>>>> the tables >>>>>>> "more quickly" could make them "more-optimized", more often. Sounds >>>>>>> like >>>>>>> could be chasing the wrong symptoms, though. >>>>>> >>>>>> Now it is quite possible that a slow autovac is causing your queries >>>>>> to run slower. So, if it has a moderate to high cost delay, then it >>>>>> might not be able to keep >>>>>> up with the job and your tables will become bloated. >>>>>> >>>>>> The problem isn't that autovac is stealing too many resources, it's >>>>>> that it's not stealing enough. >>>>>> >>>> I've not yet gotten to you iostat inquiry from your previous response... >>> >>> Don't worry too much, just want to see if your IO system is maxed out. >> > > That's just since the machine was turned on. run it like: > > iostat -x 10 > > and see what comes out after the first one. Duh! Sorry about that... $ iostat -x 10 Linux 2.6.18.8-0.9-default (myserver) 11/16/2009 avg-cpu: %user %nice %system %iowait %steal %idle 28.11 3.13 6.50 8.70 0.00 53.56 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 3.20 406.34 100.74 52.33 7293.84 3675.79 3646.92 1837.90 71.66 0.07 2.15 0.90 13.71