Thread: CPU-intensive autovacuuming
Dear PostgreSQL experts,

I'm trying to work out why my system spends so much time autovacuuming. It looks like pg_autovacuum wakes up every few minutes, does a query on some system tables, and then spins doing no more queries but burning all available CPU cycles for a couple of minutes before sleeping again.

I'm logging all queries to syslog, and the only queries that autovacuum seems to run are about 3 selecting from pg_class and some stats tables. They complete in a couple of seconds. I would see VACUUM commands in there as well (wouldn't I?) if it was actually doing anything, but I don't. Since not much is happening on the system I'm not surprised that it decides that nothing needs vacuuming. But instead I think it "spins"; and it's the pg_autovacuum process, not a postmaster, that is taking all the CPU.

I wonder if this is because I have a lot of tables (about 50,000 I think - I have one schema per user and each schema has a couple of dozen tables). Could it be that there is some code in autovacuum that is O(n^2) in the number of tables, or something like that?

Has anyone seen anything like this before? Any debugging suggestions?

This is with the Debian package of 7.4.2, and all the default autovacuum settings.

Cheers, Phil.
Following up on my own post from last night:

> Could it be that there is some code in autovacuum that is O(n^2) in
> the number of tables?

Browsing the code using webcvs, I have found this:

    for (j = 0; j < PQntuples(res); j++)
    {
        tbl_elem = DLGetHead(dbs->table_list);
        while (tbl_elem != NULL)
        {

I haven't really tried to understand what is going on in here, but it does look like it is getting the result of the "pg_class join stats" query and then matching it up against its internal list of tables using nested loops, which is undoubtedly O(n^2) in the number of tables.

Have I correctly understood what is going on here?

--Phil.
Phil Endecott wrote:
> Following up on my own post from last night:
>
> > Could it be that there is some code in autovacuum that is O(n^2) in
> > the number of tables?
>
> Browsing the code using webcvs, I have found this:
>
>     for (j = 0; j < PQntuples(res); j++)
>     {
>         tbl_elem = DLGetHead(dbs->table_list);
>         while (tbl_elem != NULL)
>         {
>
> I haven't really tried to understand what is going on in here, but it
> does look like it is getting the result of the "pg_class join stats"
> query and then matching it up against its internal list of tables
> using nested loops, which is undoubtedly O(n^2) in the number of tables.
>
> Have I correctly understood what is going on here?

Indeed you have. I have heard a few similar reports, but perhaps none as bad as yours. One person put in a small sleep value so that it doesn't spin so tight. You could also just raise the sleep delay so that it doesn't do this work quite so often. No other quick suggestions.
Matthew T. O'Connor wrote:
> Phil Endecott wrote:
>> > Could it be that there is some code in autovacuum that is O(n^2) in
>> > the number of tables?
>>
>> Browsing the code using webcvs, I have found this:
>>
>>     for (j = 0; j < PQntuples(res); j++)
>>     {
>>         tbl_elem = DLGetHead(dbs->table_list);
>>         while (tbl_elem != NULL)
>>         {
>>
>> Have I correctly understood what is going on here?
>
> Indeed you have. I have heard a few similar reports but perhaps none as
> bad as yours. One person put in a small sleep value so that it doesn't
> spin so tight. You could also just up the sleep delay so that it
> doesn't do this work quite so often. No other quick suggestions.

I do wonder why autovacuum is keeping its table list in memory rather than in the database.

But given that it is keeping it in memory, I think the real fix is to sort that list (or keep it ordered when building or updating it). It is trivial to also get the query results ordered, and they can then be compared in O(n) time.

I notice various other places where there seem to be nested loops, e.g. in the update_table_list function. I'm not sure if they can be fixed by similar means.

--Phil.
Phil Endecott wrote: > Matthew T. O'Connor wrote: > >> Indeed you have. I have head a few similar reports but perhaps none >> as bad as yours. One person put a small sleep value so that it >> doesn't spin so tight. You could also just up the sleep delay so >> that it doesn't do this work quite so often. No other quick >> suggestions. > > > I do wonder why autovacuum is keeping its table list in memory rather > than in the database. For better or worse, this was a conscious design decision that the contrib version of autovacuum be non-invasive to your database. > But given that it is keeping it in memory, I think the real fix is to > sort that list (or keep it ordered when building or updating it). It > is trivial to also get the query results ordered, and they can then be > compared in O(n) time. I'm quite sure there is a better way, please submit a patch if you can. This was never a real concern for most people since the number of tables is typically small enough not to be a problem. The integrated version of autovacuum that didn't make the cut before 8.0 avoids this problem since the autovacuum data is stored in the database. > I notice various other places where there seem to be nested loops, > e.g. in the update_table_list function. I'm not sure if they can be > fixed by similar means. I would think so, they all basically do the same type of loop.
Matthew T. O'Connor wrote: > The integrated version > of autovacuum that didn't make the cut before 8.0 avoids this problem > since the autovacuum data is stored in the database. What is the status of this? Is it something that will be included in 8.1 or 8.0.n? I might be able to patch the current code but that doesn't seem like a useful thing to do if a better solution will arrive eventually. I am currently running vacuums from a cron job and I think I will be happy with that for the time being. (Incidentally, I have also found that the indexes on my pg_attributes table were taking up over half a gigabyte, which came down to less than 40 megs after reindexing them. Is there a case for having autovacuum also call reindex?) --Phil.
Phil Endecott wrote:
> Matthew T. O'Connor wrote:
>
>> The integrated version of autovacuum that didn't make the cut before
>> 8.0 avoids this problem since the autovacuum data is stored in the
>> database.
>
> What is the status of this? Is it something that will be included in
> 8.1 or 8.0.n? I might be able to patch the current code but that
> doesn't seem like a useful thing to do if a better solution will
> arrive eventually. I am currently running vacuums from a cron job and
> I think I will be happy with that for the time being.

This is a good question :-) I have been so busy with work lately that I have not been able to work on it. I am currently trying to resurrect the patch I sent in for 8.0 and update it so that it applies against HEAD. Once that is done, I will need help from someone with the portions of the work that I'm not comfortable with or capable of. The main issue with the version I created during the 8.0 devel cycle is that it used libpq to connect, query and issue commands against the databases. This was deemed bad, and I need help setting up the infrastructure to make this happen without libpq. I hope to have my patch applying against HEAD sometime this week, but it probably won't happen till next week.

So the summary of the autovacuum integration status is that we are fast running out of time (feature freeze July 1), and I have very little time to devote to this task. So you might want to submit your O(n) patch, because unfortunately it looks like integrated autovacuum might slip another release unless someone else steps up to work on it.

> (Incidentally, I have also found that the indexes on my pg_attributes
> table were taking up over half a gigabyte, which came down to less
> than 40 megs after reindexing them. Is there a case for having
> autovacuum also call reindex?)

Yes, there is certainly some merit to having autovacuum or something similar perform other system maintenance tasks such as reindexing. I just haven't taken it there yet.
It does seem strange that your pg_attributes table got that big; anyone have any insight here? You did say you are using 7.4.2 - I forget if that has the index space reclaiming code in vacuum, and there are also some autovacuum bugs in the early 7.4.x releases. You might try to upgrade to either 8.0.x or a later 7.4.x release.

Matthew O'Connor
Phil Endecott wrote: > Matthew T. O'Connor wrote: > > The integrated version > > of autovacuum that didn't make the cut before 8.0 avoids this problem > > since the autovacuum data is stored in the database. > > What is the status of this? Is it something that will be included in > 8.1 or 8.0.n? I might be able to patch the current code but that > doesn't seem like a useful thing to do if a better solution will arrive > eventually. I am currently running vacuums from a cron job and I think > I will be happy with that for the time being. I will post about integrating pg_autovacuum into the backend for 8.1 in a few minutes. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Phil Endecott <spam_from_postgresql_general@chezphil.org> writes: > (Incidentally, I have also found that the indexes on my pg_attributes > table were taking up over half a gigabyte, which came down to less than > 40 megs after reindexing them. Is there a case for having autovacuum > also call reindex?) Lots of temp tables I suppose? If so that's not autovacuum's fault; it wasn't getting told about the activity in pg_attribute until this patch: 2005-03-31 18:20 tgl * src/backend/postmaster/: pgstat.c (REL7_4_STABLE), pgstat.c (REL8_0_STABLE), pgstat.c: Flush any remaining statistics counts out to the collector at process exit. Without this, operations triggered during backend exit (such as temp table deletions) won't be counted ... which given heavy usage of temp tables can lead to pg_autovacuum falling way behind on the need to vacuum pg_class and pg_attribute. Per reports from Steve Crawford and others. Unless the bloat occurred after you updated to 8.0.2, there's no issue. regards, tom lane
Phil,

If you complete this patch, I'm very interested to see it.

I think I'm the person Matthew is talking about who inserted a sleep value. Because of the sheer number of tables involved, even small values of sleep caused pg_autovacuum to iterate too slowly over its table lists to be of use in a production environment (where I still find its behavior to be preferable to a complicated list of manual vacuums performed in cron).

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 7, 2005, at 6:16 AM, Phil Endecott wrote:

> Matthew T. O'Connor wrote:
>
>> Phil Endecott wrote:
>>
>>> > Could it be that there is some code in autovacuum that is O(n^2) in
>>> > the number of tables?
>>>
>>> Browsing the code using webcvs, I have found this:
>>>
>>>     for (j = 0; j < PQntuples(res); j++)
>>>     {
>>>         tbl_elem = DLGetHead(dbs->table_list);
>>>         while (tbl_elem != NULL)
>>>         {
>>>
>>> Have I correctly understood what is going on here?
>>
>> Indeed you have. I have heard a few similar reports but perhaps
>> none as bad as yours. One person put in a small sleep value so that
>> it doesn't spin so tight. You could also just up the sleep delay
>> so that it doesn't do this work quite so often. No other quick
>> suggestions.
>
> I do wonder why autovacuum is keeping its table list in memory
> rather than in the database.
>
> But given that it is keeping it in memory, I think the real fix is
> to sort that list (or keep it ordered when building or updating
> it). It is trivial to also get the query results ordered, and they
> can then be compared in O(n) time.
>
> I notice various other places where there seem to be nested loops,
> e.g. in the update_table_list function. I'm not sure if they can
> be fixed by similar means.
>
> --Phil.
--- "Thomas F. O'Connell" <tfo@sitening.com> wrote:
> Phil,
>
> If you complete this patch, I'm very interested to see it.
>
> I think I'm the person Matthew is talking about who inserted a sleep
> value. Because of the sheer number of tables involved, even small
> values of sleep caused pg_autovacuum to iterate too slowly over its
> table lists to be of use in a production environment (where I still
> find its behavior to be preferable to a complicated list of manual
> vacuums performed in cron).

Were you sleeping every time through the loop? How about something like:

    if (j % 500 == 1)
        usleep(100000);

Regards,

Shelby Cain
I was usleeping in tiny increments in each iteration of the loop. I didn't try breaking it into iterative groups like this.

Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather than O(n^2) table activity. At this point, though, I'm probably not too likely to have much time to hack pg_autovacuum before 8.1 is released, although if it doesn't become integrated by beta feature freeze, I might give it a shot. But I hope if anyone completes the linear improvement, they'll post to the lists.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 10, 2005, at 9:12 AM, Shelby Cain wrote:

> Were you sleeping every time through the loop? How about something like:
>
>     if (j % 500 == 1)
>         usleep(100000);
>
> Regards,
>
> Shelby Cain
Hi everybody
I am trying to write a stored procedure that returns a result set, but it is not working.
this is the function:
///
CREATE OR REPLACE FUNCTION remisiones.fn_get_total_remitidoxprovision1("numeric")
RETURNS SETOF record AS
$BODY$
begin
select rm.provision as provision,
drm.producto as producto,
sum(drm.cantidad) as cantidad
FROM remisiones.remisiones rm, remisiones.detalles_remision drm
WHERE rm.remision = drm.remision and rm.provision = $1
GROUP BY rm.provision, drm.producto
ORDER BY rm.provision, drm.producto;
end;$BODY$
///
If I call this function from the interactive SQL of pgAdmin III I get this result:

    select * from fn_gert_total_remitidosxprovision(1)

     row | refcursor
    -----+-------------------
       1 | <unnamed portal>
Is there a way to display the values of the rows returned? I need it because I have to use it in a DataWindow definition in a PowerBuilder app.
thanks in advance
Hugo
"Thomas F. O'Connell" <tfo@sitening.com> writes: > Honestly, I'd prefer to see pg_autovacuum improved to do O(n) rather > than O(n^2) table activity. At this point, though, I'm probably not > too likely to have much time to hack pg_autovacuum before 8.1 is > released, although if it doesn't become integrated by beta feature > freeze, I might give it a shot. This would be vastly easier to fix if the code were integrated into the backend first. In the backend environment you could just keep the info in a dynahash.c hashtable instead of in a linear list. On the client side, you have to roll your own hashing (or adapt dynahash to life outside the backend environment). regards, tom lane
# htakada@gmail.com / 2005-06-11 08:44:32 -0400: > Hi everybody > > I am trying to write a stored procedure that returns a result set but it is > not working > this is the function: > /// > CREATE OR REPLACE FUNCTION > remisiones.fn_get_total_remitidoxprovision1("numeric") > > RETURNS SETOF record AS > $BODY$ > begin > select rm.provision as provision, > drm.producto as producto, > sum(drm.cantidad) as cantidad > FROM remisiones.remisiones rm, remisiones.detalles_remision drm > WHERE rm.remision = drm.remision and rm.provision = $1 > GROUP BY rm.provision, drm.producto > ORDER BY rm.provision, drm.producto; > end;$BODY$ > > /// http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#AEN32875 -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
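[Editor's note: the linked docs cover RETURN NEXT, which is what the posted function is missing - a PL/pgSQL SETOF function cannot just contain a bare SELECT. A sketch of how it might be rewritten follows; the column types in the calling definition list are guesses, since the original table definitions aren't shown.]

```
CREATE OR REPLACE FUNCTION remisiones.fn_get_total_remitidoxprovision1(numeric)
RETURNS SETOF record AS
$BODY$
DECLARE
    r record;
BEGIN
    FOR r IN
        SELECT rm.provision, drm.producto, sum(drm.cantidad) AS cantidad
        FROM remisiones.remisiones rm, remisiones.detalles_remision drm
        WHERE rm.remision = drm.remision AND rm.provision = $1
        GROUP BY rm.provision, drm.producto
        ORDER BY rm.provision, drm.producto
    LOOP
        RETURN NEXT r;   -- emit one result row per loop iteration
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE plpgsql;

-- A SETOF record function must be called with a column definition list
-- (the types here are assumptions):
SELECT *
FROM remisiones.fn_get_total_remitidoxprovision1(1)
     AS t(provision numeric, producto integer, cantidad numeric);
```

Called this way, the rows come back as an ordinary result set, which a PowerBuilder DataWindow can consume directly instead of an unnamed portal.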