Thread: patch for pg_autovacuum 8.0.x prevent segv for dropped tables
Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit of SEGVing and exiting when a table gets dropped out from under it. This creates problems if you rely on pg_autovacuum for the bulk of your vacuuming as it forgets it's statistics when it is restarted and so will skip some desireable vacuums. I looked at the new autovacuum in 8.1 and it appears from casual inspection not to have the same problem. Below is a patch for this that should apply against any 8.0.x. The change verifies that the catalog query returned some rows before accessing the row data. -dg diff -Naur source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c --- source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-04-02 16:02:03.000000000 -0800 +++ build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-09-28 22:15:25.428710172 -0700 @@ -1013,6 +1013,7 @@ static void perform_maintenance_command(db_info * dbi, tbl_info * tbl, int operation) { + PGresult *res; char buf[256]; /* @@ -1069,10 +1070,16 @@ fflush(LOGOUTPUT); } - send_query(buf, dbi); - - update_table_thresholds(dbi, tbl, operation); - + res = send_query(buf, dbi); + if (PQntuples(res)) { + update_table_thresholds(dbi, tbl, operation); + } else { + if (args->debug >= 1) { + sprintf(logbuffer, "Cannot refind table %s", tbl->table_name); + log_entry(logbuffer, LVL_DEBUG); + fflush(LOGOUTPUT); + } + } if (args->debug >= 2) print_table_info(tbl); } -- David Gould daveg@sonic.net If simplicity worked, the world would be overrun with insects.
Looks reasonable to me. All the patch does is make sure that the result set is valid. Probably a check I should have done from the beginning, or pg _autovacuum should be locking tables to make sure they aren't dropped, but that sounds too intrusive, this is probably better. Matt daveg wrote: > Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit > of SEGVing and exiting when a table gets dropped out from under it. This > creates problems if you rely on pg_autovacuum for the bulk of your vacuuming > as it forgets it's statistics when it is restarted and so will skip some > desireable vacuums. > > I looked at the new autovacuum in 8.1 and it appears from casual inspection > not to have the same problem. > > Below is a patch for this that should apply against any 8.0.x. The change > verifies that the catalog query returned some rows before accessing the row > data. > > -dg > > diff -Naur source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c > --- source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-04-02 16:02:03.000000000 -0800 > +++ build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-09-28 22:15:25.428710172 -0700 > @@ -1013,6 +1013,7 @@ > static void > perform_maintenance_command(db_info * dbi, tbl_info * tbl, int operation) > { > + PGresult *res; > char buf[256]; > > /* > @@ -1069,10 +1070,16 @@ > fflush(LOGOUTPUT); > } > > - send_query(buf, dbi); > - > - update_table_thresholds(dbi, tbl, operation); > - > + res = send_query(buf, dbi); > + if (PQntuples(res)) { > + update_table_thresholds(dbi, tbl, operation); > + } else { > + if (args->debug >= 1) { > + sprintf(logbuffer, "Cannot refind table %s", tbl->table_name); > + log_entry(logbuffer, LVL_DEBUG); > + fflush(LOGOUTPUT); > + } > + } > if (args->debug >= 2) > print_table_info(tbl); > } >
Small nit: please observe postgres community conventions regarding a) indentation (BSD style, tabsize 4) and b) diff type (context, not unidiff) The patch itself looks ok to me. cheers andrew daveg wrote: >Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit >of SEGVing and exiting when a table gets dropped out from under it. This >creates problems if you rely on pg_autovacuum for the bulk of your vacuuming >as it forgets it's statistics when it is restarted and so will skip some >desireable vacuums. > >I looked at the new autovacuum in 8.1 and it appears from casual inspection >not to have the same problem. > >Below is a patch for this that should apply against any 8.0.x. The change >verifies that the catalog query returned some rows before accessing the row >data. > >-dg > >diff -Naur source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c >--- source/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-04-02 16:02:03.000000000 -0800 >+++ build/postgresql-8.0.2/contrib/pg_autovacuum/pg_autovacuum.c 2005-09-28 22:15:25.428710172 -0700 >@@ -1013,6 +1013,7 @@ > static void > perform_maintenance_command(db_info * dbi, tbl_info * tbl, int operation) > { >+ PGresult *res; > char buf[256]; > > /* >@@ -1069,10 +1070,16 @@ > fflush(LOGOUTPUT); > } > >- send_query(buf, dbi); >- >- update_table_thresholds(dbi, tbl, operation); >- >+ res = send_query(buf, dbi); >+ if (PQntuples(res)) { >+ update_table_thresholds(dbi, tbl, operation); >+ } else { >+ if (args->debug >= 1) { >+ sprintf(logbuffer, "Cannot refind table %s", tbl->table_name); >+ log_entry(logbuffer, LVL_DEBUG); >+ fflush(LOGOUTPUT); >+ } >+ } > if (args->debug >= 2) > print_table_info(tbl); > } > >
daveg <daveg@sonic.net> writes: > Below is a patch for this that should apply against any 8.0.x. The change > verifies that the catalog query returned some rows before accessing the row > data. Surely this is completely broken? AFAICT you are testing the result from a VACUUM or ANALYZE command, which is not going to return any tuples. regards, tom lane
Tom Lane wrote: > daveg <daveg@sonic.net> writes: > >> Below is a patch for this that should apply against any 8.0.x. The change >> verifies that the catalog query returned some rows before accessing the row >> data. >> > > Surely this is completely broken? AFAICT you are testing the result > from a VACUUM or ANALYZE command, which is not going to return any > tuples. Upon further inspection, I think you are right. I would think that instead of checking the query result with PQntuples, it should probably be checked with |PQresultStatus. Matt |
"Matthew T. O'Connor" <matthew@zeut.net> writes: > Tom Lane wrote: >> Surely this is completely broken? AFAICT you are testing the result >> from a VACUUM or ANALYZE command, which is not going to return any >> tuples. > Upon further inspection, I think you are right. I would think that > instead of checking the query result with PQntuples, it should probably > be checked with |PQresultStatus. ISTM this is the wrong place to test at all. I put a PQntuples check into update_table_thresholds() instead, which seems a far more direct defense against trouble. (Consider eg the case where someone drops the table just after your VACUUM completes successfully. Also there are drop/rename scenarios to think about: success of the VACUUM proves that there is a table named FOO, not that there is still a table with the OID you have on record.) regards, tom lane
Tom Lane wrote: >daveg <daveg@sonic.net> writes: > > >>Below is a patch for this that should apply against any 8.0.x. The change >>verifies that the catalog query returned some rows before accessing the row >>data. >> >> > >Surely this is completely broken? AFAICT you are testing the result >from a VACUUM or ANALYZE command, which is not going to return any >tuples. > > I guess he should change if (PQntuples(res)) to if (|PQresultStatus|(res) == PGRES_COMMAND_OK) ? cheers andrew
On Thu, Oct 20, 2005 at 12:30:27PM -0400, Tom Lane wrote: > "Matthew T. O'Connor" <matthew@zeut.net> writes: > > Tom Lane wrote: > >> Surely this is completely broken? AFAICT you are testing the result > >> from a VACUUM or ANALYZE command, which is not going to return any > >> tuples. > > > Upon further inspection, I think you are right. I would think that > > instead of checking the query result with PQntuples, it should probably > > be checked with |PQresultStatus. > > ISTM this is the wrong place to test at all. I put a PQntuples check > into update_table_thresholds() instead, which seems a far more direct > defense against trouble. (Consider eg the case where someone drops the > table just after your VACUUM completes successfully. Also there are > drop/rename scenarios to think about: success of the VACUUM proves that > there is a table named FOO, not that there is still a table with the OID > you have on record.) Yes, I agree, update_table_thresholds() is the right place for the check. Please ignore the earlier patch. Thanks -dg -- David Gould daveg@sonic.net If simplicity worked, the world would be overrun with insects.