Re: autovacuum next steps, take 3 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: autovacuum next steps, take 3
Date
Msg-id 4912.1173485896@sss.pgh.pa.us
In response to Re: autovacuum next steps, take 3  ("Matthew T. O'Connor" <matthew@zeut.net>)
Responses Re: autovacuum next steps, take 3  ("Matthew T. O'Connor" <matthew@zeut.net>)
Re: autovacuum next steps, take 3  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> Does a new worker really care about the PID of other workers or what 
> table they are currently working on?

As written, it needs the PIDs so it can read in the other workers' todo
lists (which are in files named by PID).

> It's not clear to me why a worker cares that there is a new worker, 
> since the new worker is going to ignore all the tables that are already 
> claimed by all worker todo lists.

That seems wrong to me, since it means that new workers will ignore
tables that are scheduled for processing by an existing worker, no
matter how far in the future that schedule extends.  As an example,
suppose you have half a dozen large tables in need of vacuuming.
The first worker in will queue them all up, and subsequent workers
will do nothing useful, at least not till the first worker is done
with the first table.  Having the first worker update its todo
list file after each table allows the earlier tables to be exposed
for reconsideration, but that's expensive and it does nothing for
later tables.
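
To make that failure mode concrete, here is a toy Python model of the
"skip anything on another worker's todo list" rule.  It is only a sketch,
not the patch under discussion: the function name, the table names, and
the idea of building every list up front are invented for illustration.

```python
def plan_with_exposed_todo_lists(tables_needing_vacuum, num_workers):
    """Toy model: each new worker claims every table that is not already
    on some earlier worker's exposed todo list (hypothetical helper)."""
    claimed = set()
    plans = []
    for _ in range(num_workers):
        # A new worker ignores tables claimed by any existing todo list.
        todo = [t for t in tables_needing_vacuum if t not in claimed]
        claimed.update(todo)
        plans.append(todo)
    return plans


# Half a dozen large tables, three workers starting one after another.
big_tables = ["big1", "big2", "big3", "big4", "big5", "big6"]
plans = plan_with_exposed_todo_lists(big_tables, 3)
for worker, todo in enumerate(plans):
    print(f"worker {worker}: {todo}")
```

In this model the first worker's list holds all six tables and the later
workers' lists come out empty, which is the situation described above.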

I suggest that maybe we don't need exposed TODO lists at all.  Rather
the workers could have internal TODO lists that are priority-sorted
in some way, and expose only their current table OID in shared memory.
Then the algorithm for processing each table in your list is

    1. Grab the AutovacSchedule LWLock exclusively.
    2. Check to see if another worker is currently processing
       that table; if so drop LWLock and go to next list entry.
    3. Recompute whether table needs vacuuming; if not,
       drop LWLock and go to next entry.  (This test covers the
       case where someone vacuumed the table since you made your
       list.)
    4. Put table OID into shared memory, drop LWLock, then
       vacuum table.
    5. Clear current-table OID from shared memory, then
       repeat for next list entry.
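
The five steps can be sketched roughly as follows.  This is an
illustrative Python model rather than backend code: a threading.Lock
stands in for the AutovacSchedule LWLock, a dict stands in for the
shared-memory slots, and AutovacShared, run_worker, needs_vacuum and
vacuum are all hypothetical names.  Retaking the lock to clear the slot
in step 5 is an assumption of the sketch; the steps above do not say
whether that is needed.

```python
import threading


class AutovacShared:
    """Stand-in for shared memory: one current-table slot per worker."""

    def __init__(self):
        # Plays the role of the AutovacSchedule LWLock (always exclusive here).
        self.schedule_lock = threading.Lock()
        # worker id -> OID of the table that worker is vacuuming right now
        self.current_table = {}


def run_worker(worker_id, todo, shared, needs_vacuum, vacuum):
    """Walk a private, priority-sorted todo list of table OIDs.

    needs_vacuum(oid) and vacuum(oid) are caller-supplied callbacks.
    Returns the list of OIDs this worker actually vacuumed.
    """
    done = []
    for oid in todo:
        with shared.schedule_lock:                        # step 1
            if oid in shared.current_table.values():      # step 2
                continue                                  # lock dropped on exit
            if not needs_vacuum(oid):                     # step 3
                continue
            shared.current_table[worker_id] = oid         # step 4: publish OID
        try:
            vacuum(oid)                                   # lock is not held here
            done.append(oid)
        finally:
            with shared.schedule_lock:                    # step 5: clear slot
                shared.current_table.pop(worker_id, None)
    return done
```

Because the todo list itself never leaves the worker, the only thing
another worker can observe is the single table being vacuumed at that
moment.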
 

This creates a behavior of "whoever gets to it first" rather than
allowing workers to claim tables that they actually won't be able
to service any time soon.
        regards, tom lane

