Alvaro Herrera wrote:
> worker to-do list
> -----------------
>
> When each worker starts, it determines which tables to process in the
> usual fashion: get pg_autovacuum and pgstat data and compute the
> equations.
>
> The worker then takes a "snapshot" of what's currently going on in the
> database, by storing worker PIDs, the corresponding table OID that's
> being currently worked, and the to-do list for each worker.
>
> It removes from its to-do list the tables being processed. Finally, it
> writes the list to disk.
>
> The table list will be written to a file in
> PGDATA/vacuum/<database-oid>/todo.<worker-pid>
> The file will consist of table OIDs, in the order in which they are
> going to be vacuumed.
>
> At this point, vacuuming can begin.
>
> Before processing each table, it scans the WorkerInfos to see if there's
> a new worker, in which case it reads its to-do list to memory.
>
> Then it again fetches the tables being processed by other workers in the
> same database, and for each other worker, removes from its own in-memory
> to-do all those tables mentioned in the other lists that appear earlier
> than the current table being processed (inclusive). Then it picks the
> next non-removed table in the list. All of this must be done with the
> Autovacuum LWLock grabbed in exclusive mode, so that no other worker can
> pick the same table (no IO takes place here, because the whole lists
> were saved in memory at the start).

Sorry, I confused matters here by not clarifying on-disk to-do lists
versus in-memory ones. When we write the to-do list to a file, that's the
to-do list that other workers will see. It will not change; when I say
"remove a table from the to-do list", it will be removed from the to-do
list in memory, but the file will not get rewritten.
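
To make the on-disk side concrete, here is a rough sketch in Python
(not the backend C code) of a to-do file that is written once and only
read afterwards. The one-OID-per-line text format and the helper names
write_todo_file and read_todo_file are assumptions for illustration; the
proposal only says the file holds table OIDs in vacuum order.

```python
import os


def write_todo_file(pgdata, database_oid, worker_pid, table_oids):
    """Write a worker's to-do list once; it is never rewritten afterwards.

    Path layout follows the proposal:
    PGDATA/vacuum/<database-oid>/todo.<worker-pid>
    The one-OID-per-line format is an assumption, not part of the proposal.
    """
    directory = os.path.join(pgdata, "vacuum", str(database_oid))
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "todo.%d" % worker_pid)
    with open(path, "w") as f:
        for oid in table_oids:
            f.write("%d\n" % oid)
    return path


def read_todo_file(path):
    """Read another worker's to-do list, preserving the vacuum order."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```

A worker would call write_todo_file once before it starts vacuuming;
other workers call read_todo_file when they notice the new worker, and
keep the result in memory.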

Note that a worker will not remove from its list a table that's in the
to-do list of another worker but not yet processed. It will only remove
those tables that are currently being processed (i.e. they appear in the
shared memory entry for that worker), and any tables that appear _before
that one_ on that particular worker's file.
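
The removal rule above could be sketched like this, again in Python
rather than backend C. The names prune_todo, others_files and
others_current are made up for the example: others_files stands for the
on-disk lists already read into memory (worker PID to ordered OIDs), and
others_current stands for the shared memory entries (worker PID to the
table OID being processed, or None). Taking the Autovacuum LWLock around
the call is not modelled here.

```python
def prune_todo(my_todo, others_files, others_current):
    """Return my in-memory to-do list minus tables other workers reached.

    For each other worker, drop the table it is processing right now and
    every table listed before that one in its file. Tables that appear
    later in its file are kept, since they are not yet processed.
    """
    skip = set()
    for pid, file_list in others_files.items():
        current = others_current.get(pid)
        if current is None:
            # That worker has not picked a table yet: remove nothing.
            continue
        skip.add(current)
        if current in file_list:
            cutoff = file_list.index(current)
            skip.update(file_list[:cutoff + 1])  # inclusive of current
    # The on-disk files are untouched; only the in-memory list shrinks.
    return [oid for oid in my_todo if oid not in skip]


def pick_next(my_todo):
    """Pick the next non-removed table, or None if nothing is left."""
    return my_todo[0] if my_todo else None
```

For instance, with my list [10, 20, 30, 40] and another worker whose file
is [30, 10, 50, 40] and which is currently on table 10, tables 30 and 10
are dropped but 40 stays, because that worker has not reached it yet.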

So this behaves very much like what Tom describes in an email downthread,
not like what Matthew is thinking. In fact I'm thinking that the above
is needlessly complex, and that Tom's proposal is simpler and achieves
pretty much the same effect, so I'll have a look at evolving from that
instead.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support