Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > These are all shared catalogs. There are others, so you may still see
> > more. We got another report for pg_database
> > https://www.postgresql.org/message-id/A9D40BB7-CFD6-46AF-A0A1-249F04878A2A%40amazon.com
> > so I suppose there really is a bug. I don't know what's going on there.
>
> I think it's pretty obvious: autovacuum.c's rule for detecting whether
> some other worker is already processing table X is wrong when X is a
> shared table. I propose the attached patch.

Now that I have actually tried this, it turns out the problem is not
so simple. vacuum.c already has logic to acquire the table-level lock
conditionally, and if the lock is not available it skips the table:

    LOG: skipping vacuum of "pg_shdepend" --- lock not available

so an autovacuum worker is never "stuck" behind another worker trying to
vacuum the same table. This code is already in 9.2. I suppose the only
way for multiple workers to get stuck is the relatively new logic in
lazy_truncate_heap that retries multiple times when the
AccessExclusiveLock (AEL) is not available. I haven't tried to
replicate this yet.

In other words, the patches proposed here would not fix the actual
problem. Back to the drawing board, it seems.
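
To make the two behaviours concrete, here is a minimal Python sketch. It
is not PostgreSQL code: threading.Lock merely stands in for the
table-level lock, and the function names, the attempt count and the wait
interval are all hypothetical. The first function tries the lock once and
skips the table on failure; the second retries a conditional acquire a
bounded number of times, which is the pattern I suspect of keeping
several workers busy on the same table.

```python
import threading
import time


def vacuum_if_lock_free(table_lock, table_name, log):
    """Try the lock once; skip the table instead of waiting for it."""
    if not table_lock.acquire(blocking=False):
        # Mirrors the message quoted above; the worker moves on.
        log.append('skipping vacuum of "%s" --- lock not available' % table_name)
        return False
    try:
        return True  # the actual vacuum work would happen here
    finally:
        table_lock.release()


def truncate_with_retries(table_lock, attempts, wait_seconds):
    """Retry a conditional acquire a bounded number of times (hypothetical values)."""
    for _ in range(attempts):
        if table_lock.acquire(blocking=False):
            try:
                return True  # the truncation would happen here
            finally:
                table_lock.release()
        # Lock still held elsewhere: sleep briefly, then try again.
        time.sleep(wait_seconds)
    return False  # gave up after the last attempt
```

With the lock held elsewhere, the first function returns immediately,
while the second occupies the worker for up to attempts * wait_seconds
before giving up.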
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services