Re: autovacuum locking question - Mailing list pgsql-performance

From Jeff Janes
Subject Re: autovacuum locking question
Date
Msg-id CAMkU=1wmmOvKGC+_QqHBE=nM=nqPBq=DJrX8Sk3CeQ+krAcEkA@mail.gmail.com
Whole thread Raw
In response to autovacuum locking question  (Mike Schanne <mschanne@kns.com>)
Responses RE: autovacuum locking question  (Mike Schanne <mschanne@kns.com>)
Re: autovacuum locking question  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-performance
On Thu, Dec 5, 2019 at 5:26 PM Mike Schanne <mschanne@kns.com> wrote:

Hi,

I am investigating a performance problem in our application and am seeing something unexpected in the postgres logs regarding the autovacuum.

 

2019-12-01 13:05:39.029 UTC,"wb","postgres",6966,"127.0.0.1:53976",5ddbd990.1b36,17099,"INSERT waiting",2019-11-25 13:39:28 UTC,12/1884256,12615023,LOG,00000,"process 6966 still waiting for RowExclusiveLock on relation 32938 of database 32768 after 1000.085 ms","Process holding the lock: 6045. Wait queue: 6966.",,,,,"INSERT INTO myschema.mytable (...) VALUES (...) RETURNING process.mytable.mytable_id",13,,""

2019-12-01 13:05:39.458 UTC,,,6045,,5de3b800.179d,1,,2019-12-01 12:54:24 UTC,10/417900,0,ERROR,57014,"canceling autovacuum task",,,,,"automatic vacuum of table ""postgres.myschema.mytable""",,,,""

 

My understanding from reading the documentation was that a vacuum can run concurrently with table inserts/updates, but from reading the logs it appears they are conflicting over a row lock.  This particular table gets very frequent inserts/updates (10-100 inserts / sec) so I am concerned that if the autovacuum is constantly canceled, then the table never gets cleaned and its performance will continue to degrade over time.  Is it expected for the vacuum to be canceled by an insert in this way?

 

We are using postgres 9.6.10.


If the vacuum finds a lot of empty pages at the end of the table, it will try to truncate them and takes a strong lock to do so.  It is supposed to check every 20ms to see if anyone else is blocked on that lock, at which point it stops doing the truncation and releases the lock.  So it should never get "caught" holding the lock in order to be cancelled.  Is your setting for deadlock_timeout much lower than usual?  Also, if the truncation is bogged down in very slow IO, perhaps it doesn't actually get around to checking ever 20ms despite its intentionsl

How often have you seen it in the logs?

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: autovacuum locking question
Next
From: Lars Aksel Opsahl
Date:
Subject: Re: How to run in parallel in Postgres