Re: [GENERAL] autovacuum holds exclusive lock on table preventing itfrom to be updated - Mailing list pgsql-general

From Dmitry O Litvintsev
Subject Re: [GENERAL] autovacuum holds exclusive lock on table preventing itfrom to be updated
Date
Msg-id BL2PR09MB100982187092C1BCB00434EBB9C40@BL2PR09MB1009.namprd09.prod.outlook.com
Whole thread Raw
In response to Re: [GENERAL] autovacuum holds exclusive lock on table preventing itfrom to be updated  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: [GENERAL] autovacuum holds exclusive lock on table preventing itfrom to be updated  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-general
yes, we had to restart database 4 days ago (and vacuum has resumed on start).
I checked the log files and discovered that autovacuum on this table takes

        pages: 0 removed, 14072307 remain
        tuples: 43524292 removed, 395006545 remain
        buffer usage: -1493114028 hits, 107664973 misses, 30263658 dirtied
        avg read rate: 1.604 MB/s, avg write rate: 0.451 MB/s
        system usage: CPU 2055.81s/17710.94u sec elapsed 524356.57 sec

6 days. So it is perpetually being autovacuumed (which I assumed to be a good thing)

Table has 400M entries, 115 GB.

I will try your suggestions in the test environment.

Thank you,
Dmitry
________________________________________
From: Jeff Janes <jeff.janes@gmail.com>
Sent: Monday, June 19, 2017 1:16 PM
To: Dmitry O Litvintsev
Cc: Andreas Kretschmer; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from to be updated

On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev <litvinse@fnal.gov<mailto:litvinse@fnal.gov>> wrote:
Hi

Since I have posted this nothing really changed. I am starting to panic (mildly).

The source (production) runs :

          relname           |           mode           | granted |                                substr
               |          query_start          |          age 

----------------------------+--------------------------+---------+----------------------------------------------------------------------+-------------------------------+------------------------
 t_inodes_iio_idx           | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to
preventwraparound)   | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559 


This is close to unreadable.  You can use use \x to get output from psql which survives email more readably.

Your first report was 6 days ago.  Why is the job only 4 days old?  Are you frequently restarting your production
server,so that the vacuum job never gets a chance to finish?  If so, that would explain your predicament. 

And how big is this table, that it takes at least 4 days to VACUUM?

vacuum_cost_delay = 50ms

That is a lot.  The default value for this is 0.  The default value for autovacuum_vacuum_cost_delay is 20, which is
usuallytoo high for giant databases. 

I think you are changing this in the wrong direction.  Rather than increase vacuum_cost_delay, you need to decrease
autovacuum_vacuum_cost_delay,so that you won't keep having problems in the future. 


On your test server, change vacuum_cost_delay to zero and then initiate a manual vacuum of the table.  It will block on
theautovacuum's lock, so then kill the autovacuum (best to have the manual vacuum queued up first, otherwise it will be
racebetween when you start the manual vacuum, and when the autovacuum automatically restarts, to see who gets the
lock).See how long it takes this unthrottled vacuum to run, and how much effect the IO it causes has on the performance
ofother tasks.  If acceptable, repeat this on production (although really, I don't that you have much of a choice on
whetherthe effect it is acceptable or not--it needs to be done.) 

Cheers,

Jeff


pgsql-general by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: [GENERAL] performance considerations of jsonb vs separate rows
Next
From: Israel Brewster
Date:
Subject: [GENERAL] sub-select with multiple records, columns