Thread:

From: Yuri Niyazov
We are running PostgreSQL 9.3.8. Something fairly surprising happened here the other day: we started approaching the transaction ID wraparound limit, with our logs filling up with “WARNING:  database "mydb" must be vacuumed within 177009986 transactions” messages, hundreds of them per second, and we came dangerously close to a DB shutdown. *However*, only three things were writing to the database at the time: a database-wide VACUUM that we started manually, a per-table autovacuum that Postgres itself started, and a cron job that writes one row per minute into a separate table that we use to track replication delay.
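
For readers who want to quantify how fast the remaining count in such warnings is shrinking, here is a minimal Python sketch. It only matches the message body quoted above; the helper names (`parse_wraparound_warning`, `xids_per_second`) are hypothetical, and the surrounding log prefix is assumed to vary with `log_line_prefix`.

```python
import re

# Matches only the message body quoted in the report; whatever precedes it
# in a real log line depends on log_line_prefix and is ignored here.
WARNING_RE = re.compile(
    r'database "(?P<db>[^"]+)" must be vacuumed within (?P<left>\d+) transactions'
)


def parse_wraparound_warning(line):
    """Return (database, transactions_left), or None if the line does not match."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    return m.group("db"), int(m.group("left"))


def xids_per_second(earlier_left, later_left, seconds_between):
    """Estimate transaction ID consumption from two readings taken some seconds apart."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    return (earlier_left - later_left) / seconds_between


line = 'WARNING:  database "mydb" must be vacuumed within 177009986 transactions'
print(parse_wraparound_warning(line))
```

Feeding two readings taken a known interval apart into `xids_per_second` gives a rough burn rate, which is one way to confirm that stopping a process really did halt the consumption.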

Whenever we killed the autovacuum process, the rapid consumption of transaction IDs stopped. A few minutes later, PG would restart the autovacuum process, and very soon after that the transaction IDs would start advancing very quickly again.

Modifying postgresql.conf to turn off autovacuum didn't prevent autovacuum from restarting, in line with the documentation, which says that autovacuum will start whenever Postgres is in danger of running out of transaction IDs.

We ended up writing a script that checked the running processes and continuously killed autovacuum workers as they started. We were able to get the manual vacuum to finish without running out of transaction IDs and without experiencing a shutdown, but the entire experience was slightly concerning.
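
The script itself is not shown in this thread. As a rough sketch of the selection step only, the Python below assumes the watchdog reads `pid` and `query` from `pg_stat_activity` (column names as of 9.2 and later) and relies on autovacuum workers reporting a query text that starts with `autovacuum:`. The function name and the sample rows are hypothetical.

```python
# Hypothetical query such a watchdog might run; not taken from the original script.
ACTIVITY_SQL = "SELECT pid, query FROM pg_stat_activity"


def autovacuum_pids(rows):
    """Pick the PIDs whose query text marks them as autovacuum workers.

    rows is an iterable of (pid, query) tuples as returned by ACTIVITY_SQL.
    """
    return [pid for pid, query in rows if query and query.startswith("autovacuum:")]


# Illustrative rows only; the PIDs and table names are made up.
sample_rows = [
    (4101, "VACUUM;"),
    (4177, "autovacuum: VACUUM public.big_table (to prevent wraparound)"),
    (4203, "INSERT INTO replication_heartbeat VALUES (now())"),
    (4250, None),
]
print(autovacuum_pids(sample_rows))
```

The resulting PIDs could then be passed to `pg_cancel_backend(pid)`. That only pauses the problem, since Postgres relaunches the anti-wraparound worker, as described above.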

Did we hit a known edge case in which two vacuum processes running in parallel consume transaction IDs very quickly?


Re: your mail

From: Alvaro Herrera
On 2019-Jan-03, Yuri Niyazov wrote:

> We are running Postgresql 9.3.8. We had something fairly surprising happen
> here the other day: we started reaching the vacuum limit, with our logs
> filling up with “WARNING:  database "mydb" must be vacuumed within
> 177009986 transactions” messages, hundreds of them per second, and we came
> dangerously close to experiencing a DB shutdown. *However*, there were only
> three things running that were writing to the database - a db-wide VACUUM
> that we manually started, a per-table autovacuum that postgres itself
> started, and a "write 1 row every 1-minute" cron job into a separate table
> that we use to track replication delay.

I suggest you consider an urgent upgrade to at least 9.3.11, but really
all the way to the end of 9.3, and soon afterwards plan an upgrade to
the latest 9.4.  9.3.8 contains known bugs, which the first upgrade gets
you out of; but 9.3 as a whole has been out of support for a couple of
months, so you'll be playing with fire until you're on 9.4.  Do not
postpone the upgrade to 9.3.latest while planning the upgrade to 9.4,
though.

I wonder if you fell prey to this weird behavior:
https://postgr.es/m/CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=YPyFPQ@mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services