Re: [ADMIN] Missing Chunk Error when doing a VACUUM FULL operation -DB Corruption? - Mailing list pgsql-admin

From Arjun Ranade
Subject Re: [ADMIN] Missing Chunk Error when doing a VACUUM FULL operation -DB Corruption?
Date
Msg-id CANrrCRxk=26f+UaaQBv8BKFk6skH4FCEA+Q_EV-oE2tYaW4s6Q@mail.gmail.com
In response to Re: [ADMIN] Missing Chunk Error when doing a VACUUM FULL operation -DB Corruption?  (Stephen Frost <sfrost@snowman.net>)
List pgsql-admin
Yes, we are now in the process of adding custom metrics/alerts around the xmin horizon across all of our Postgres databases.
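
For anyone setting up something similar, here is a minimal sketch of a stuck-slot check. The query reads only the standard pg_replication_slots view and the built-in age() function; the function name stuck_slots and the 50 million threshold are hypothetical choices for illustration, not recommended values. Running the query and feeding its rows to the function is left to whatever driver you already use.

```python
# Sketch only: stuck_slots() and max_age are illustrative assumptions,
# not part of PostgreSQL or any library.

# Standard catalog query: how many transactions old is each slot's xmin?
SLOT_XMIN_AGE_SQL = """
SELECT slot_name,
       active,
       age(xmin)         AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
"""


def stuck_slots(rows, max_age=50_000_000):
    """Return slots whose oldest xmin age exceeds max_age.

    rows: iterable of (slot_name, active, xmin_age, catalog_xmin_age),
    i.e. the result of SLOT_XMIN_AGE_SQL. Either age may be None.
    """
    flagged = []
    for slot_name, active, xmin_age, catalog_xmin_age in rows:
        ages = [a for a in (xmin_age, catalog_xmin_age) if a is not None]
        if not ages:
            continue  # slot holds back no xmin at all
        oldest = max(ages)
        if oldest > max_age:
            flagged.append((slot_name, active, oldest))
    # Worst offenders first
    return sorted(flagged, key=lambda r: r[2], reverse=True)
```

Note that replication slots are only one thing that can hold back the xmin horizon; long-running and prepared transactions can do the same, so a check like this covers just the case seen in this thread.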

We will do a DB-wide VACUUM FULL as well (ironically, this incident started because VACUUM FULL failed last weekend).

Appreciate all the input on this.

Arjun


On Thu, Nov 2, 2017 at 11:06 AM, Stephen Frost <sfrost@snowman.net> wrote:
Tom, Arjun,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Arjun Ranade <ranade@nodalexchange.com> writes:
> > After dropping the replication slot, VACUUM FULL runs fine now and no
> > longer reports the "oldest xmin is far in the past"
>
> Excellent.  Maybe we should think about providing better tools to notice
> "stuck" replication slots.

+1

> In the meantime, you probably realize this already, but if global xmin
> has been stuck for months then you're going to have terrible bloat
> everywhere.  Database-wide VACUUM FULL seems called for.

This, really, is also a lesson in "monitor your distance to transaction
wrap-around".  You really should know something is up a lot sooner than
the warnings from PG showing up in the logs.
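
As a rough sketch of that kind of monitoring: the query below uses the standard pg_database catalog and age(datfrozenxid), and the helper expresses each database's XID age as a fraction of the roughly 2 billion (2^31) transaction wraparound distance. The name wraparound_report and the 0.5 alert fraction are hypothetical, picked only for illustration.

```python
# Sketch only: wraparound_report() and warn_fraction are illustrative
# assumptions, not part of PostgreSQL or any library.

# Standard catalog query: XID age of each database's frozen horizon.
DB_XID_AGE_SQL = """
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
"""

# Transaction IDs are 32-bit; wraparound occurs about 2^31 XIDs out.
XID_WRAP_LIMIT = 2 ** 31


def wraparound_report(rows, warn_fraction=0.5):
    """Return (datname, xid_age, fraction_used, alert) per database.

    rows: iterable of (datname, xid_age), i.e. the result of
    DB_XID_AGE_SQL. alert is True once fraction_used reaches
    warn_fraction.
    """
    report = []
    for datname, xid_age in rows:
        fraction = xid_age / XID_WRAP_LIMIT
        report.append(
            (datname, xid_age, round(fraction, 3), fraction >= warn_fraction)
        )
    return report
```

Graphing fraction_used over time, rather than alerting on a single threshold, would have shown the stuck horizon as a line that never drops after vacuum runs.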

Thanks!

Stephen
