Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss" - Mailing list pgsql-hackers

From	Jeff Janes
Subject	Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss"
Date	December 17, 2015 17:04:35
Msg-id	CAMkU=1yWky3fFnJ8AYAdOCctQWrEF0RWhU8v9GOtFFpxkF3Myw@mail.gmail.com
In response to	Cluster "stuck" in "not accepting commands to avoid wraparound data loss" (Andres Freund <andres@anarazel.de>)
Responses	Re: Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss"
List	pgsql-hackers

Sorry, I accidentally failed to include the list originally; here it is
for the list:

On Dec 16, 2015 9:52 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>
> On Fri, Dec 11, 2015 at 1:08 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> > Since changes to datfrozenxid are WAL logged at the time they occur,
> > but the supposedly-synchronous change to ShmemVariableCache is not WAL
> > logged until the next checkpoint, a well timed crash can leave you in
> > the state where the system is in a tizzy about wraparound but each
> > database says "Nope, not me".
>
> ShmemVariableCache is an in-memory data structure, so it's going to
> get blown away and rebuilt on a crash.  But I guess it gets rebuilt
> from the contents of the most recent checkpoint record, so that
> doesn't actually help.  However, I wonder if it would be safe for
> the autovacuum launcher to calculate an updated value and call
> SetTransactionIdLimit() to update ShmemVariableCache.

I was wondering if that should happen either at the end of crash
recovery (but I suppose you can't poll pg_database yet at that
point?), or immediately before throwing the "database is not accepting
commands to avoid wraparound data loss" error.

At which point would it make sense for the launcher to do it?  I guess
just after it was started up under PMSIGNAL_START_AUTOVAC_LAUNCHER
conditions?
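
To make the "calculate an updated value" step concrete, here is a rough
model of what the launcher would have to compute before calling
SetTransactionIdLimit(): the logically oldest datfrozenxid across all
databases, compared modulo 2^32 rather than numerically. This is only a
sketch in Python, not server code; xid_precedes() and
oldest_datfrozenxid() are hypothetical names, the dict stands in for a
scan of pg_database, and the sample XIDs are made up.

```python
def xid_precedes(a, b):
    """True if normal XID a is logically older than b (modulo-2^32 compare)."""
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def oldest_datfrozenxid(datfrozenxids):
    """Return (datname, xid) for the logically oldest datfrozenxid.

    datfrozenxids is a hypothetical stand-in for the rows of pg_database:
    a mapping of database name to datfrozenxid.
    """
    oldest = None
    for name, xid in datfrozenxids.items():
        if oldest is None or xid_precedes(xid, oldest[1]):
            oldest = (name, xid)
    return oldest


# Made-up values where the XID counter has already wrapped past 2^32.
databases = {
    "postgres": 4_294_000_000,
    "app": 100_000_000,
    "template1": 4_200_000_000,
}
result = oldest_datfrozenxid(databases)
```

A plain numeric min() would pick "app" here, which is actually the most
recently frozen database, so the modular comparison matters.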

> But I'm somewhat confused what this has to do with Andres's report.

Doesn't it explain the exact situation he is in, where the oldest
database is 200 million transactions old, but the cluster as a whole
thinks it is 2 billion?
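
A toy model of that mismatch, in case it helps. It assumes the stop
limit is derived the way I understand SetTransactionIdLimit() to do it
(wrap point half the XID space ahead of the oldest datfrozenxid, minus a
safety margin I am taking to be 1,000,000 XIDs); the function names and
all the XID values are invented for illustration.

```python
MAX_XID = 0xFFFFFFFF        # MaxTransactionId
FIRST_NORMAL_XID = 3        # FirstNormalTransactionId
STOP_MARGIN = 1_000_000     # assumed safety margin before the wrap point


def xid_stop_limit(oldest_datfrozenxid):
    """Model of the stop limit derived from the oldest datfrozenxid."""
    wrap = (oldest_datfrozenxid + (MAX_XID >> 1)) & MAX_XID
    if wrap < FIRST_NORMAL_XID:
        wrap += FIRST_NORMAL_XID
    stop = (wrap - STOP_MARGIN) & MAX_XID
    if stop < FIRST_NORMAL_XID:
        stop = (stop - FIRST_NORMAL_XID) & MAX_XID
    return stop


def refuses_commands(next_xid, oldest_datfrozenxid):
    """True if next_xid has reached the stop limit (modulo-2^32 compare)."""
    stop = xid_stop_limit(oldest_datfrozenxid)
    return ((next_xid - stop) & MAX_XID) < 0x80000000


next_xid = 2_146_500_000
stale_oldest = 1000                       # what the last checkpoint recorded
fresh_oldest = next_xid - 200_000_000     # what pg_database says after vacuum

stuck_with_stale_value = refuses_commands(next_xid, stale_oldest)
stuck_with_fresh_value = refuses_commands(next_xid, fresh_oldest)
```

With the stale value the model refuses commands; with the value that
pg_database actually holds it does not, which is the "Nope, not me"
state.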

Cheers,

Jeff
