Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date
Msg-id 20150615134718.GN133018@postgresql.org
In response to Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-general
Andres Freund wrote:

> A first version to address this problem can be found appended to this
> email.
>
> Basically it does:
> * Whenever more than MULTIXACT_MEMBER_SAFE_THRESHOLD are used, signal
>   autovacuum once per members segment
> * For both members and offsets, once hitting the hard limits, signal
>   autovacuum every time. Otherwise we lose the information when
>   restarting the database, or when autovac is killed. I ran into this a
>   bunch of times while testing.

Sounds reasonable.

I see another hole in this area.  See do_start_worker() -- there we only
consider the offsets limit when deciding whether a database is in
almost-wrapped-around state (which triggers emergency attention).  If the
database in members trouble has no pgstat entry, it might get completely
ignored.  I think the way to close this hole is to call
find_multixact_start() in the autovac launcher for the database with the
oldest datminmxid, to determine whether we need to activate emergency
mode for it.  (Maybe instead of having this logic in autovacuum, it
should be a new function that receives the database's datminmxid and
returns a boolean indicating whether the database is in trouble or not.)
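
To make the shape of that helper concrete, here is a rough standalone
sketch.  It is not PostgreSQL code: the type, the threshold constant and
both function names are made up for illustration, and it assumes the
caller has already obtained the member offset at which the oldest
multixact starts (the value find_multixact_start() would be asked for)
along with the next member offset to be assigned.  The "half of the
member space" threshold is likewise only an assumption modeled on the
idea behind MULTIXACT_MEMBER_SAFE_THRESHOLD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real PostgreSQL definitions. */
typedef uint32_t MemberOffset;

/* Assumed threshold: half of a 32-bit member offset space. */
#define SKETCH_MEMBER_SAFE_THRESHOLD (UINT32_MAX / 2)

/*
 * Number of member slots in use between the start of the oldest
 * multixact and the next offset to be assigned.  Unsigned subtraction
 * wraps modulo 2^32, so this stays correct after the offset counter
 * itself has wrapped around.
 */
static uint32_t
sketch_members_in_use(MemberOffset oldest_start, MemberOffset next_offset)
{
    return next_offset - oldest_start;
}

/*
 * Returns true if the database owning the oldest multixact should get
 * emergency treatment because of members space usage.
 */
bool
sketch_db_in_members_trouble(MemberOffset oldest_start,
                             MemberOffset next_offset)
{
    return sketch_members_in_use(oldest_start, next_offset)
        > SKETCH_MEMBER_SAFE_THRESHOLD;
}
```

In the launcher, something like this would presumably be consulted
alongside the existing offsets check, so that a database without a
pgstat entry still gets picked when members space is the problem.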

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

