Re: pgsql: First-draft release notes for 9.3.3. - Mailing list pgsql-committers

From Andres Freund
Subject Re: pgsql: First-draft release notes for 9.3.3.
Msg-id 20140216092310.GL19470@alap3.anarazel.de
In response to pgsql: First-draft release notes for 9.3.3.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: First-draft release notes for 9.3.3.  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-committers
Hi Tom,

Some comments on the release notes:

<!--
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [423e1211a] 2014-01-10 18:03:18 -0300
Branch: REL9_3_STABLE [a25c2b7c4] 2014-01-10 18:03:18 -0300
-->

    <listitem>
     <para>
      Fix multixact freezing of tuples that predate
      a <literal>pg_upgrade</> to 9.3
      (Álvaro Herrera)
     </para>

     <para>
      This oversight would result in complaints such as <quote>ERROR:
      MultiXactId 11415437 does no longer exist -- apparent wraparound</>.
     </para>
    </listitem>

I *think* this could only happen with changes that were committed
*after* 9.3.2 was released. Alvaro, that's right, isn't it?
Do the release notes list issues like that, which never shipped in a
release?

<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master [e8312b4f0] 2013-12-13 11:50:15 -0500
Branch: REL9_3_STABLE [478af9b79] 2013-12-13 11:50:25 -0500
-->

    <listitem>
     <para>
      Prevent timeout interrupts from taking control away from mainline
      code unless <varname>ImmediateInterruptOK</> is set
      (Andres Freund, Tom Lane)
     </para>

     <para>
      This was initially reported as a <quote>stuck spinlock</> failure,
      but many other misbehaviors are possible after a statement timeout.
     </para>
    </listitem>

I think this needs to be featured more prominently. It's the reason I
have been asking for a new point release... Issues I've seen this cause
*in production* on several sites since the fix was committed
include:
* ERRORs regularly being promoted to PANICs on seemingly innocuous errors
  because CriticalSectionCount is out of whack.
* backends spuriously holding lwlocks which do not get cleaned up
  because there's no LWLockReleaseAll() call unless
  TRANS_INPROGRESS/START has been reached. That sometimes got "fixed" when
  backends exited because there happens to be a LWLockReleaseAll() call
  there. Symptoms included backends waiting on themselves and clusters
  getting stuck until certain backends exited.
* backends that were unkillable because InterruptHoldoffCount was out
  of whack.
* corrupted heap pages

All but the last one were occurring repeatedly, and have completely
vanished since applying the fix. The last one I have seen only once, but
this bug seems like quite a plausible explanation for it.

With the current explanation it's going to be hard to convince people
how important it is to upgrade from 9.3.2 to 9.3.3.
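
To illustrate why the counters above end up out of whack, here is a
minimal toy model in Python. It is only a sketch, not PostgreSQL code:
ToyBackend, timeout_handler, check_for_interrupts and the
handler_respects_flag switch are hypothetical names invented for this
example, and the real mechanism involves signal handlers and longjmp
rather than exceptions. It assumes the timeout arrives while mainline
code has incremented a counter and not yet decremented it.

```python
class StatementTimeout(Exception):
    """Stands in for the error raised when a statement timeout fires."""


class ToyBackend:
    """Toy model of one backend's interrupt bookkeeping (hypothetical)."""

    def __init__(self, handler_respects_flag):
        self.handler_respects_flag = handler_respects_flag
        # Mainline code is running, so it is not safe to interrupt it.
        self.immediate_interrupt_ok = False
        self.critical_section_count = 0
        self.timeout_pending = False

    def timeout_handler(self):
        # Stand-in for the timeout signal handler.
        if self.immediate_interrupt_ok or not self.handler_respects_flag:
            # Control is taken away from whatever code was running.
            raise StatementTimeout()
        # Otherwise only remember the timeout for the next safe point.
        self.timeout_pending = True

    def check_for_interrupts(self):
        # Safe point: act on a timeout that was deferred earlier.
        if self.timeout_pending:
            self.timeout_pending = False
            raise StatementTimeout()

    def run_statement(self):
        try:
            self.critical_section_count += 1  # enter protected code
            self.timeout_handler()            # timeout arrives right here
            self.critical_section_count -= 1  # leave protected code
            self.check_for_interrupts()
        except StatementTimeout:
            pass  # error recovery; the toy does not reset the counter
        return self.critical_section_count


print(ToyBackend(handler_respects_flag=False).run_statement())  # prints 1
print(ToyBackend(handler_respects_flag=True).run_statement())   # prints 0
```

In the first call the handler ignores the flag, the decrement is
skipped, and the counter stays at 1 after the statement has been
aborted. In the second call the timeout is only acted on at the safe
point and the counter is back at 0.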

Thanks for assembling the notes,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

