Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1
Date
Msg-id 20131119184004.GD7240@alap2.anarazel.de
In response to Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1  (Christophe Pettus <xof@thebuild.com>)
List pgsql-hackers
On 2013-11-19 10:32:10 -0800, Christophe Pettus wrote:
> 
> On Nov 19, 2013, at 10:29 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> 
> > It's pretty unlikely that any automated testing would have caught this,
> > the required conditions are too unlikely for that.
> 
> I would expect that "promote secondary while primary is under heavy
> load" is a clear-cut test case.

That's not sufficient though. It's e.g. very hard to reproduce the issue
using the standard pgbench workload (not enough xids generated, too many
hint bits).

Note that the bug isn't caused by promotion; the problem occurs during
the initial startup of a Hot-Standby standby. If the bug wasn't hit
there, it won't be a problem at promotion.

> What concerns me more is that we don't seem to have a framework to put
> in a regression test on the bug you just found (and thank you for
> finding it so quickly!).

Agreed. But regarding it as a bad situation isn't fixing it,
unfortunately.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


