Re: Fast promotion failure - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Fast promotion failure
Date
Msg-id 20130513.092352.30755878.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Fast promotion failure  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: Fast promotion failure  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
2013/05/10 20:01 "Amit Kapila" <amit.kapila@huawei.com>:
> > > C 2013-05-10 15:32:32.170 JST 9242 FATAL:  could not receive data
> > from WAL stream:
>
> Is there any chance, that there is any network glitch caused this one time
> error.

Unix domain sockets are hardly likely to have such troubles. This
test ran within a single host.

> > I'm get confused, the patch seems to me ensureing the "first
> > checkpoint after fast promotion is performed" to use the
> > "correct, new, ThisTimeLineID".
>
> What is your confusion?

Heikki said in the first message in this thread that he suspected
the cause of the failure he had seen to be the wrong TLI on which
the checkpointer runs. Nevertheless, the patch you suggested to me
looks like it fixes that. Moreover, (one of?) the failures from the
same cause looks fixed with the patch.

Is the point of this discussion that the patch may leave out some
glitch in the timing of timeline-related changes, and that Heikki
saw a manifestation of that?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


