Hi,

On 2017-06-20 16:11:32 +0300, Heikki Linnakangas wrote:
> On 06/19/2017 10:30 AM, Andres Freund wrote:
> > Greg Burek from Heroku (CCed) reported a weird issue on IM, that was
> > weird enough to be interesting. What he'd observed was that he promoted
> > some PITR standby, and early clones of that node work, but later clones
> > did not, failing to read some segment.
> >
> > The problem turns out to be the following: [explanation]
>
> Good detective work!

Thanks.

> > The minimal fix here is presumably not to use XLByteToPrevSeg() in
> > RemoveXlogFile(), but XLByteToSeg(). I don't quite see what purpose it
> > serves here - I don't think it's ever needed.
>
> Agreed, I don't see a reason for it either.

Pushed. And found like three other things while investigating :/

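
For anyone following along, here is a minimal sketch of how the two
macros differ. It assumes the default 16MB segment size and uses
hypothetical stand-in functions (wal_byte_to_seg, wal_byte_to_prev_seg)
rather than the real macros, which assign to an output argument. The
results only differ when the pointer sits exactly on a segment boundary:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 16MB WAL segments. */
#define WAL_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Stand-in for XLByteToSeg(): the segment containing byte position ptr. */
static uint64_t
wal_byte_to_seg(uint64_t ptr)
{
    return ptr / WAL_SEG_SIZE;
}

/*
 * Stand-in for XLByteToPrevSeg(): the segment containing the byte just
 * before ptr. On an exact segment boundary this is the previous segment.
 */
static uint64_t
wal_byte_to_prev_seg(uint64_t ptr)
{
    assert(ptr > 0);            /* ptr - 1 would wrap around at 0 */
    return (ptr - 1) / WAL_SEG_SIZE;
}
```

With ptr = 3 * WAL_SEG_SIZE the first function yields segment 3 and the
second yields segment 2; for any pointer strictly inside a segment both
agree.
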
> > There seems to be a larger question here though: Why does
> > XLogFileReadAnyTLI() probe all timelines even if they weren't a parent
> > at that period? That seems like a bad idea, especially in more
> > complicated scenarios where some precursor timeline might live for
> > longer than it was a parent? ISTM XLogFileReadAnyTLI() should check
> > which timeline a segment ought to come from, based on the history?
>
> Yeah. I've had that thought for years as well, but there has never been any
> pressing reason to bite the bullet and rewrite it, so I haven't gotten
> around to it.

Heh. Still seems like something we should tackle - but it'd not be
urgent enough to backpatch, so it doesn't quite seem like something to
tackle *just now* :/

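
To make the idea concrete, a rough sketch of a history-based lookup
might look like the following. This is not the existing code: the
HistoryEntry struct and tli_for_ptr() are hypothetical and simplified,
and a real version would also have to deal with the segment that
contains the switch point, which can legitimately exist on both
timelines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified history entry: timeline tli was the current
 * one for WAL positions in [begin, end).
 */
typedef struct
{
    uint32_t    tli;
    uint64_t    begin;
    uint64_t    end;
} HistoryEntry;

/*
 * Return the timeline that owned WAL position ptr according to the
 * history, or 0 if no entry covers it. Only that timeline's segment
 * would then be opened, instead of probing every known timeline.
 */
static uint32_t
tli_for_ptr(const HistoryEntry *history, size_t n, uint64_t ptr)
{
    for (size_t i = 0; i < n; i++)
    {
        if (ptr >= history[i].begin && ptr < history[i].end)
            return history[i].tli;
    }
    return 0;
}
```
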
- Andres