On Sun, Feb 7, 2016 at 10:54 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Feb 2, 2016 at 5:28 AM, Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > currently if, when not in standby mode, we can't read a checkpoint
> > record, we automatically fall back to the previous checkpoint, and start
> > replay from there.
> >
> > Doing so without user intervention doesn't actually seem like a good
> > idea. While not super likely, it's entirely possible that doing so can
> > wreck a cluster that'd otherwise be easily recoverable. Imagine e.g. a
> > tablespace being dropped - going back to the previous checkpoint very
> > well could lead to replay not finishing, as the directory to create
> > files in doesn't even exist.
> >
>
> I think there are similar hazards for deletion of a relation when
> its relfilenode gets reused. Basically, it can delete the data
> for one of the newer relations which was created after the
> last checkpoint.
>
> > As there's, afaics, really no "legitimate" reason for needing to go
> > back to the previous checkpoint I don't think we should do so in an
> > automated fashion.
> >
>
> I have tried to find out why such a mechanism was introduced in the
> first place, and it seems to me that commit
> 4d14fe0048cf80052a3ba2053560f8aab1bb1b22 introduced it, but
> the reason is not apparent. Then I dug through the archives
> and found a mail chain which I think led to this commit.
> Refer [1][2].
>

Oops, I forgot to provide the links; providing them now.