Re: Review: Patch to compute Max LSN of Data Pages - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Review: Patch to compute Max LSN of Data Pages |
Date | |
Msg-id | 003101ce77eb$0507cbe0$0f1763a0$@kapila@huawei.com Whole thread Raw |
In response to | Re: Review: Patch to compute Max LSN of Data Pages (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Wednesday, July 03, 2013 1:26 AM Robert Haas wrote: > On Tue, Jun 25, 2013 at 1:42 PM, Andres Freund <andres@2ndquadrant.com> > wrote: > > I think the usecase for this utility isn't big enough to be included > in > > postgres since it really can only help in a very limited > > circumstances. And I think it's too likely to be misused for stuff > it's > > not useable for (e.g. remastering). > > > > The only scenario I see is that somebody deleted/corrupted > > pg_controldata. In that scenario the tool is supposed to be used to > find > > the biggest lsn used so far so the user then can use pg_resetxlog to > set > > that as the wal starting point. > > But that can be way much easier solved by just setting the LSN to > > something very, very high. The database cannot be used for anything > > reliable afterwards anyway. > > I guess this is true, but I think I'm mildly in favor of including > this anyway. I think I would have used it once or twice, if it had > been there - maybe not even to feed into pg_resetxlog, but just to > check for future LSNs. We don't have anything like a suite of > diagnostic tools in bin or contrib today, for use by database > professionals in fixing things that fall strictly in the realm of > "don't try this at home", and I think we should have such a thing. > Unfortunately this covers about 0.1% of the space I'd like to see > covered, which might be a reason to reject it or at least insist that > it be enhanced first. > > At any rate, I don't think this is anywhere near committable as it > stands today. Some random review comments: > > remove_parent_refernces() is spelled wrong. It was corrected in last posted version. http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C383BEB928 8@szxeml509-mbx > Why does this patch need all of this fancy path-handling logic - > specifically, remove_parent_refernces() and make_absolute_path()? > Surely its needs are not that different from pg_ctl or pg_resetxlog or > pg_controldata. If they all need it and it's duplicated, we should > fix that. Otherwise, why the special logic here? > > I don't think we need getLinkPath() either. The server has no trouble > finding its files by just using a pathname that follows the symlink. > Why do we need anything different here? The whole point of symlinks > is that you can traverse them transparently, without knowing where > they point. It is to handle negative scenario's like if there is any recursion in path. However if you feel this is not important, it can be removed. > Duplicating PageHeaderIsValid doesn't seem acceptable. Moreover, > since the point of this is to be able to use it on a damaged cluster, > why is that logic even desirable? I think we'd be better off assuming > the pages to be valid. > > The calling convention for this utility seems quite baroque. There's > no real need for all of this -p/-P stuff. I think the syntax should > just be: > > pg_computemaxlsn file_or_directory... > > For each argument, we determine whether it's a file or a directory. > If it's a file, we assume it's a PostgreSQL data file and scan it. If > it's a directory, we check whether it looks like a data directory. If > it does, we recurse through the whole tree structure and find the data > files, and process them. If it doesn't look like a data directory, we > scan each plain file in that directory whose name looks like a > PostgreSQL data file name. With this approach, there's no need to > limit ourselves to a single input argument and no need to specify what > kind of argument it is; the tool just figures it out. > > I think it would be a good idea to have a mode that prints out the max > LSN found in *each data file* scanned, and then prints out the overall > max found at the end. In fact, I think that should perhaps be the > default mode, with -q, --quiet to disable it. Printing too many LSN for each file might fill user's screen and he might be needing only overall LSN. Should we keep -p --printall as option to print all LSN's and keep default as overall max LSN? With Regards, Amit Kapila.
pgsql-hackers by date: