Re: Review: Patch to compute Max LSN of Data Pages - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Review: Patch to compute Max LSN of Data Pages
Date
Msg-id 003101ce77eb$0507cbe0$0f1763a0$@kapila@huawei.com
Whole thread Raw
In response to Re: Review: Patch to compute Max LSN of Data Pages  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wednesday, July 03, 2013 1:26 AM Robert Haas wrote:
> On Tue, Jun 25, 2013 at 1:42 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
> > I think the usecase for this utility isn't big enough to be included
> in
> > postgres since it really can only help in a very limited
> > circumstances. And I think it's too likely to be misused for stuff
> it's
> > not useable for (e.g. remastering).
> >
> > The only scenario I see is that somebody deleted/corrupted
> > pg_controldata. In that scenario the tool is supposed to be used to
> find
> > the biggest lsn used so far so the user then can use pg_resetxlog to
> set
> > that as the wal starting point.
> > But that can be way much easier solved by just setting the LSN to
> > something very, very high. The database cannot be used for anything
> > reliable afterwards anyway.
> 
> I guess this is true, but I think I'm mildly in favor of including
> this anyway.  I think I would have used it once or twice, if it had
> been there - maybe not even to feed into pg_resetxlog, but just to
> check for future LSNs.  We don't have anything like a suite of
> diagnostic tools in bin or contrib today, for use by database
> professionals in fixing things that fall strictly in the realm of
> "don't try this at home", and I think we should have such a thing.
> Unfortunately this covers about 0.1% of the space I'd like to see
> covered, which might be a reason to reject it or at least insist that
> it be enhanced first.
> 
> At any rate, I don't think this is anywhere near committable as it
> stands today.  Some random review comments:
> 
> remove_parent_refernces() is spelled wrong.
It was corrected in last posted version. 
http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C383BEB928
8@szxeml509-mbx

> Why does this patch need all of this fancy path-handling logic -
> specifically, remove_parent_refernces() and make_absolute_path()?
> Surely its needs are not that different from pg_ctl or pg_resetxlog or
> pg_controldata.  If they all need it and it's duplicated, we should
> fix that.  Otherwise, why the special logic here?
> 
> I don't think we need getLinkPath() either.  The server has no trouble
> finding its files by just using a pathname that follows the symlink.
> Why do we need anything different here?  The whole point of symlinks
> is that you can traverse them transparently, without knowing where
> they point.

It is to handle negative scenario's like if there is any recursion in path.
However if you feel this is not important, it can be removed.
> Duplicating PageHeaderIsValid doesn't seem acceptable.  Moreover,
> since the point of this is to be able to use it on a damaged cluster,
> why is that logic even desirable?  I think we'd be better off assuming
> the pages to be valid.
> 
> The calling convention for this utility seems quite baroque.  There's
> no real need for all of this -p/-P stuff.  I think the syntax should
> just be:
> 
> pg_computemaxlsn file_or_directory...
> 
> For each argument, we determine whether it's a file or a directory.
> If it's a file, we assume it's a PostgreSQL data file and scan it.  If
> it's a directory, we check whether it looks like a data directory.  If
> it does, we recurse through the whole tree structure and find the data
> files, and process them.  If it doesn't look like a data directory, we
> scan each plain file in that directory whose name looks like a
> PostgreSQL data file name.  With this approach, there's no need to
> limit ourselves to a single input argument and no need to specify what
> kind of argument it is; the tool just figures it out.
> 
> I think it would be a good idea to have a mode that prints out the max
> LSN found in *each data file* scanned, and then prints out the overall
> max found at the end.  In fact, I think that should perhaps be the
> default mode, with -q, --quiet to disable it.

Printing too many LSN for each file might fill user's screen and he might be
needing only overall LSN.
Should we keep -p --printall as option to print all LSN's and keep default
as overall max LSN?

With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Move unused buffers to freelist
Next
From: Simon Riggs
Date:
Subject: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])