Re: Review: Patch to compute Max LSN of Data Pages - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Review: Patch to compute Max LSN of Data Pages
Date
Msg-id 008c01ce72f9$aa5da4d0$ff18ee70$@kapila@huawei.com
Whole thread Raw
In response to Re: Review: Patch to compute Max LSN of Data Pages  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Review: Patch to compute Max LSN of Data Pages
List pgsql-hackers
On Wednesday, June 26, 2013 10:19 PM Fujii Masao wrote:
> On Wed, Jun 26, 2013 at 8:57 PM, Amit Kapila <amit.kapila@huawei.com>
> wrote:
> > On Wednesday, June 26, 2013 4:40 PM Andres Freund wrote:
> >> Hi Amit,
> >>
> >> On 2013-06-26 16:22:28 +0530, Amit Kapila wrote:
> >> > On Wednesday, June 26, 2013 1:20 PM Andres Freund wrote:
> >> > > On 2013-06-26 08:50:27 +0530, Amit Kapila wrote:
> >> > > > On Tuesday, June 25, 2013 11:12 PM Andres Freund wrote:
> >> > > > > On 2013-06-16 17:19:49 -0700, Josh Berkus wrote:
> >> > > > > > Amit posted a new version of this patch on January 23rd.
> But
> >> > > last
> >> > > > > > comment on it by Tom is "not sure everyone wants this".
> >> > > > > >
> >> > > > > > https://commitfest.postgresql.org/action/patch_view?id=905
> >> > > > >
> >> > > > > > ... so, what's the status of this patch?
> >> > > > >
> >> > > > > That comment was referencing a mail of mine - so perhaps I
> >> better
> >> > > > > explain:
> >> > > > >
> >> > > > > I think the usecase for this utility isn't big enough to be
> >> > > included in
> >> > > > > postgres since it really can only help in a very limited
> >> > > > > circumstances. And I think it's too likely to be misused for
> >> stuff
> >> > > it's
> >> > > > > not useable for (e.g. remastering).
> >> > > > >
> >> > > > > The only scenario I see is that somebody deleted/corrupted
> >> > > > > pg_controldata. In that scenario the tool is supposed to be
> >> used to
> >> > > > > find
> >> > > > > the biggest lsn used so far so the user then can use
> >> pg_resetxlog
> >> > > to
> >> > > > > set
> >> > > > > that as the wal starting point.
> >> > > > > But that can be way much easier solved by just setting the
> LSN
> >> to
> >> > > > > something very, very high. The database cannot be used for
> >> anything
> >> > > > > reliable afterwards anyway.
> >> > > >
> >> > > > One of the main reason this was written was to make server up
> in
> >> case
> >> > > of
> >> > > > corruption and
> >> > > > user can take dump of some useful information if any.
> >> > > >
> >> > > > By setting LSN very, very high user might loose the
> information
> >> which
> >> > > he
> >> > > > wants to take dump.
> >> > >
> >> > > Which information would that loose?
> >> >   Information from WAL replay which can be more appropriate by
> >> selecting
> >> > LSN.
> >>
> >> Sorry, I can't follow. If wal replay still is an option you can just
> >> look at the WAL and get a sensible value way easier.
> >
> > Originally 2 parts were proposed, one was to get LSN from data pages
> and
> > other from data pages.
> > Original proposal is:
> > http://www.postgresql.org/message-
> id/6C0B27F7206C9E4CA54AE035729E9C382851FFA
> > 1@szxeml509-mbs
> >
> > The second part for looking into WAL was written but due to
> xlogreader
> > patch, it was postponed and I didn't get time after that
> > to pursue it.
> >
> >
> >>The whole tool
> >> seems to only make sense if you've lost pg_xlog.
> >
> > The tool's initial intent was if pg_controldata is lost and this idea
> is
> > originated in below mail chain:
> > http://www.postgresql.org/message-id/4274.1340084598@sss.pgh.pa.us
> >
> >
> >> >   Also for a developer, guessing very high LSN might be easy, but
> for
> >> users
> >> >   it might not be equally easy, and getting such value by utility
> >> would be
> >> > comfortable.
> >>
> >> Well, then we can just document some very high lsn and be done with
> >> it. Like CF000000/00000000.
> >> That would leave enough space for eventual writes caused while
> dumping
> >> the database (say hint bit writes in a checksummed database) and
> cannot
> >> yet be realistically be reached during normal operation.
> >
> > Can we be ultra sure, that this LSN is not reached. I think it will
> take
> > vary long to reach such LSN, but still theoretically it can be
> possible.
> > I don't have any evidence.
> >
> >> >   One more use case for which this utility was done is as below:
> >> >   It will be used to decide that on new-standby (old-master)
> whether
> >> a full
> >> > backup is needed from
> >> >   New-master(old-standby).
> >> >   The backup is required when the data page in old-master precedes
> >> >   the last applied LSN in old-standby (i.e., new-master) at the
> >> moment
> >> >   of the failover.
> >>
> >> That's exactly what I was afraid of. Unless I miss something the
> tool
> >> is
> >> *NOT* sufficient to do this.
> >
> > You mean to say if user knows the max LSN of data pages in old-master
> and
> > last applied LSN in new master, he cannot decide whether
> > Full backup is needed? It should be straightforward decision that
> skip a
> > backup if that old-master LSN is less than the new-master (i.e., last
> > applied LSN, IOW, timeline switch LSN).
> > It was proposed as a usecase in this below mail:
> > http://www.postgresql.org/message-id/CAHGQGwHyd1fY0hF0qKh0-uKDh-
> gcbYxMOFBYVk
> > Kh4jzji-FCfg@mail.gmail.com
> 
> I guess he meant the commit hint bit problem.

True, after reading the thread mentioned by Andres, I got the reason he was
pointing why it is not sufficient.
So can it be useful if database has checksums enabled?

With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: [PATCH] add --progress option to pgbench (submission 3)
Next
From: 'Andres Freund'
Date:
Subject: Re: Review: Patch to compute Max LSN of Data Pages