Re: Failed to delete old ReorderBuffer spilled files - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Failed to delete old ReorderBuffer spilled files
Msg-id 20171122.114814.59091351.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Failed to delete old ReorderBuffer spilled files  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

At Wed, 22 Nov 2017 10:10:27 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoDkPbCNX-d_VqKrW4rDt5W5Y3=LQr7zYbbxF=uVDayt-A@mail.gmail.com>
> >> Using last changing LSN might work but I'm afraid that that fails
> >> to remove the last snap file if the crash happens at the very
> >> start of a segment.
> 
> I think it works even in that case because we can compute the
> largest WAL segment number that we need to remove by using the lsn of
> the last change in the old transaction. Am I missing something?

I was concerned about the window in ReorderBufferSerializeTXN that
can leave an empty file behind. But I had a closer look and found that
the snap files for old transactions are not left over from the previous
run; they are rebuilt from WAL records after restart. So there cannot
be an empty file there.

I'm convinced that using the LSN of the last change is the proper way
to deal with this problem.
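
For illustration, the segment arithmetic discussed above can be sketched
as follows. This is a minimal standalone sketch, not PostgreSQL code: the
type names Lsn and SegNo and both function names are hypothetical, the
segment size is passed in as a plain parameter, and it assumes spilled
changes are kept in one file per WAL segment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; PostgreSQL's own types and macros differ. */
typedef uint64_t Lsn;    /* byte position in the WAL stream */
typedef uint64_t SegNo;  /* WAL segment number */

/* Segment that contains the given LSN, for a fixed segment size in bytes. */
static SegNo
lsn_to_segno(Lsn lsn, uint64_t seg_size)
{
    assert(seg_size > 0);
    return lsn / seg_size;
}

/*
 * Number of per-segment spill files a cleanup pass would have to visit
 * for one transaction, given its first LSN and a final LSN (here: the
 * LSN of its last change).  Returns 0 if the range is empty.
 */
static uint64_t
spill_segments_to_visit(Lsn first_lsn, Lsn final_lsn, uint64_t seg_size)
{
    SegNo first = lsn_to_segno(first_lsn, seg_size);
    SegNo last = lsn_to_segno(final_lsn, seg_size);

    return (last >= first) ? (last - first + 1) : 0;
}
```

With a 16MB segment size, a transaction whose first change is in segment
0 and whose last change is in segment 2 would give three segments to
visit, so no directory scan is needed to find the upper bound.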

> >> Anyway all files of the transaction are no longer useful at that
> >> time, but it seems that the last_lsn is required to avoid
> >> directory scanning at every transaction end.
> >>
> >> Letting ReorderBufferAbortOld scan the directory and determine
> >> the first and last LSN then set to the txn would work but it
> >> might be overkill. Using the beginning LSN of the next segment
> >> of the last_change->lsn could surely work... really?
> >> (ReorderBufferRestoreCleanup doesn't complain on ENOENT.)
> >
> > Somehow I deleted excessively while editing. One more possible
> > solution is making ReorderBufferAbortOld take final LSN and
> > DecodeStandbyOp passes the LSN of XLOG_RUNNING_XACTS record to
> > it.
> >
> 
> Setting final_lsn in ReorderBufferAbortOld seems good to me but I'm
> not sure we can use the lsn of XLOG_RUNNING_XACTS record. Doesn't
> ReorderBufferRestoreCleanup() raise an error due to ENOENT if the wal

It no longer matters but the function does *not* raise an error
on ENOENT.
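
As a rough sketch of what "does not raise an error on ENOENT" means for
a per-segment cleanup loop: the code below is not the actual
ReorderBufferRestoreCleanup. The function name, the directory argument
and the "spill-%lu.snap" file name pattern are all hypothetical, and
errors are reported through the return value instead of ereport.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to remove one file per segment number in [first, last].
 * A missing file (ENOENT) is silently skipped; any other unlink
 * failure stops the loop.  Returns 0 on success, -1 on error.
 * The file name pattern is an assumption made for this sketch.
 */
static int
remove_spill_files(const char *dir, unsigned long first, unsigned long last)
{
    char path[1024];
    unsigned long seg;

    assert(dir != NULL);

    for (seg = first; seg <= last; seg++)
    {
        int n = snprintf(path, sizeof(path), "%s/spill-%lu.snap", dir, seg);

        if (n < 0 || (size_t) n >= sizeof(path))
            return -1;          /* path did not fit in the buffer */

        /* Tolerate files that were never written for this segment. */
        if (unlink(path) != 0 && errno != ENOENT)
            return -1;
    }
    return 0;
}
```

Because missing files are skipped, passing an upper bound that covers
segments without any spilled changes is harmless in this sketch; it only
costs a few extra unlink calls.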

> segment having XLOG_RUNNING_XACTS records doesn't have any changes of
> the old transaction?

Since the transaction never reaches an abort record, any larger LSN
can work as final_lsn, and the LSN of the XLOG_RUNNING_XACTS record
is guaranteed to be so. But anyway I agree that last_change->lsn is
the more appropriate choice.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


