Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: .ready and .done files considered harmful
Date
Msg-id 20210907.174208.938167028563322823.horikyota.ntt@gmail.com
In response to Re: .ready and .done files considered harmful  (Dipesh Pandit <dipesh.pandit@gmail.com>)
Responses Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
At Fri, 3 Sep 2021 18:31:46 +0530, Dipesh Pandit <dipesh.pandit@gmail.com> wrote in 
> Hi,
> 
> Thanks for the feedback.
> 
> > Which approach do you think we should use?  I think we have decent
> > patches for both approaches at this point, so perhaps we should see if
> > we can get some additional feedback from the community on which one we
> > should pursue further.
> 
> In my opinion both approaches have benefits over the current implementation.
> I think the keep-trying-the-next-file approach handles all of the rare and
> specific scenarios that require us to force a directory scan to archive the
> desired files. In addition, with the recent change to force a directory
> scan at checkpoint, we can avoid an infinite wait for a file that is still
> missed despite handling the special scenarios. It is also more efficient in
> extreme scenarios, as discussed in this thread. However, the
> multiple-files-per-readdir approach is cleaner, with the resilience of the
> current implementation.
> 
> I agree that we should decide on which approach to pursue further based on
> additional feedback from the community.


I was thinking that the multiple-files approach would work efficiently,
but the patch still runs directory scans every 64 files.  As Robert
mentioned, it is still O(N^2).  I'm not sure of the reason for the
limit, but if it is to lower memory consumption or the cost of sorting,
we can resolve that issue by taking the trying-the-next approach,
ignoring the case of having many gaps (discussed below).  If it is to
cause voluntary checking of out-of-order files, almost the same can be
achieved by running directory scans every 64 files in the
trying-the-next approach (and we would suffer O(N^2) again).  On the
other hand, if archiving is delayed by several segments, the
multiple-files method might reduce the cost of scanning the status
directory, but that won't matter since the directory contains only a
few files.  (It might be better not to take the trying-the-next path
if a directory scan finds only a few files.)  The multiple-files
approach reduces the number of directory scans if there are many gaps
in the WAL file sequence.  Although theoretically the last
max_backend (+alpha?) segments could be written out of order, I
suppose that in reality gaps appear only among the several latest
files.  I'm not sure, though..
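To illustrate the O(N^2) concern, here is a simplified cost model (not the
actual pgarch.c code): if a backlog of N .ready files is drained in batches
of 64 and each batch begins with a full scan of the remaining status
entries, the total number of directory entries examined grows quadratically
in N.

```python
def total_entries_scanned(n_ready, batch=64):
    """Model the cost of rescanning the status directory every
    `batch` archived files.  Each scan reads all remaining .ready
    entries (simplification: .done files are ignored)."""
    scanned = 0
    remaining = n_ready
    while remaining > 0:
        scanned += remaining                 # one full directory scan
        remaining -= min(batch, remaining)   # archive one batch of files
    return scanned

# Total work grows roughly as N^2 / (2 * batch):
for n in (64, 640, 6400):
    print(n, total_entries_scanned(n))
```

With a backlog ten times larger, the scan cost grows about a hundredfold,
which is the quadratic behavior Robert pointed out.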

In short, the trying-the-next approach seems to me to be the way to
go, since it is simpler but can cover the possible failures with
almost the same measures as the multiple-files approach.
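The trying-the-next approach relies on WAL segment names being predictable.
A rough sketch of the idea (assuming the default 16 MB segment size and
ignoring timeline switches, history files, and backup files; pick_next is a
hypothetical helper, not the actual pgarch_readyXlog):

```python
import os

# With 16 MB segments there are 2^32 / 2^24 = 256 segments per "xlogid",
# i.e. the last 8 hex digits of a segment name wrap at 0x100.
XLOG_SEGS_PER_ID = 0x100000000 // (16 * 1024 * 1024)

def next_wal_segment(seg):
    """Compute the successor of a 24-hex-digit WAL segment name
    (8-digit timeline + 8-digit xlogid + 8-digit segment)."""
    tli = seg[:8]
    log, sub = int(seg[8:16], 16), int(seg[16:24], 16)
    segno = log * XLOG_SEGS_PER_ID + sub + 1
    return '%s%08X%08X' % (tli, segno // XLOG_SEGS_PER_ID,
                           segno % XLOG_SEGS_PER_ID)

def pick_next(status_dir, last_archived):
    """Probe for the successor's .ready file first; fall back to a
    full directory scan only when the expected file is absent."""
    candidate = next_wal_segment(last_archived)
    if os.path.exists(os.path.join(status_dir, candidate + '.ready')):
        return candidate
    # Fallback: scan and return the oldest .ready entry, if any.
    ready = sorted(f[:-6] for f in os.listdir(status_dir)
                   if f.endswith('.ready'))
    return ready[0] if ready else None
```

In the common case this replaces a full readdir with a single stat of the
predicted file name, which is why the approach avoids the O(N^2) scans.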

> > The problem I see with this is that pgarch_archiveXlog() might end up
> > failing.  If it does, we won't retry archiving the file until we do a
> > directory scan.  I think we could try to avoid forcing a directory
> > scan outside of these failure cases and archiver startup, but I'm not
> > sure it's really worth it.  When pgarch_readyXlog() returns false, it
> > most likely means that there are no .ready files present, so I'm not
> > sure we are gaining a whole lot by avoiding a directory scan in that
> > case.  I guess it might help a bit if there are a ton of .done files,
> > though.
> 
> Yes, I think it will be useful when we have a bunch of .done files and
> the frequency of .ready files is such that the archiver enters its wait
> state before the next WAL file is ready for archival.
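The failure case being discussed follows from the per-file archiving
protocol: the .ready status file is only renamed to .done after
archive_command succeeds, so a failed file keeps its .ready marker and must
be picked up again later.  A simplified sketch of that protocol (not the
actual pgarch.c code; for illustration, archive_command here takes
Python-style {path}/{file} placeholders instead of PostgreSQL's %p/%f):

```python
import os
import subprocess

def archive_one(xlog_path, status_dir, archive_command):
    """Run archive_command for one WAL file, then rename its .ready
    status file to .done on success.  On failure the .ready file is
    left in place so the file is retried on a later pass."""
    seg = os.path.basename(xlog_path)
    cmd = archive_command.format(path=xlog_path, file=seg)
    if subprocess.call(cmd, shell=True) == 0:
        os.rename(os.path.join(status_dir, seg + '.ready'),
                  os.path.join(status_dir, seg + '.done'))
        return True
    return False
```

Since successes accumulate as .done files in the same directory, a status
directory with a large .done backlog makes every fallback scan more
expensive, which is the case mentioned above.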
> 
> > I agree, but it should probably be something like DEBUG3 instead of
> > LOG.
> 
> I will update it in the next patch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


