On 2021-Sep-20, Robert Haas wrote:
> I was thinking that this might increase the number of directory scans
> by a pretty large amount when we repeatedly catch up, then 1 new file
> gets added, then we catch up, etc.
I was going to say that perhaps we can avoid repeated scans by keeping a
bitmap of future files found by a scan: whenever we do have to scan, we
record the presence of the next (say) 64 files in our timeline, and we
only scan again when we need to archive a file that wasn't present the
last time we scanned (a rough sketch of what I mean is at the end of
this message).  However:
> But I guess your thought process is that such directory scans, even if
> they happen many times per second, can't really be that expensive,
> since the directory can't have much in it. Which seems like a fair
> point. I wonder if there are any situations in which there's not much
> to archive but the archive_status directory still contains tons of
> files.
(If we take this stance, which seems reasonable to me, then we don't
need to optimize.)  But perhaps we should complain if we find extraneous
files in archive_status -- then it'd be on the users' heads not to leave
tons of files there that would slow down the scan.
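
To make that a bit more concrete, here is a rough standalone sketch (not
actual backend code) of what I have in mind: one readdir() pass over
archive_status fills a 64-bit bitmap saying which of the next 64
segments already have a .ready file, and anything that doesn't look like
a status file gets a warning.  It assumes the default 16MB segment size,
and the names scan_archive_status() and segno_from_name() are made up
for illustration; real code would also have to accept timeline-history
and backup status files instead of complaining about them.

#include <dirent.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SEGS_PER_XLOGID 256     /* 4GB / 16MB, the default segment size */

/* Map "TTTTTTTTXXXXXXXXYYYYYYYY" to a sequential segment number. */
static bool
segno_from_name(const char *name, uint64_t *segno)
{
    unsigned int tli, log, seg;

    if (sscanf(name, "%8X%8X%8X", &tli, &log, &seg) != 3)
        return false;
    (void) tli;                 /* timeline not needed for this sketch */
    *segno = (uint64_t) log * SEGS_PER_XLOGID + seg;
    return true;
}

/*
 * Scan archive_status once.  Bit i of the result says whether segment
 * (next_segno + i) was already marked .ready at scan time.  Unexpected
 * entries just get a warning, as suggested above.
 */
static uint64_t
scan_archive_status(const char *dir, uint64_t next_segno)
{
    uint64_t        found = 0;
    DIR            *d = opendir(dir);
    struct dirent  *de;

    if (d == NULL)
        return 0;

    while ((de = readdir(d)) != NULL)
    {
        size_t      len = strlen(de->d_name);
        uint64_t    segno;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        /* expect "<24 hex chars>.ready" or "<24 hex chars>.done" */
        if (len == 24 + 6 && strcmp(de->d_name + 24, ".ready") == 0 &&
            segno_from_name(de->d_name, &segno))
        {
            if (segno >= next_segno && segno < next_segno + 64)
                found |= UINT64_C(1) << (segno - next_segno);
        }
        else if (!(len == 24 + 5 && strcmp(de->d_name + 24, ".done") == 0))
            fprintf(stderr, "WARNING: unexpected file \"%s\" in %s\n",
                    de->d_name, dir);
    }
    closedir(d);
    return found;
}

The archiver would then consult the bitmap after each segment and only
call scan_archive_status() again when it reaches a segment whose bit
isn't set.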
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)