Re: Race between KeepFileRestoredFromArchive() and restartpoint - Mailing list pgsql-hackers

From Don Seiler
Subject Re: Race between KeepFileRestoredFromArchive() and restartpoint
Date
Msg-id CAHJZqBBMjeL3xi3Yr17_uSNjNvJJdXBKgrWnuRAHJec9usA1sA@mail.gmail.com
In response to Re: Race between KeepFileRestoredFromArchive() and restartpoint  (David Steele <david@pgmasters.net>)
Responses Re: Race between KeepFileRestoredFromArchive() and restartpoint
List pgsql-hackers
On Tue, Aug 2, 2022 at 10:01 AM David Steele <david@pgmasters.net> wrote:

> > That makes sense.  Each iteration of the restartpoint recycle loop has a 1/N
> > chance of failing.  Recovery adds >N files between restartpoints.  Hence, the
> > WAL directory grows without bound.  Is that roughly the theory in mind?
>
> Yes, though you have formulated it better than I had in my mind.
>
> Let's see if Don can confirm that he is seeing the "could not link file"
> messages.

During my latest incident, there was only one occurrence:

could not link file "pg_wal/xlogtemp.18799" to "pg_wal/000000010000D45300000010": File exists

After that, WAL restore/recovery seemed to continue just fine. It would keep going until the pg_wal volume ran out of space unless I was manually rm'ing already-recovered WAL files on the side.
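
For what it's worth, the 1/N theory quoted above can be played with in a toy simulation. This is only a rough sketch and not PostgreSQL code: the function name `simulate_wal_dir` and its parameters are made up for illustration, every file in the directory is treated as recyclable, and it assumes that a single failed iteration ends the recycle loop for that restartpoint, which is one reading of the theory rather than something confirmed here.

```python
import random


def simulate_wal_dir(fail_prob, added_per_cycle, cycles, seed=0):
    """Return the WAL file count after each restartpoint (toy model)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wal_files = 0
    history = []
    for _ in range(cycles):
        # Recovery restores this many segments between two restartpoints.
        wal_files += added_per_cycle
        # Restartpoint: try to recycle each old file in turn.
        recycled = 0
        for _ in range(wal_files):
            if rng.random() < fail_prob:
                break  # ASSUMPTION: one failure ends this recycle pass
            recycled += 1
        wal_files -= recycled
        history.append(wal_files)
    return history


if __name__ == "__main__":
    n = 10  # hypothetical N from the quoted theory
    healthy = simulate_wal_dir(0.0, 2 * n, 2000)
    racy = simulate_wal_dir(1.0 / n, 2 * n, 2000)
    print("no failures: ", healthy[-1], "files left")
    print("1/N failures:", racy[-1], "files left")
```

Under that assumption, a pass recycles about N-1 files on average before it hits a failure, so adding more than N files between restartpoints leaves a surplus every cycle and the count keeps climbing.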

--
Don Seiler
www.seiler.us
