On Fri, Jan 3, 2025 at 10:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> (To be clear: if this is how FreeBSD acts, then I'm afraid we already
> do have such bugs. The rmtree case is just easier to observe than a
> missed fsync.)

For what little it's worth, I'm not quite convinced yet that FreeBSD's
client isn't more broken than it needs to be. Lots of systems I
looked at have stable cookies in practice (as NFS 4 recommends),
including the one used in this report, so it seems like a more basic
problem. At the risk of being wrong on the internet, I don't see any
fundamental reason why a readdir() scan can't have no-skip,
no-duplicate, no-fail semantics for stable-cookie, no-verification
servers. And this case works perfectly with a couple of other NFS
client implementations that you and I tried.
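
To make the stable-cookie claim concrete, here is a toy model in
Python. It is only a sketch, not real NFS code: StableCookieDir and
rmtree_like_scan are names made up for this example, and the
assumption is that the server gives each entry one cookie for its
whole lifetime. Under that assumption, a batched scan that unlinks
entries as it goes neither skips nor repeats anything.

```python
class StableCookieDir:
    """Toy server (hypothetical): an entry keeps its cookie until unlinked."""

    def __init__(self, names):
        self._next_cookie = 1
        self._entries = {}  # cookie -> name
        for name in names:
            self.create(name)

    def create(self, name):
        self._entries[self._next_cookie] = name
        self._next_cookie += 1

    def unlink(self, name):
        for cookie, existing in list(self._entries.items()):
            if existing == name:
                del self._entries[cookie]
                return

    def readdir(self, cookie, count):
        """Return up to `count` (cookie, name) pairs with cookie > `cookie`."""
        later = sorted(c for c in self._entries if c > cookie)
        return [(c, self._entries[c]) for c in later[:count]]


def rmtree_like_scan(d, batch=3):
    """Scan in batches, unlinking each entry as soon as it is seen."""
    seen = []
    cookie = 0  # 0 means "start of directory" in this toy model
    while True:
        page = d.readdir(cookie, batch)
        if not page:
            return seen
        for cookie, name in page:  # resume later from the last cookie seen
            seen.append(name)
            d.unlink(name)


names = [f"f{i}" for i in range(10)]
d = StableCookieDir(names)
seen = rmtree_like_scan(d)
```

Resuming from "the last cookie I saw" stays meaningful here because
removing an entry never changes the cookie of any other entry.
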

As for systems that don't have stable cookies, well then they should
implement the cookie verification scheme and AFAICS the readdir() scan
should then fail if it can't recover, or it would expose isolation
anomalies offensive to database hacker sensibilities. I think it
should be theoretically possible to recover in some happy cases
(maybe: when enough state is still around in cache to deduplicate).
But that shouldn't be necessary on filers using e.g. ZFS or BTRFS, whose
cookies are intended to be stable. A server could also do MVCC magic,
and from a quick google, I guess NetApp WAFL might do that as it has
the concept of "READDIR expired", which smells a bit like ORA-01555:
snapshot too old.
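
To illustrate the cookie verification point above, here is another
toy sketch in Python. It is loosely modelled on the cookie verifier
idea and is not an implementation of any real server or client:
PositionalDir, BadCookie and scan_and_unlink are hypothetical names,
and the assumption is a server whose cookies are plain list offsets
that shift whenever an entry is removed.

```python
class BadCookie(Exception):
    """Stand-in for a bad-cookie style error returned by the server."""


class PositionalDir:
    """Toy server (hypothetical): cookies are list offsets, so they are unstable."""

    def __init__(self, names, verify=True):
        self._names = list(names)
        self._verifier = 1
        self._verify = verify

    def unlink(self, name):
        self._names.remove(name)
        self._verifier += 1  # any change invalidates outstanding cookies

    def readdir(self, cookie, verifier, count):
        # Cookie 0 starts a fresh scan, so there is nothing to verify yet.
        if self._verify and cookie != 0 and verifier != self._verifier:
            raise BadCookie("directory changed since cookie was issued")
        page = self._names[cookie:cookie + count]
        return page, cookie + len(page), self._verifier


def scan_and_unlink(d, batch=3):
    """Scan in batches, unlinking each entry as soon as it is seen."""
    seen, cookie, verifier = [], 0, 0
    while True:
        page, cookie, verifier = d.readdir(cookie, verifier, batch)
        if not page:
            return seen
        for name in page:
            seen.append(name)
            d.unlink(name)


names = [f"f{i}" for i in range(10)]

# Without verification the scan silently skips entries.
unverified = scan_and_unlink(PositionalDir(names, verify=False))

# With verification the same scan fails loudly instead.
try:
    scan_and_unlink(PositionalDir(names, verify=True))
    failed = False
except BadCookie:
    failed = True
```

In this model the unverified scan returns only f0, f1, f2, f6, f7 and
f8, so f3, f4, f5 and f9 are never seen, while the verified scan
raises BadCookie on its second batch. That is the "fail if it can't
recover" behaviour, as opposed to a silent skip.
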