Hi,

On 2023-04-09 13:55:33 +1200, Thomas Munro wrote:
> I think that particular thing might relate to modifications of the
> user buffer while a write is in progress (breaking btrfs's internal
> checksums). I don't think we should ever do that ourselves (not least
> because it'd break our own checksums). We lock the page during the
> write so no one can do that, and then we sleep in a synchronous
> syscall.

Oh, but we actually *do* modify pages while IO is going on. I wonder if you
hit the jackpot here. The content lock doesn't prevent hint bit
writes. That's why we copy the page to temporary memory when computing
checksums.
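
To illustrate why the copy matters, here's a toy, single-threaded sketch. It
is not the actual PostgreSQL code: the names (toy_checksum,
checksum_private_copy, set_hint_bit, demo), the 8192 byte page size and the
FNV-1a hash are all made up for the example, and the "concurrent" hint bit
write is simulated by simply flipping a bit after the checksum was computed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192          /* assumed page size, for illustration only */

/* Toy checksum (FNV-1a), standing in for a real page checksum. */
static uint32_t
toy_checksum(const unsigned char *page)
{
    uint32_t    h = 2166136261u;

    for (size_t i = 0; i < PAGE_SIZE; i++)
    {
        h ^= page[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Copy the shared page into private memory and checksum the copy. The
 * caller is expected to hand `copy`, not `shared`, to the write.
 */
static uint32_t
checksum_private_copy(const unsigned char *shared, unsigned char *copy)
{
    assert(shared != NULL && copy != NULL);
    memcpy(copy, shared, PAGE_SIZE);
    return toy_checksum(copy);
}

/* Stand-in for a hint bit write that happens while the IO is in flight. */
static void
set_hint_bit(unsigned char *shared, size_t off)
{
    shared[off] |= 0x01;
}

/*
 * Returns 1 if the buffer handed to the write still matches the checksum
 * computed before the hint bit write, 0 otherwise.
 */
static int
demo(int use_copy)
{
    static unsigned char shared[PAGE_SIZE];
    static unsigned char copy[PAGE_SIZE];
    const unsigned char *iobuf;
    uint32_t    sum;

    memset(shared, 0, PAGE_SIZE);

    if (use_copy)
    {
        sum = checksum_private_copy(shared, copy);
        iobuf = copy;           /* write goes out from the private copy */
    }
    else
    {
        sum = toy_checksum(shared);
        iobuf = shared;         /* write goes out from the shared page */
    }

    set_hint_bit(shared, 100);  /* page changes "during" the write */

    return toy_checksum(iobuf) == sum;
}
```

With the private copy, the buffer handed to the write can't change underneath
it, so demo(1) returns 1. Writing straight from the shared page, demo(0)
returns 0: the bytes no longer match the checksum computed a moment earlier.
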
I think we should modify the test to enable checksums - if the problem goes
away, then it's likely to be related to modifying pages while an O_DIRECT
write is ongoing...

Greetings,
Andres Freund