Thomas Munro <thomas.munro@gmail.com> writes:
> we have a page at offset 638976, and we can find all system calls that
> touched that offset:
> [pid 26031] 23:26:48.521123 pwritev(50,
> [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> iov_len=8192}], 1, 638976) = 8192
> [pid 26040] 23:26:48.568975 pwrite64(5,
> "\0\0\0\0\0Nj\1\0\0\0\0\240\3\300\3\0 \4
> \0\0\0\0\340\2378\0\300\2378\0"..., 8192, 638976) = 8192
> [pid 26040] 23:26:48.593157 pread64(6,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> 8192, 638976) = 8192

Boy, it's hard to look at that trace and not call it a filesystem bug.
Given the apparent dependency on COW, I wonder if this has something
to do with getting confused about which copy is current?

Another thing that struck me is that the two calls from pid 26040
are issued on different FDs. I checked the strace log and verified
that these do both refer to "base/5/16384". It looks like there was
a cache flush at about 23:26:48.575023 that caused 26040 to close
and later reopen all its database relation FDs. Maybe that is
somehow contributing to the filesystem's confusion? And more to the
point, could that explain why other O_DIRECT users aren't up in arms
over this bug? Maybe they don't switch FDs as readily as we do.
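
For anyone who wants to poke at the FD-switching theory, here is a rough
sketch of the access pattern in the trace: zero-extend a page, overwrite it
through one FD, close that FD, then read the page back through a freshly
opened FD, all with O_DIRECT. This is only an illustration, not a confirmed
reproducer. The function name and the 0x01 fill pattern are made up for the
example, it runs in a single process rather than two backends, and it
assumes Linux with a filesystem that accepts O_DIRECT (pass direct=False
elsewhere).

```python
import mmap
import os

PAGE = 8192       # block size seen in the trace
OFFSET = 638976   # offset of the page in question (78 * 8192)


def write_then_read_across_fds(path, direct=True):
    """Return True if a page written via one FD reads back intact via another.

    Hypothetical helper for illustration; not taken from PostgreSQL.
    """
    flags = os.O_DIRECT if direct else 0

    # Anonymous mmaps are page-aligned, which O_DIRECT I/O requires.
    zeros = mmap.mmap(-1, PAGE)          # starts out zero-filled
    data = mmap.mmap(-1, PAGE)
    data.write(b"\x01" * PAGE)           # arbitrary nonzero page contents
    back = mmap.mmap(-1, PAGE)

    # Step 1: extend the file with a zero page (the pwritev in the trace).
    fd = os.open(path, os.O_RDWR | os.O_CREAT | flags, 0o600)
    try:
        os.pwritev(fd, [zeros], OFFSET)
    finally:
        os.close(fd)

    # Step 2: overwrite the page with real contents, then close the FD,
    # standing in for the cache flush that closed the relation FDs.
    fd = os.open(path, os.O_RDWR | flags)
    try:
        os.pwrite(fd, data, OFFSET)
    finally:
        os.close(fd)

    # Step 3: read the same page back through a newly opened FD.
    fd = os.open(path, os.O_RDWR | flags)
    try:
        nread = os.preadv(fd, [back], OFFSET)
    finally:
        os.close(fd)

    return nread == PAGE and back[:] == data[:]
```

A False result on a COW filesystem with direct=True would correspond to the
zero-filled pread64 result in the trace. A True result proves nothing, since
timing and concurrency between separate processes may well matter.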

regards, tom lane