Hi.
Here's an updated patch for the fsync problem(s).
A few points that may be worth considering:
1. I've made the ReadDir → ReadDirExtended change, but in retrospect I
don't think it's really worth it. Using ReadDir and letting it die
is probably fine. (Aside: I left ReadDirExtended static for now.)
2. Robert, are you comfortable with what fsync_pgdata() does in xlog.c?
I wasn't sure if I should move that to fd.c as well. I think it's
borderline OK for now.
3. There's a comment in fsync_fname that we need to open directories
with O_RDONLY and non-directories with O_RDWR. Does this also apply
to pre_sync_fname (i.e. sync_file_range and posix_fadvise) on any
platforms we care about? It doesn't seem so, but I'm not sure.
4. As a partial aside, why does fsync_fname use OpenTransientFile? It
looks like it should use BasicOpenFile as pre_sync_fname does. We
close the fd immediately after calling fsync anyway. Or is there
some reason I'm missing that we should use OpenTransientFile in
pre_sync_fname too?
(Aside²: it's pointless to specify S_IRUSR|S_IWUSR in fsync_fname
since we're not setting O_CREAT. Minor, so will submit separate
patch.)
5. I made walktblspc_entries use stat() instead of lstat(), so that we
can handle symlinks and directories the same way. Note that we won't
continue to follow links inside the sub-directories because we use
walkdir instead of recursing.
But this depends on S_ISDIR() returning true after stat() on Windows
with a junction. Does that work? And are the entries in pb_tblspc on
Windows actually junctions?
I've tested this with unreadable files in PGDATA and junk inside
pb_tblspc. Feedback welcome.
I have a separate patch to initdb with the corresponding changes, which
I will post after dinner and a bit more testing.
-- Abhijit