David Noel <david.i.noel@gmail.com> writes:
> I didn't have any luck with the rc script but I was able to use it to
> get a ktrace dump as root (ktrace as user pgsql doesn't seem to work).
> So hopefully that will show something(!)
The relevant part of the ktrace output is:
71502 postgres CALL unlink(0x7fffffffc130)
71502 postgres NAMI "pg_xlog/xlogtemp.71502"
71502 postgres RET unlink -1 errno 2 No such file or directory
71502 postgres CALL open(0x7fffffffc130,O_RDWR|O_CREAT|O_EXCL,S_IRUSR|S_IWUSR)
71502 postgres NAMI "pg_xlog/xlogtemp.71502"
71502 postgres RET open 3
71502 postgres CALL write(0x3,0x801a56030,0x2000)
71502 postgres GIO fd 3 wrote 4096 bytes
.... a lot of uninteresting write() calls snipped ...
71502 postgres RET write 8192/0x2000
71502 postgres CALL close(0x3)
71502 postgres RET close 0
71502 postgres CALL unlink(0x7fffffffbc60)
71502 postgres NAMI "pg_xlog/000000010000000000000001"
71502 postgres RET unlink -1 errno 2 No such file or directory
71502 postgres CALL link(0x7fffffffc130,0x7fffffffbc60)
71502 postgres NAMI "pg_xlog/xlogtemp.71502"
71502 postgres NAMI "pg_xlog/000000010000000000000001"
71502 postgres RET link -1 errno 1 Operation not permitted
71502 postgres CALL unlink(0x7fffffffc130)
71502 postgres NAMI "pg_xlog/xlogtemp.71502"
71502 postgres RET unlink 0
71502 postgres CALL open(0x7fffffffc530,O_RDWR,<unused>0x180)
71502 postgres NAMI "pg_xlog/000000010000000000000001"
71502 postgres RET open -1 errno 2 No such file or directory
This corresponds to the execution of XLogFileInit(), and what's
evidently happening is that we successfully create and zero-fill
the first xlog segment file under a temporary name, but then
the attempt to rename it into place with link() fails with EPERM.
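For anyone who wants to test the syscall sequence outside of postgres, here is a minimal sketch of the same pattern as seen in the trace: unlink, exclusive create, zero-fill, then link() into place and unlink the temp name. It is not the actual XLogFileInit() code. The function names create_zeroed_file and install_file, the 8192-byte block size, and the error handling are assumptions made for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create tmppath exclusively and fill it with
 * nblocks zeroed blocks of 8192 bytes.  Returns 0 on success, -1 on
 * failure with errno set.
 */
static int
create_zeroed_file(const char *tmppath, size_t nblocks)
{
    char    zbuf[8192];
    int     fd;
    size_t  i;

    memset(zbuf, 0, sizeof(zbuf));

    /* Remove any stale temp file; ENOENT here is expected and ignored. */
    unlink(tmppath);

    fd = open(tmppath, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    for (i = 0; i < nblocks; i++)
    {
        errno = 0;
        if (write(fd, zbuf, sizeof(zbuf)) != (ssize_t) sizeof(zbuf))
        {
            /* A short write without errno is treated as out of space. */
            int save_errno = errno ? errno : ENOSPC;

            close(fd);
            unlink(tmppath);
            errno = save_errno;
            return -1;
        }
    }

    return close(fd);
}

/*
 * Hypothetical helper: move tmppath to dstpath using link() + unlink(),
 * so that an existing dstpath is never overwritten.  Returns 0 on
 * success, -1 on failure with errno set to the link() error.
 */
static int
install_file(const char *tmppath, const char *dstpath)
{
    if (link(tmppath, dstpath) != 0)
    {
        int save_errno = errno;

        fprintf(stderr, "could not link file \"%s\" to \"%s\": %s\n",
                tmppath, dstpath, strerror(save_errno));
        unlink(tmppath);
        errno = save_errno;
        return -1;
    }

    /* The file now has two names; drop the temporary one. */
    unlink(tmppath);
    return 0;
}
```

Calling create_zeroed_file() and then install_file() with two paths inside the affected directory, as the same user, should show whether link() fails with EPERM there independently of postgres.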
This is really a WTF kind of failure, I think. The directory is
certainly writable --- it was made under our own UID, and what's
more we just managed to create the file there under its temp name.
So how can we get an EPERM failure from link()?
I think this is a kernel bug.
regards, tom lane
PS: one odd thing here is that the ereport(LOG) in
InstallXLogFileSegment isn't doing anything; otherwise we'd have gotten
a much more helpful error report about "could not link file". I don't
think we run the bootstrap mode with log_min_messages set high enough to
disable LOG messages, so why isn't it printing? Nonetheless, this error
shouldn't have occurred.