On Wed, Nov 29, 2006 at 09:55:41AM +0000, tomas@tuxteam.de wrote:
> > It would be interesting to know what other causes there could be for
> > short writes.
>
> Interrupted system call?
>
> [Disclaimer: I assume provisions for that are taken, I just don't know
> the code around that spot and am just offering an answer to the above
> question]
Seems unlikely. Under BSD signal semantics (which PostgreSQL uses),
there is no such thing as an "interrupted system call". When a signal
happens, the system is supposed to restart the system call
automatically.
If this were a problem, I think we'd have seen it long before now.
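
For illustration, BSD-style restart behaviour can be requested
explicitly with sigaction() and SA_RESTART. This is only a minimal
sketch: install_restarting_handler(), on_signal() and got_signal are
names made up for the example, not PostgreSQL code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler so the caller can tell a signal arrived. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void) signo;
    got_signal = 1;
}

/*
 * Install on_signal() for signo with SA_RESTART, so that a system call
 * interrupted before doing any work is restarted by the kernel instead
 * of failing with EINTR.  Returns 0 on success, -1 on error.
 */
int install_restarting_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}
```

One caveat worth keeping in mind: POSIX allows a write() that has
already transferred some data when the signal arrives to return the
partial count rather than restart, so SA_RESTART alone does not rule
out every short write.
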
> The problem arises from the fact that errno is only guaranteed to be set
> on a -1 return value. It'd be nice to have errno set on a short write
> too.
On return from a raw system call there is only one value. If it is >= 0,
that's the return value. If it is < 0, then errno is set to -result and
-1 is returned to the app. So you see, what you're suggesting isn't
possible without a completely different way of doing system calls.
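
The convention described above can be sketched in C roughly as follows.
decode_raw_result() is a hypothetical helper written for this example;
real libc wrappers do the equivalent in macros or assembly, and
typically treat only a small range of negative values as errors.

```c
#include <errno.h>

/*
 * Turn the single value coming back from the kernel into the
 * (return value, errno) pair the application sees.  Hypothetical
 * helper; the "any negative value is an error" rule is the simplified
 * version from the text above.
 */
long decode_raw_result(long raw)
{
    if (raw < 0)
    {
        errno = (int) -raw;     /* error code travels as -result */
        return -1;
    }
    return raw;                 /* success: e.g. number of bytes written */
}
```

With only one value available, a call can report either "n bytes
written" or "failed with error E", but not both at once, which is why a
short write carries no errno.
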
Other interfaces, like async I/O, have request blocks and can return
both an error status and a number of bytes.
> So the "right" answer might be to retry a write on a short write and only
> to bail out in the <=0 case (raising an "unspecified error" in the 0
> case). Ugh.
Possibly, but it'd still be nice to know what is causing the failure if
it's not disk full.
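
For reference, the retry approach suggested above might look something
like this. It is a sketch only: write_all() is a made-up name, and
reporting EIO for the zero-byte case is an arbitrary stand-in for the
"unspecified error", not what PostgreSQL does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Keep calling write() until all len bytes are written.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0)
    {
        ssize_t n = write(fd, p, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything */
            return -1;          /* real error, errno set by write() */
        }
        if (n == 0)
        {
            errno = EIO;        /* assumption: placeholder "unspecified error" */
            return -1;
        }
        /* Short write: advance past what was written and try again. */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}
```

The useful property is that if the short write was caused by something
like a full disk, the retry fails with -1 and a real errno, so the
cause is reported instead of guessed.
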
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.