On Tue, 2002-10-08 at 04:15, Zeugswetter Andreas SB SD wrote:
> Can the magic be, that kaio directly writes from user space memory to the
> disk ? Since in your case all transactions A-E want the same buffer written,
> the memory (not its content) will also be the same. This would automatically
> write the latest possible version of our WAL buffer to disk.
>
*Some* implementations allow for zero-copy aio, which avoids copying the
user buffer into kernel memory before the write. On heavily used systems,
that can be a large savings.

> The problem I can see offhand is how the kaio system can tell which transaction
> can be safely notified of the write, or whether the programmer is actually responsible
> for not changing the buffer until notified of completion ?

That's correct. The programmer must not change the buffer contents until
completion has been reported for that outstanding aio operation. To do
otherwise results in undefined behavior. Since some systems do allow
for zero-copy aio operations, requiring that buffers not be modified
once queued makes a lot of sense. Of course, even on systems that don't
support zero-copy, changing the buffered data prior to write completion
just seems like a bad idea to me.

Here's a quote from SGI's aio_write man page:

    If the buffer pointed to by aiocbp->aio_buf or the control block pointed
    to by aiocbp changes or becomes an illegal address prior to asynchronous
    I/O completion then the behavior is undefined. Simultaneous synchronous
    operations using the same aiocbp produce undefined results.

And on SunOS we have:

    The aiocbp argument points to an aiocb structure. If the buffer pointed
    to by aiocbp->aio_buf or the control block pointed to by aiocbp becomes
    an illegal address prior to asynchronous I/O completion, then the
    behavior is undefined.

and

    For any system action that changes the process memory space while an
    asynchronous I/O is outstanding to the address range being changed,
    the result of that action is undefined.
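
To make the "hands off the buffer until completion" rule concrete, here is a
minimal sketch in C, assuming the POSIX aio interface from <aio.h>. The
names write_and_wait and demo_buffer_reuse are hypothetical, made up for
this illustration; they are not from the man pages quoted above or from any
existing code. The sketch queues a write, waits for it with aio_suspend,
and only then touches the buffer again.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue one asynchronous write and block until it
 * completes. Returns the byte count, or -1 on error with errno set.
 * Neither buf nor the control block is touched while the request is
 * outstanding; both stay valid because we do not return before completion.
 */
ssize_t write_and_wait(int fd, void *buf, size_t len, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *const list[1] = { &cb };
    ssize_t n;
    int err;

    assert(buf != NULL);

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* we poll; no signal */

    if (aio_write(&cb) == -1)
        return -1;                              /* request was not queued */

    /* aio_suspend may return early (EINTR), so re-check the status. */
    while ((err = aio_error(&cb)) == EINPROGRESS)
        (void) aio_suspend(list, 1, NULL);

    n = aio_return(&cb);                        /* collect the final status */
    if (err != 0) {
        if (err > 0)
            errno = err;
        return -1;
    }
    return n;
}

/*
 * Hypothetical demo: one buffer, two writes. The buffer is overwritten
 * only after the first write has completed. Returns 0 on success.
 */
int demo_buffer_reuse(const char *path)
{
    char buf[4];
    char check[8];
    int ok = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    memcpy(buf, "AAAA", 4);
    if (write_and_wait(fd, buf, 4, 0) == 4) {
        /* First write is complete, so the buffer is ours again. */
        memcpy(buf, "BBBB", 4);
        if (write_and_wait(fd, buf, 4, 4) == 4 &&
            pread(fd, check, 8, 0) == 8 &&
            memcmp(check, "AAAABBBB", 8) == 0)
            ok = 1;
    }

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Calling demo_buffer_reuse("/tmp/some_scratch_file") should return 0 if both
writes landed intact. Had the second memcpy happened before the first write
completed, the result would be undefined per the man pages above. On some
systems (older glibc, for example) this needs -lrt at link time.
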
Greg