Re: Analysis of ganged WAL writes - Mailing list pgsql-hackers

From Greg Copeland
Subject Re: Analysis of ganged WAL writes
Date
Msg-id 1034080494.26053.273.camel@mouse.copelandconsulting.net
In response to Re: Analysis of ganged WAL writes  ("Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at>)
List pgsql-hackers
On Tue, 2002-10-08 at 04:15, Zeugswetter Andreas SB SD wrote:
> Can the magic be, that kaio directly writes from user space memory to the
> disk ? Since in your case all transactions A-E want the same buffer written,
> the memory (not its content) will also be the same. This would automatically
> write the latest possible version of our WAL buffer to disk.
>

*Some* implementations allow for zero-copy aio.  That is a savings.  On
heavily used systems, it can be a large savings.

> The problem I can see offhand is how the kaio system can tell which transaction
> can be safely notified of the write, or whether the programmer is actually responsible
> for not changing the buffer until notified of completion ?

That's correct.  The programmer cannot change the buffer contents until
completion has been reported for that outstanding aio operation.  To do
otherwise results in undefined behavior.  Since some systems do allow
for zero-copy aio operations, requiring that the buffers not be modified
once queued makes a lot of sense.  Of course, even on systems that don't
support zero-copy, changing the buffered data prior to write completion
just seems like a bad idea to me.
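
To make the rule concrete, here is a minimal sketch using the POSIX aio
calls (aio_write, aio_suspend, aio_error, aio_return).  The helper name
write_and_wait is made up for illustration, and nothing here is taken from
the PostgreSQL source.  The point is only that the buffer and the control
block stay untouched between queueing the write and collecting its result.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not from PostgreSQL): queue one asynchronous write
 * and block until it completes.  Returns the number of bytes written, or
 * -1 with errno set on failure. */
ssize_t write_and_wait(int fd, void *buf, size_t len, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *list[1];
    int err;
    ssize_t n;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;

    if (aio_write(&cb) != 0)
        return -1;              /* request was never queued */

    /* From here until completion, neither buf nor cb may be modified. */
    list[0] = &cb;
    while (aio_error(&cb) == EINPROGRESS)
        (void) aio_suspend(list, 1, NULL);  /* retried if interrupted */

    err = aio_error(&cb);       /* 0 on success, else an errno value */
    n = aio_return(&cb);        /* call exactly once per request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;                   /* buf may be reused only now */
}
```

A real WAL writer would presumably do other work instead of blocking in
aio_suspend, but the same ownership rule applies: the buffer belongs to
the aio system until completion has been observed.  On some systems this
needs to be linked with -lrt.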

Here's a quote from SGI's aio_write man page:
If the buffer pointed to by aiocbp->aio_buf or the control block pointed
to by aiocbp changes or becomes an illegal address prior to asynchronous
I/O completion then the behavior is undefined.  Simultaneous synchronous
operations using the same aiocbp produce undefined results.

And on SunOS we have:
The aiocbp argument points to an aiocb structure.  If the buffer pointed
to by aiocbp->aio_buf or the control block pointed to by aiocbp becomes
an illegal address prior to asynchronous I/O completion, then the
behavior is undefined.
and
For any system action that changes the process memory space while an
asynchronous I/O is outstanding to the address range being changed, the
result of that action is undefined.


Greg

