Re: Proposed LogWriter Scheme, WAS: Potential Large Performance - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Proposed LogWriter Scheme, WAS: Potential Large Performance
Msg-id 200210051826.g95IQh605752@candle.pha.pa.us
In response to Re: Proposed LogWriter Scheme, WAS: Potential Large Performance  ("Curtis Faith" <curtis@galtair.com>)
Responses Re: Proposed LogWriter Scheme, WAS: Potential Large Performance  ("Curtis Faith" <curtis@galtair.com>)
Mailing list unsubscribe - hackers isn't there?  (Mitch <mitch@doot.org>)
List pgsql-hackers
Curtis Faith wrote:
> The advantage to aio_write in this scenario is when writes cross track
> boundaries or when the head is in the wrong spot. If we write
> in reasonable blocks with aio_write the write might get to the disk
> before the head passes the location for the write.
> 
> Consider a scenario where:
> 
>     Head is at file offset 10,000.
> 
>     Log contains blocks 12,000 - 12,500
> 
>     ..time passes..
> 
>     Head is now at 12,050
> 
>     Commit occurs writing block 12,501
> 
> In the aio_write case the write would already have been done for blocks  
> 12,000 to 12,050 and would be queued up for some additional blocks up to
> potentially 12,500. So the write for the commit could occur without an
> additional rotation delay. We are talking 85 to 200 milliseconds
> delay for this rotation on a single disk. I don't know how often this
> happens in actual practice but it might occur as often as every other
> time.

So, you are saying that we may get back aio confirmation quicker than if
we issued our own write/fsync because the OS was able to slip our flush
to disk in as part of someone else's or a general fsync?

I don't buy that because it is possible our write() gets in as part of
someone else's fsync and our fsync becomes a no-op, meaning there aren't
any dirty buffers for that file.  Isn't that also possible?
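For reference, the traditional write()/fsync() commit path discussed here might look roughly like the sketch below. The helper name log_commit and its signature are hypothetical, chosen only for illustration; this is not PostgreSQL's actual WAL code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from PostgreSQL): write one log record to fd
 * and force it to stable storage.  Returns 0 on success, -1 on error.
 */
int log_commit(int fd, const void *rec, size_t len)
{
    const char *p = rec;

    while (len > 0) {
        /* write() only copies the data into kernel buffers */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing, retry */
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }

    /*
     * fsync() blocks until the kernel reports the file's dirty buffers
     * as flushed.  If an earlier flush already wrote them out, there is
     * little or nothing left for this call to do.
     */
    return fsync(fd);
}
```

A caller would open the log file once, then call log_commit(fd, record, record_len) at each commit and treat a return of 0 as "durable".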

Also, remember the kernel doesn't know where the platter rotation is
either. Only the SCSI drive can reorder the requests to match this. The
OS can group based on head location, but it doesn't know much about the
platter location, and it doesn't even know where the head is.

Also, does aio return info when the data is in the kernel buffers or
when it is actually on the disk?   

Simply, aio allows us to do the write and get notification when it is
complete.  I don't see how that helps us, and I don't see any other
advantages to aio.  To use aio, we need to find something that _can't_
be solved with more traditional Unix APIs, and I haven't seen that yet.
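To make the comparison concrete, here is a minimal sketch of the same commit using POSIX aio. The helper names aio_wait and aio_log_commit are hypothetical. As far as POSIX specifies, completion of aio_write() on a file opened without O_SYNC or O_DSYNC means the same thing as write() returning, so the sketch issues a separate aio_fsync() to get an on-disk guarantee. On older glibc versions this needs linking with -lrt.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: block until one AIO request finishes. */
static ssize_t aio_wait(struct aiocb *cb)
{
    const struct aiocb *list[1] = { cb };

    while (aio_error(cb) == EINPROGRESS) {
        if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
            return -1;
    }

    int err = aio_error(cb);
    ssize_t ret = aio_return(cb);   /* must be called once per request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return ret;
}

/*
 * Hypothetical helper: queue a log write at offset off, wait for it,
 * then queue a flush and wait for that too.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t aio_log_commit(int fd, const void *rec, size_t len, off_t off)
{
    struct aiocb wr, sync;

    memset(&wr, 0, sizeof wr);
    wr.aio_fildes = fd;
    wr.aio_buf = (void *) rec;
    wr.aio_nbytes = len;
    wr.aio_offset = off;
    if (aio_write(&wr) != 0)
        return -1;

    /* Completion here is not, by itself, a promise the data is on disk. */
    ssize_t n = aio_wait(&wr);
    if (n < 0)
        return -1;

    memset(&sync, 0, sizeof sync);
    sync.aio_fildes = fd;
    if (aio_fsync(O_SYNC, &sync) != 0)
        return -1;
    if (aio_wait(&sync) < 0)
        return -1;

    return n;
}
```

Written this way, the caller ends up waiting for the same two events as in the write()/fsync() case; any benefit would have to come from doing other work between queuing the request and waiting for it.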

This aio thing is getting out of hand.  It's like we have a hammer, and
everything looks like a nail, or a use for aio.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

