Re: O_DIRECT setting - Mailing list pgsql-performance

From Neil Conway
Subject Re: O_DIRECT setting
Msg-id 1095911914.22485.414.camel@localhost.localdomain
In response to O_DIRECT setting  (Guy Thornley <guy@esphion.com>)
List pgsql-performance
On Mon, 2004-09-20 at 17:57, Guy Thornley wrote:
> According to the manpage, O_DIRECT implies O_SYNC:
>
>         File I/O is done directly to/from user space buffers.  The I/O is
>         synchronous, i.e., at the completion of the read(2) or write(2)
>         system call, data is guaranteed to have been transferred.

This seems like it would be a rather large net loss. PostgreSQL already
structures writes so that the writes we need to hit disk immediately
(WAL records) are fsync()'ed -- the kernel is given more freedom to
schedule how other writes are flushed from the cache. Also, my
recollection is that O_DIRECT disables readahead -- if that's
correct, that's not what we want either.
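
To make the contrast concrete, here is a minimal sketch in C of the three
write paths being compared. It is not PostgreSQL's actual code: the function
names (write_durable, write_lazy, write_direct) and the 4096-byte BLOCK_SIZE
are assumptions for illustration only, and the real O_DIRECT alignment
requirement depends on the kernel, filesystem and device.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096        /* assumed alignment, not a universal value */

/* WAL-style write: buffered write() plus fsync(), durable on success. */
int write_durable(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;
    int ok = write(fd, buf, len) == (ssize_t) len && fsync(fd) == 0;
    close(fd);
    return ok ? 0 : -1;
}

/* Data-page-style write: buffered, no fsync(); the kernel decides when
 * the dirty page is flushed from its cache. */
int write_lazy(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    int ok = write(fd, buf, len) == (ssize_t) len;
    close(fd);
    return ok ? 0 : -1;
}

/* O_DIRECT write: bypasses the page cache, so the buffer and the transfer
 * size must be suitably aligned. Pads the payload to one aligned block.
 * May fail on filesystems that do not support O_DIRECT. */
int write_direct(const char *path, const void *buf, size_t len)
{
#ifdef O_DIRECT
    void *aligned = NULL;
    if (len > BLOCK_SIZE)
        return -1;
    if (posix_memalign(&aligned, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(aligned, 0, BLOCK_SIZE);
    memcpy(aligned, buf, len);
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0) {
        free(aligned);
        return -1;
    }
    int ok = write(fd, aligned, BLOCK_SIZE) == (ssize_t) BLOCK_SIZE;
    close(fd);
    free(aligned);
    return ok ? 0 : -1;
#else
    (void) path; (void) buf; (void) len;
    return -1;                 /* platform has no O_DIRECT */
#endif
}
```

Only write_durable() waits for the disk; write_lazy() leaves scheduling to
the kernel, which is the freedom an O_DIRECT-everywhere setting would give up.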

BTW, using O_DIRECT has been discussed a few times in the past. Have you
checked the list archives? (for both -performance and -hackers)

> Would people be interested in a performance benchmark?

Sure -- I'd definitely be curious, although as I said I'm skeptical it's
a win.

> I need some benchmark tips :)

Some people have noted that it can be difficult to use contrib/pgbench
to get reproducible results -- you might want to look at Jan's TPC-W
implementation or the OSDL database benchmarks:

http://pgfoundry.org/projects/tpc-w-php/
http://www.osdl.org/lab_activities/kernel_testing/osdl_database_test_suite/

> Incidentally, postgres heap files suffer really, really bad fragmentation,
> which affects sequential scan operations (VACUUM, ANALYZE, REINDEX ...)
> quite drastically. We have in-house patches that somewhat alleviate this,
> but they are not release quality.

Can you elaborate on these "in-house patches"?

-Neil


