Re: [HACKERS] lseek/read/write overhead becomes visible at scale .. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..
Date
Msg-id 20170124185945.zcyfs4pn65knfhq3@alap3.anarazel.de
In response to Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On 2017-01-24 15:36:13 -0300, Alvaro Herrera wrote:
> Tobias Oberstein wrote:
> 
> > I am benchmarking IOPS, and while doing so, it becomes apparent that at
> > these scales it does matter _how_ IO is done.
> > 
> > The most efficient way is libaio. I get 9.7 million/sec IOPS with low CPU
> > load. Using any synchronous IO engine is slower and produces higher load.
> > 
> > I do understand that switching to libaio isn't going to fly for PG
> > (completely different approach).
> 
> Maybe it is possible to write a new f_smgr implementation (parallel to
> md.c) that uses libaio.  There is no "seek" in that interface, at least,
> though the interface does assume that the implementation is blocking.

For it to be beneficial we'd need to redesign the IO stack above that so
much that it'd be basically unrecognizable (since we'd need to
actually use async IO for it to pay off). Using libaio IIRC still
requires O_DIRECT, so we'd have to take more care with ordering of
writeback etc. too - we got closer with 9.6, but we're still far away
from it. Besides that, it's also not always clear when AIO would be
beneficial, since a lot of the synchronous IO is actually synchronous
for a reason.
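
For reference, the synchronous baseline named in the subject line can be sketched roughly as below: reading one block with lseek() followed by read() costs two syscalls, while pread() does the same in one and leaves the file position alone. This is a standalone illustration, not PostgreSQL code; BLOCK_SIZE and both helper names are made up for the example.

```c
#define _XOPEN_SOURCE 700   /* for pread() under strict compiler modes */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical block size for this sketch (8 kB). */
#define BLOCK_SIZE 8192

/* Two syscalls per block: set the file offset, then read from it. */
ssize_t
read_block_seek(int fd, unsigned blockno, char *buf)
{
    off_t off = (off_t) blockno * BLOCK_SIZE;

    assert(buf != NULL);
    if (lseek(fd, off, SEEK_SET) != off)
        return -1;
    return read(fd, buf, BLOCK_SIZE);
}

/* One syscall per block: pread() takes the offset as an argument and
 * does not change the descriptor's file position. Still blocking. */
ssize_t
read_block_pread(int fd, unsigned blockno, char *buf)
{
    assert(buf != NULL);
    return pread(fd, buf, BLOCK_SIZE, (off_t) blockno * BLOCK_SIZE);
}
```

Both helpers block until the data is available, so this only trims syscall count; it says nothing about the async redesign discussed above.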

Andres


