Re: Allow a per-tablespace effective_io_concurrency setting - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Allow a per-tablespace effective_io_concurrency setting
Date
Msg-id 20150903002448.GD17210@awork2.anarazel.de
In response to Re: Allow a per-tablespace effective_io_concurrency setting  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 2015-09-03 01:59:13 +0200, Tomas Vondra wrote:
> That's a bit surprising, especially considering that e_i_c=30 means ~100
> pages to prefetch if I'm doing the math right.
> 
> AFAIK queue depth for SATA drives generally is 32 (so prefetching 100 pages
> should not make a difference), 256 for SAS drives and ~1000 for most current
> RAID controllers.
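
For reference, the "~100 pages" arithmetic can be checked with a short sketch. It assumes the harmonic-series conversion that PostgreSQL applied to effective_io_concurrency at the time (target = sum of n/i for i = 1..n). The function name prefetch_target is made up for illustration and is not a PostgreSQL identifier.

```python
def prefetch_target(effective_io_concurrency: int) -> float:
    """Pages to keep prefetched for a given e_i_c value.

    Assumption: the target is n * H(n), i.e. the sum of n/i for
    i = 1..n, as in the PostgreSQL releases discussed in this thread.
    """
    n = effective_io_concurrency
    return sum(n / i for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 2, 10, 30):
        print(n, round(prefetch_target(n), 1))
```

Under that assumption e_i_c=30 comes out at roughly 120 pages, so the same order of magnitude as the ~100 quoted above.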

I think the point is that after an initial buildup we'll again only
prefetch pages in smaller increments, because we have already prefetched
most pages. The actual number of concurrent reads will then be determined
by how fast pages are processed.  A prefetch_target of 100 does *not*
mean 100 requests are continuously in flight. And even if it did, the
OS won't issue many more requests than the queue depth, so they'll just
sit in the OS queue.
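
The effect can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not PostgreSQL's actual prefetch code: every read completes a fixed io_latency ticks after it is issued, the device has no queue-depth limit, the whole target is issued up front instead of ramping up, and the consumer spends cpu_per_page ticks on each page. All names here are hypothetical.

```python
def simulate(prefetch_target, n_pages, io_latency, cpu_per_page):
    """Return the number of in-flight reads seen before each page is consumed.

    Toy model: the consumer keeps at most `prefetch_target` pages issued
    ahead of the page it is about to process.
    """
    done_at = []   # completion time of each issued read, in issue order
    now = 0
    issued = 0
    samples = []
    for consumed in range(n_pages):
        # Top up the prefetch window; after the initial burst this
        # issues only one new read per consumed page.
        while issued < n_pages and issued - consumed < prefetch_target:
            done_at.append(now + io_latency)
            issued += 1
        # Reads that have been issued but have not completed yet.
        samples.append(sum(1 for t in done_at[consumed:] if t > now))
        # Wait for the needed page if necessary, then process it.
        now = max(now, done_at[consumed]) + cpu_per_page
    return samples


if __name__ == "__main__":
    s = simulate(prefetch_target=100, n_pages=1000, io_latency=10, cpu_per_page=1)
    print("initial burst:", s[0])
    print("steady state:", s[500])
```

In this model the first sample shows the full 100 requests of the initial burst, but the steady state settles at about io_latency / cpu_per_page requests in flight (10 with the parameters above), regardless of the much larger target.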

> So instead of "How many blocks do I need to prefetch to saturate the
> devices?" you're asking "How many blocks do I need to prefetch to never
> actually wait for the I/O?"

Yes, pretty much.

> I do like this view, but I'm not really sure how we could determine the
> right value. It seems to be very dependent on hardware and workload.

Right.

> For spinning drives the speedup comes from optimizing random seeks to a more
> optimal path (thanks to NCQ/TCQ), and on SSDs thanks to using the parallel
> channels (and possibly faster access to the same block).

+ OS reordering & coalescing.

Don't forget that the OS processes the OS IO queues while the userland
process isn't scheduled - in a concurrent workload with more processes
than hardware threads that means that pending OS requests are being
issued while the query isn't actively being processed.

> I guess the best thing we could do at this level is simply keep the
> on-device queues fully saturated, no?

Well, being too aggressive can hurt throughput and latency of concurrent
processes, without being beneficial.

Andres Freund


