Re: Allow a per-tablespace effective_io_concurrency setting - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Allow a per-tablespace effective_io_concurrency setting
Date
Msg-id 20150902223812.GE8555@awork2.anarazel.de
In response to Re: Allow a per-tablespace effective_io_concurrency setting  (Greg Stark <stark@mit.edu>)
Responses Re: Allow a per-tablespace effective_io_concurrency setting  (Greg Stark <stark@mit.edu>)
Re: Allow a per-tablespace effective_io_concurrency setting  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
On 2015-09-02 19:49:13 +0100, Greg Stark wrote:
> I can take the blame for this formula.
> 
> It's called the "Coupon Collector Problem". If you get a random
> coupon from a set of n possible coupons, how many random coupons would
> you have to collect before you expect to have at least one of each?
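
(For reference, the expected number of draws in that model works out to
n * H_n, where H_n is the n-th harmonic number 1 + 1/2 + ... + 1/n. A
minimal sketch of that computation - the function name is made up for
illustration and this isn't necessarily the exact code in the patch:

#include <stdio.h>

/*
 * Expected number of random draws needed to see each of n distinct
 * coupons at least once: n * H_n, where H_n = 1 + 1/2 + ... + 1/n.
 */
static double
coupon_collector_expectation(int n)
{
    double  sum = 0.0;
    int     i;

    for (i = 1; i <= n; i++)
        sum += (double) n / (double) i;

    return sum;
}

int
main(void)
{
    int     n;

    for (n = 1; n <= 10; n++)
        printf("n = %2d -> expected draws = %.2f\n",
               n, coupon_collector_expectation(n));
    return 0;
}
)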

My point is that that's simply the wrong way to model prefetching.
Prefetching can be massively beneficial even if you only have a single
platter! Even if there were no queues at the hardware or OS level!
Concurrency isn't the right way to look at prefetching.

You need to prefetch far enough ahead that you'll never block on reading
heap pages - and that's only the case if processing the next N heap
blocks takes longer than the prefetch of the (N+1)th page. That doesn't
mean there continuously have to be N+1 prefetches in progress - in fact
that will often only be the case for the first few, after which you
hopefully are bottlenecked on CPU.
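
Concretely, on the OS level this boils down to issuing
posix_fadvise(POSIX_FADV_WILLNEED) some distance D ahead of the block
currently being processed, with D chosen so that processing D blocks
takes at least as long as one read. A minimal sketch - the names and the
fixed prefetch distance are illustrative assumptions, not code from the
patch:

#define _XOPEN_SOURCE 600

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 8192

/*
 * Keep the advise calls D blocks ahead of processing, where D is picked
 * so that io_latency <= D * per-block processing time.
 */
static void
scan_with_prefetch(int fd, long nblocks, long prefetch_distance)
{
    char    buf[BLOCK_SIZE];
    long    blkno;

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        long    target = blkno + prefetch_distance;

        /* Hint the kernel about the block we'll need soon. */
        if (target < nblocks)
            (void) posix_fadvise(fd, target * (off_t) BLOCK_SIZE,
                                 BLOCK_SIZE, POSIX_FADV_WILLNEED);

        /* Read and process the current block. */
        if (pread(fd, buf, BLOCK_SIZE, blkno * (off_t) BLOCK_SIZE) < 0)
        {
            perror("pread");
            exit(1);
        }
        /* ... per-block work happens here ... */
    }
}

The point is just the shape of the loop: the advise call runs ahead of
the read by a fixed distance, so the read (hopefully) never blocks.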

If you additionally take into account hardware realities where you have
multiple platters, multiple spindles, command queueing, etc., that's even
more true. A single rotation of a single platter with command queueing
can often read several non-consecutive blocks if they're on a similar
track.

- Andres


