Re: Increasing default value for effective_io_concurrency? - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: Increasing default value for effective_io_concurrency?
Date:
Msg-id: 20190701233215.wdimoypumnshwbl5@alap3.anarazel.de
In response to: Increasing default value for effective_io_concurrency?  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: Increasing default value for effective_io_concurrency?  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
           Re: Increasing default value for effective_io_concurrency?  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Hi,

On 2019-06-29 22:15:19 +0200, Tomas Vondra wrote:
> I think we should consider changing the effective_io_concurrency default
> value, i.e. the guc that determines how many pages we try to prefetch in
> a couple of places (the most important being Bitmap Heap Scan).

Maybe we need to improve the way it's used / implemented instead - it seems
just too hard to determine the correct setting as currently implemented.


> In some cases it helps a bit, but a bit higher value (4 or 8) performs
> significantly better. Consider for example this "sequential" data set
> from the 6xSSD RAID system (x-axis shows e_i_c values, pct means what
> fraction of pages matches the query):

I assume that the y axis is the time of the query?

How much data is this compared to the memory available for the kernel to
do caching?


>    pct         0         1         4        16        64       128
>    ---------------------------------------------------------------
>      1     25990     18624      3269      2219      2189      2171
>      5     88116     60242     14002      8663      8560      8726
>     10    120556     99364     29856     17117     16590     17383
>     25    101080    184327     79212     47884     46846     46855
>     50    130709    309857    163614    103001     94267     94809
>     75    126516    435653    248281    156586    139500    140087
> 
> compared to the e_i_c=0 case, it looks like this:
> 
>    pct         1         4        16        64       128
>    -----------------------------------------------------
>      1       72%       13%        9%        8%        8%
>      5       68%       16%       10%       10%       10%
>     10       82%       25%       14%       14%       14%
>     25      182%       78%       47%       46%       46%
>     50      237%      125%       79%       72%       73%
>     75      344%      196%      124%      110%      111%
> 
> So for 1% of the table the e_i_c=1 is faster by about ~30%, but with
> e_i_c=4 (or more) it's ~10x faster. This is a fairly common pattern, not
> just on this storage system.
> 
> The e_i_c=1 can perform pretty poorly, especially when the query matches
> a large fraction of the table - in this example it's 2-3x slower
> compared to no prefetching, and higher e_i_c values limit the damage
> quite a bit.

I'm surprised the slowdown for small e_i_c values is that big - it's not
obvious to me why that is.  Which OS / OS version / filesystem / I/O
scheduler / I/O scheduler settings were used?
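
For reference, the relative figures in the quoted table can be rederived
from the raw timings. Below is a minimal Python sketch, assuming the
second table is simply each timing divided by the e_i_c=0 timing in the
same row; the names TIMINGS and relative_to_baseline are made up for
illustration, and the units of the raw numbers are not stated in the
thread.

```python
# Raw timings copied from the quoted table; the first entry in each row is
# the e_i_c=0 baseline. Units are not stated in the thread.
E_I_C = [0, 1, 4, 16, 64, 128]
TIMINGS = {
    1: [25990, 18624, 3269, 2219, 2189, 2171],
    5: [88116, 60242, 14002, 8663, 8560, 8726],
    10: [120556, 99364, 29856, 17117, 16590, 17383],
    25: [101080, 184327, 79212, 47884, 46846, 46855],
    50: [130709, 309857, 163614, 103001, 94267, 94809],
    75: [126516, 435653, 248281, 156586, 139500, 140087],
}


def relative_to_baseline(row):
    """Return each non-baseline timing as a whole percentage of row[0]."""
    baseline = row[0]
    return [round(100 * t / baseline) for t in row[1:]]


if __name__ == "__main__":
    # Header: the e_i_c values after the baseline.
    print("pct " + "".join(f"{v:>7}" for v in E_I_C[1:]))
    for pct, row in TIMINGS.items():
        cells = "".join(f"{p:>6}%" for p in relative_to_baseline(row))
        print(f"{pct:>3} {cells}")
```

Values below 100% mean the run was faster than with prefetching disabled;
values above 100% mean it was slower.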

Greetings,

Andres Freund


