Re: Increasing default value for effective_io_concurrency? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Increasing default value for effective_io_concurrency?
Msg-id 20190702080322.6yoo6rbmvg4xvo3d@development
In response to Re: Increasing default value for effective_io_concurrency?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Jul 01, 2019 at 04:32:15PM -0700, Andres Freund wrote:
>Hi,
>
>On 2019-06-29 22:15:19 +0200, Tomas Vondra wrote:
>> I think we should consider changing the effective_io_concurrency default
>> value, i.e. the guc that determines how many pages we try to prefetch in
>> a couple of places (the most important being Bitmap Heap Scan).
>
>Maybe we need improve the way it's used / implemented instead - it seems
>just too hard to determine the correct setting as currently implemented.
>

Sure, if we can improve those bits, that'd be nice. It's definitely hard
to decide what value is appropriate for a given storage system. But I'm
not sure it's something we can do easily, considering how opaque the
hardware is for us ...

I wonder 

>
>> In some cases it helps a bit, but a bit higher value (4 or 8) performs
>> significantly better. Consider for example this "sequential" data set
>> from the 6xSSD RAID system (x-axis shows e_i_c values, pct means what
>> fraction of pages matches the query):
>
>I assume that the y axis is the time of the query?
>

The y-axis is the fraction of the table matched by the query. The values in
the contingency table are query durations (average of 3 runs, but the
numbers were very close).

>How much data is this compared to memory available for the kernel to do
>caching?
>

A multiple of RAM, in all cases. The queries were hitting random subsets of
the data, and the page cache was dropped after each test, to eliminate
cross-query caching.
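
The exact commands used for these tests are not shown in this message. As a
rough illustration only, a measurement loop of that shape (run the query,
record the duration, drop the page cache, average over 3 runs) might look
like the Python sketch below. It assumes Linux and root privileges; the
names drop_page_cache, timed_runs and run_query are hypothetical and not
taken from the benchmark scripts.

```python
import os
import statistics
import time


def drop_page_cache(path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the Linux kernel to drop clean caches.

    Writing "3" drops the page cache plus dentries and inodes.
    Needs root when pointed at the real /proc file.
    """
    os.sync()
    with open(path, "w") as f:
        f.write("3\n")


def timed_runs(run_query, runs=3, drop=drop_page_cache):
    """Return the mean duration (in seconds) of `runs` calls to run_query.

    run_query is a hypothetical callable that executes the query under test.
    The cache is dropped after every run so no run benefits from pages
    read by an earlier one.
    """
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        run_query()
        durations.append(time.monotonic() - start)
        drop()
    return statistics.mean(durations)
```

Note that this only clears the kernel page cache; it does not touch
PostgreSQL's shared buffers.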

>
>>    pct         0         1         4        16        64       128
>>    ---------------------------------------------------------------
>>      1     25990     18624      3269      2219      2189      2171
>>      5     88116     60242     14002      8663      8560      8726
>>     10    120556     99364     29856     17117     16590     17383
>>     25    101080    184327     79212     47884     46846     46855
>>     50    130709    309857    163614    103001     94267     94809
>>     75    126516    435653    248281    156586    139500    140087
>>
>> compared to the e_i_c=0 case, it looks like this:
>>
>>    pct         1         4        16        64       128
>>    -----------------------------------------------------
>>      1       72%       13%        9%        8%        8%
>>      5       68%       16%       10%       10%       10%
>>     10       82%       25%       14%       14%       14%
>>     25      182%       78%       47%       46%       46%
>>     50      237%      125%       79%       72%       73%
>>     75      344%      196%      124%      110%      111%
>>
>> So for 1% of the table the e_i_c=1 is faster by about ~30%, but with
>> e_i_c=4 (or more) it's ~10x faster. This is a fairly common pattern, not
>> just on this storage system.
>>
>> The e_i_c=1 can perform pretty poorly, especially when the query matches
>> large fraction of the table - for example in this example it's 2-3x
>> slower compared to no prefetching, and higher e_i_c values limit the
>> damage quite a bit.
>
>I'm surprised the slowdown for small e_i_c values is that big - it's not
>obvious to me why that is.  Which os / os version / filesystem / io
>scheduler / io scheduler settings were used?
>

This is the system with NVMe storage, and SATA RAID:

Linux bench2 4.19.26 #1 SMP Sat Mar 2 19:50:14 CET 2019 x86_64 Intel(R)
Xeon(R) CPU E5-2620 v4 @ 2.10GHz GenuineIntel GNU/Linux

/dev/nvme0n1p1 on /mnt/data type ext4 (rw,relatime)
/dev/md0 on /mnt/raid type ext4 (rw,relatime,stripe=48)

The other system looks pretty much the same (same kernel, ext4).
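
For reference, the percentages in the second quoted table follow directly
from the durations in the first one: each value is divided by the e_i_c=0
duration for the same pct row. A minimal Python sketch that redoes the
arithmetic (the variable and function names are made up for illustration;
the numbers are copied from the quoted table):

```python
# Columns of the duration table: e_i_c = 0, 1, 4, 16, 64, 128
E_I_C = [0, 1, 4, 16, 64, 128]

# pct of table matched -> query durations, one per e_i_c value
DURATIONS = {
    1:  [25990, 18624, 3269, 2219, 2189, 2171],
    5:  [88116, 60242, 14002, 8663, 8560, 8726],
    10: [120556, 99364, 29856, 17117, 16590, 17383],
    25: [101080, 184327, 79212, 47884, 46846, 46855],
    50: [130709, 309857, 163614, 103001, 94267, 94809],
    75: [126516, 435653, 248281, 156586, 139500, 140087],
}


def relative_to_baseline(durations):
    """Express each duration as a whole percentage of the e_i_c=0 duration."""
    result = {}
    for pct, row in durations.items():
        baseline = row[0]
        result[pct] = [round(100 * d / baseline) for d in row[1:]]
    return result


if __name__ == "__main__":
    rel = relative_to_baseline(DURATIONS)
    print("pct " + "".join(f"{c:>7}" for c in E_I_C[1:]))
    for pct, row in rel.items():
        print(f"{pct:>3} " + "".join(f"{v:>6}%" for v in row))
```

Values below 100% mean the query ran faster than with prefetching
disabled, values above 100% mean it ran slower.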


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



