
From: Amit Kapila
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Msg-id: CAA4eK1J1sJchNAsbbhKP1DSRubcZLtQuiTjWpxdF0rc9+QoXvg@mail.gmail.com
In response to: Re: Speed up Clog Access by increasing CLOG buffers (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Fri, Sep 23, 2016 at 8:22 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 09/23/2016 03:07 PM, Amit Kapila wrote:
>>
>> On Fri, Sep 23, 2016 at 6:16 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>> On 09/23/2016 01:44 AM, Tomas Vondra wrote:
>>>>
>>>>
>>>> ...
>>>> The 4.5 kernel clearly changed the results significantly:
>>>>
>>> ...
>>>>
>>>>
>>>>
>>>> (c) Although it's not visible in the results, 4.5.5 almost perfectly
>>>> eliminated the fluctuations in the results. For example when 3.2.80
>>>> produced these results (10 runs with the same parameters):
>>>>
>>>>     12118 11610 27939 11771 18065
>>>>     12152 14375 10983 13614 11077
>>>>
>>>> we get this on 4.5.5
>>>>
>>>>     37354 37650 37371 37190 37233
>>>>     38498 37166 36862 37928 38509
>>>>
>>>> Notice how much more even the 4.5.5 results are, compared to 3.2.80.
>>>>
>>>
>>> The more I think about these random spikes in pgbench performance on
>>> 3.2.80, the more I find them intriguing. Let me show you another
>>> example (from Dilip's workload and group-update patch on 64 clients).
>>>
>>> This is on 3.2.80:
>>>
>>>   44175  34619  51944  38384  49066
>>>   37004  47242  36296  46353  36180
>>>
>>> and on 4.5.5 it looks like this:
>>>
>>>   34400  35559  35436  34890  34626
>>>   35233  35756  34876  35347  35486
>>>
>>> So the 4.5.5 results are much more even, but overall clearly below
>>> 3.2.80. How does 3.2.80 manage to do ~50k tps in some of the runs?
>>> Clearly we randomly do something right, but what is it and why doesn't
>>> it happen on the new kernel? And how could we do it every time?
>>>
>>
>> As far as I can see, you are using default values of min_wal_size,
>> max_wal_size, and the checkpoint-related parameters. Have you changed
>> the default shared_buffers setting? That can have a bigger impact.
>
>
> Huh? Where do you see me using default values?
>

I was referring to one of your scripts at http://bit.ly/2doY6ID.  I
hadn't noticed that you had changed the default values in
postgresql.conf.

> There are settings.log with a
> dump of pg_settings data, and the modified values are
>
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 3600
> effective_io_concurrency = 32
> log_autovacuum_min_duration = 100
> log_checkpoints = on
> log_line_prefix = %m
> log_timezone = UTC
> maintenance_work_mem = 524288
> max_connections = 300
> max_wal_size = 8192
> min_wal_size = 1024
> shared_buffers = 2097152
> synchronous_commit = on
> work_mem = 524288
>
> (ignoring some irrelevant stuff like locales, timezone etc.).
>
>> Using default values of the mentioned parameters can lead to
>> checkpoints in between your runs.
>
>
> So I'm using 16GB shared buffers (so with scale 300 everything fits into
> shared buffers), min_wal_size=16GB, max_wal_size=128GB, checkpoint timeout
> 1h etc. So no, there are no checkpoints during the 5-minute runs, only those
> triggered explicitly before each run.
>

Thanks for the clarification.  Do you think we should try some
different settings for the *_flush_after parameters, as those can help
in reducing spikes in writes?
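
For concreteness, something along these lines is what I mean (only an
illustrative sketch; the values below are not recommendations for your
machine and would need tuning, the parameters are the 9.6 *_flush_after
family):

    backend_flush_after = 256kB      # flush writes done directly by backends
    bgwriter_flush_after = 512kB     # flush writes done by the bgwriter
    checkpoint_flush_after = 256kB   # flush writes done by the checkpointer
    wal_writer_flush_after = 1MB     # flush WAL written by the walwriter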

>> Also, I think instead of 5 mins, read-write runs should be run for 15
>> mins to get consistent data.
>
>
> Where does the inconsistency come from?

That's what I am also curious to know.

> Lack of warmup?

Can't say, but at least we should try to rule out the possibilities.
I think one way to rule this out is to do slightly longer runs for
Dilip's test cases, and for pgbench we might need to drop and
re-create the database after each run, along the lines of the rough
sketch below.
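
As a rough sketch of what I mean for the pgbench case (the scale,
client count, and duration below are placeholders, not suggestions for
the final values):

    # re-initialize the database before every run, so each run starts
    # from the same state, then take readings from longer runs
    for run in $(seq 1 10)
    do
        dropdb --if-exists pgbench
        createdb pgbench
        pgbench -i -s 300 pgbench
        psql -d pgbench -c "CHECKPOINT"
        pgbench -c 64 -j 64 -T 900 pgbench > "results-$run.log"
    done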

> Considering how
> uniform the results from the 10 runs are (at least on 4.5.5), I claim this
> is not an issue.
>

It is quite possible that it is some kernel regression that has been
fixed in a later version.  For example, we are doing most of our tests
on cthulhu, which has a 3.10 kernel, and we generally get consistent
results.  I am not sure whether a later kernel version, say 4.5.5, is
a net win, because there is a considerable dip in performance on that
version, even though it produces quite stable results.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


