Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule? - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?
Msg-id alpine.DEB.2.10.1507041746230.6474@sto
In response to Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
>>>> In summary, the X^1.5 correction seems to work pretty well. It doesn't
>>>> completely eliminate the problem, but it makes it a lot better.

I've looked at the maths.

I think that the load is distributed as the derivative of this function,
that is (1.5 * x ** 0.5): it starts at 0 but very quickly reaches 0.5, it
passes 1.0 (the average load) at around 40% progress, and ends up at 1.5,
that is, the finishing load is 1.5 times the average load, just before
fsyncing files. This looks like a recipe for a bad time: I would say this is
too large an overload. I would suggest a much lower value, say around 1.1...
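
To make the arithmetic above concrete, here is a small sketch in Python. It
assumes the correction maps checkpoint progress x in [0, 1] to x ** exponent,
as the quoted summary says; the function names relative_load and crossover
are hypothetical and do not come from the patch. For exponent 1.5 the load
crosses the average at (1/1.5) ** 2, about 0.44, and finishes at 1.5 times
the average; for 1.1 it finishes at only 1.1 times the average.

```python
def relative_load(progress: float, exponent: float = 1.5) -> float:
    """Instantaneous write load relative to the average load.

    This is the derivative of progress ** exponent, under the assumption
    that the correction is a plain power function of checkpoint progress.
    """
    return exponent * progress ** (exponent - 1.0)


def crossover(exponent: float = 1.5) -> float:
    """Progress at which the relative load equals 1.0 (the average).

    Solves exponent * x ** (exponent - 1) == 1 for x; requires exponent > 1.
    """
    return (1.0 / exponent) ** (1.0 / (exponent - 1.0))


if __name__ == "__main__":
    # Compare the proposed exponent with the lower value suggested above.
    for exponent in (1.5, 1.1):
        print(f"exponent {exponent}:")
        for progress in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
            load = relative_load(progress, exponent)
            print(f"  progress {progress:4.2f} -> relative load {load:.3f}")
        print(f"  load reaches the average at progress {crossover(exponent):.3f}")
```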

The other issue with this function is that it can only degrade
performance, by disrupting the write distribution, if someone has WAL on a
different disk. As I understand it, this correction only makes sense if the
WAL and the data are on the same disk. This really suggests a GUC.

> I have ran some tests with this patch and the detailed results of the 
> runs are attached with this mail.

I do not really understand the aggregated figures in the attached files.

I guess that maybe between "end" markers there is a summary of figures 
collected for 28 backends over 300-second runs (?), but I do not know what 
the min/max/avg/sum/count figures are about.

> I thought the patch should show difference if I keep max_wal_size to 
> somewhat lower or moderate value so that checkpoint should get triggered 
> due to wal size, but I am not seeing any major difference in the writes 
> spreading.

I'm not sure I understand your point. I would say that with pgbench running
at full speed the disk is always busy writing as much as possible, either
checkpoint writes or WAL writes, so the write load as such should not be
that different anyway?

I understood that the point of the patch is to check whether there is a
tps dip or not when the checkpoint begins, but I'm not sure how this can
be inferred from the large amount of aggregated data you sent, and from my
recent tests the tps is very variable anyway on HDD.

-- 
Fabien.


