Hi Luca,
(I tried to reproduce your tests, but I got similar results across the different checkpoint_completion_target values.)
The rest is inline below:
On 12/07/2019 12:04, Luca Ferrari wrote:
>
> shared_buffers = 1 GB
> checkpoint_timeout = 5 min
>
> I've created a pgbench database as follows (around 4.5 GB):
> % pgbench -i -s 300 -F 100 --foreign-keys --unlogged-tables -h
> 127.0.0.1 -U luca pgbench
>
> and I've tested three times (each time after a restart) with the following:
> % pgbench -T 600 -j 4 -c 4 -h 127.0.0.1 -U luca -P 60 pgbench
>
>
> Since tables are unlogged, I was expecting no much difference in
> setting checkpoint_completion_target, but I got (average results):
> - checkpoint_completion_target = 0.1 ==> 755 tps
> - checkpoint_completation_target = 0.5 ==> 767 tps
> - checkpoint_completion_target = 0.9 ==> 681 tps
Unlogged tables are not written to WAL, so checkpoints should not come into the picture (unless something else is
writing data).
>
> so while there is not a big different in the first two cases, it seems
> throttling I/O reduces the tps, and I don't get why. Please note that
> there is some small activity while benchmarking, and that's why I ran
> at least three tests for each setting.
It is not a good idea to have anything else running in the background during a benchmark.
Also, it is always a good idea to run tests multiple times, and I think that 3 is the bare minimum.
You want to make sure your tests are as reliable as possible, meaning the runs give results similar to each other,
so you might post all the results, not only the average, so people can give their own interpretation of the data.
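As a rough sketch of what I mean by posting everything: something like the following prints each run next to the mean
and the standard deviation, so the spread between runs is visible. The tps numbers in it are made-up placeholders, not
your measurements, and summarize_runs is just a name I picked for the example.

```python
import statistics

def summarize_runs(tps_runs):
    """Return (mean, sample standard deviation) for a list of per-run tps values."""
    if len(tps_runs) < 2:
        raise ValueError("need at least 2 runs to estimate the spread")
    return statistics.mean(tps_runs), statistics.stdev(tps_runs)

# Placeholder values for illustration only; replace with the real per-run results.
example_runs = [700.0, 750.0, 800.0]
mean_tps, stdev_tps = summarize_runs(example_runs)
for i, tps in enumerate(example_runs, start=1):
    print(f"run {i}: {tps:.0f} tps")
print(f"mean = {mean_tps:.0f} tps, stdev = {stdev_tps:.0f} tps")
```

If the standard deviation within one setting is comparable to the difference between settings, the difference is
probably just noise.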
Back to your question: your tests run for 10 minutes and checkpoints happen every 5, so we should expect to see 2
checkpoints per test, which might influence your results. How long a checkpoint is spread over time is given by
checkpoint_completion_target, as a fraction of the interval between checkpoints.
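To put rough numbers on it, here is a small sketch of the arithmetic. It assumes checkpoints are triggered only by
checkpoint_timeout (not by WAL volume) and that the writes are spread over checkpoint_completion_target times the
timeout. The function name is made up for the example.

```python
CHECKPOINT_TIMEOUT_S = 5 * 60   # checkpoint_timeout = 5 min, from the quoted config
TEST_DURATION_S = 600           # pgbench -T 600

def checkpoint_spread_seconds(completion_target, timeout_s=CHECKPOINT_TIMEOUT_S):
    """Approximate time over which one checkpoint's writes are spread (assumption: timeout-driven)."""
    return completion_target * timeout_s

# Number of timed checkpoints expected to start during one test run.
print(f"expected checkpoints per test: {TEST_DURATION_S // CHECKPOINT_TIMEOUT_S}")

for target in (0.1, 0.5, 0.9):
    spread = checkpoint_spread_seconds(target)
    print(f"checkpoint_completion_target = {target}: writes spread over ~{spread:.0f} s")
```

So with 0.9 a checkpoint can be writing for about 270 of every 300 seconds, while with 0.1 it is done after about 30.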
Assuming that the 'background activity' writes data, a checkpoint_completion_target of 0.9 means that when your
test starts, the system might still be busy writing data from the previous checkpoint (which started before your
pgbench test was launched). That is less likely to happen with a value of 0.1.
Looking at the graphs (CPU, disk) of your server might point to something.
Also, the postgres logs should be able to tell you more (with log_checkpoints enabled), e.g. when a checkpoint starts,
when it finishes, and how much it wrote.
I hope I gave you enough input to better understand what is going on.
regards,
fabio pardi