Re: checkpointer continuous flushing - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: checkpointer continuous flushing |
Date | |
Msg-id | alpine.DEB.2.10.1508230812470.29146@sto |
In response to | Re: checkpointer continuous flushing (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: checkpointer continuous flushing |
List | pgsql-hackers |
Hello Amit,

> I have tried your scripts and found some problem while using avg.py
> script.
> grep 'progress:' test_medium4_FW_off.out | cut -d' ' -f4 | ./avg.py
> --limit=10 --length=300
> : No such file or directory
> I didn't get chance to poke into avg.py script (the command without
> avg.py works fine). Python version on the m/c, I planned to test is
> Python 2.7.5.

Strange... What does "/usr/bin/env python" say? Can the script be started on its own at all?

I think that the script should work both with python2 and python3, at least it does on my laptop...

> Today while reading the first patch (checkpoint-continuous-flush-10-a),
> I have given some thought to below part of patch which I would like
> to share with you.
>
> + * Select a tablespace depending on the current overall progress.
> + *
> + * The progress ratio of each unfinished tablespace is compared to
> + * the overall progress ratio to find one with is not in advance
> + * (i.e. overall ratio > tablespace ratio,
> + * i.e. tablespace written/to_write > overall written/to_write
> Here, I think above calculation can go for toss if backend or bgwriter
> starts writing buffers when checkpoint is in progress. The tablespace
> written parameter won't be able to consider the one's written by backends
> or bgwriter.

Sure... This is *already* the case with the current checkpointer: the schedule is computed with respect to the initial number of buffers it thinks it will have to write, and if someone else writes these buffers then the schedule is skewed a little bit, or more... I have not changed this logic, but I extended it to handle several tablespaces.

If this (the checkpointer progress evaluation used for its schedule is sometimes wrong because of other writes) is proven to be a major performance issue, then the processes which write the checkpointed buffers behind its back should tell the checkpointer about it, probably with some shared data structure, so that the checkpointer can adapt its schedule.
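
To make the selection rule in the quoted comment concrete, here is a minimal Python sketch. It is not the patch's C code: the `Tablespace` class, `pick_tablespace` and `checkpoint_order` are hypothetical names, and the sketch assumes that the only writes counted are the checkpointer's own, which is exactly the limitation discussed above.

```python
from dataclasses import dataclass


@dataclass
class Tablespace:
    name: str
    to_write: int      # buffers planned for this tablespace at checkpoint start
    written: int = 0   # buffers written so far by the checkpointer only;
                       # writes by backends or the bgwriter are NOT reflected


def pick_tablespace(spaces):
    """Return an unfinished tablespace that is not ahead of overall progress."""
    total_to_write = sum(s.to_write for s in spaces)
    total_written = sum(s.written for s in spaces)
    unfinished = [s for s in spaces if s.written < s.to_write]
    if not unfinished:
        return None
    for s in unfinished:
        # tablespace ratio <= overall ratio, cross-multiplied to avoid floats:
        # s.written / s.to_write <= total_written / total_to_write
        if s.written * total_to_write <= total_written * s.to_write:
            return s
    # Defensive fallback: the overall ratio is a weighted average, so some
    # unfinished tablespace is always at or below it.
    return unfinished[0]


def checkpoint_order(spaces):
    """Simulate a checkpoint: write one buffer at a time, return the order."""
    order = []
    while (s := pick_tablespace(spaces)) is not None:
        s.written += 1
        order.append(s.name)
    return order


if __name__ == "__main__":
    print(checkpoint_order([Tablespace("a", 4), Tablespace("b", 2)]))
```

With 4 buffers planned on `a` and 2 on `b`, the writes are interleaved roughly in proportion to each tablespace's share rather than finishing one tablespace before starting the next. If another process wrote some of those buffers, `written` would lag behind reality and the interleaving would be skewed accordingly.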
This is an independent issue, that may be worth addressing some day. My opinion is that when the bgwriter or backends kick in to write buffers, they are basically generating random I/Os on HDD and killing tps and latency, so it is a very bad time anyway, thus I'm not sure that this is the next problem to address to improve pg performance and responsiveness.

> Now it may not big thing to worry but I find Heikki's version worth
> considering, he has not changed the overall idea of this patch, but the
> calculations are somewhat simpler and hence less chance of going wrong.

I do not think that Heikki's version worked wrt balancing writes over tablespaces, and I'm not sure it worked at all. However I reused some of his ideas to simplify and improve the code.

-- 
Fabien.