Re: checkpointer continuous flushing - Mailing list pgsql-hackers
From: Fabien COELHO
Subject: Re: checkpointer continuous flushing
Msg-id: alpine.DEB.2.10.1509081531300.25033@sto
In response to: Re: checkpointer continuous flushing (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: checkpointer continuous flushing
List: pgsql-hackers
Hello Amit,

> I have done some tests with both the patches (sort+flush) and below
> are results:

Thanks a lot for these runs on this great hardware!

> Test - 1 (Data fits in shared_buffers)

Rounded for easier comparison:

 flush/sort
  off off: 27480.4 ± 12791.1 [   0, 16009, 32109, 37629, 51671] (2.8%)
  off on : 27482.5 ± 12552.0 [   0, 16587, 31226, 37516, 51297] (2.8%)

The two cases above are pretty indistinguishable: sorting has no impact. The 2.8% means more than one minute offline per hour (not necessarily a whole minute, it may be distributed over the whole hour).

  on  off: 25214.8 ± 11059.7 [5268, 14188, 26472, 35626, 51479] (0.0%)
  on  on : 26819.6 ± 10589.7 [5192, 16825, 29430, 35708, 51475] (0.0%)

> For this test run, the best results are when both the sort and flush
> options are enabled, the value of lowest TPS is increased substantially
> without sacrificing much on average or median TPS values (though there
> is ~9% dip in median TPS value). When only sorting is enabled, there is
> neither significant gain nor any loss. When only flush is enabled,
> there is significant degradation in both average and median value of TPS,
> ~8% and ~21% respectively.

I interpret the five numbers in brackets as an indicator of performance stability: they would all be equal under perfect stability. Once they show some stability, the next point for me is to focus on the average performance. I do not see a median decrease as a big issue if the average is reasonably good. Thus I essentially note the -2.5% dip in average of on-on vs off-on. It is probably significant, although it might be within the error margin of the measure. I am not sure whether the small stddev reduction is really significant. Anyway the benefit is clear: 100% availability. Flushing without sorting is a bad idea (tm), not a surprise.
> Test - 2 (Data doesn't fit in shared_buffers, but fits in RAM)

 flush/sort
  off off: 5050.1 ± 4884.5 [   0,   98, 4699, 10126, 13631] ( 7.7%)
  off on : 6194.2 ± 4913.5 [   0,   98, 8982, 10558, 14035] (11.0%)
  on  off: 2771.3 ± 1861.0 [ 288, 2039, 2375,  2679, 12862] ( 0.0%)
  on  on : 6110.6 ± 1939.3 [1652, 5215, 5724,  6196, 13828] ( 0.0%)

I'm not sure that the off-on vs on-on -1.3% average tps dip is significant, but it may be. With both flushing and sorting pg becomes fully available, and the standard deviation is divided by more than 2, so the benefit is clear.

> For this test run, again the best results are when both the sort and flush
> options are enabled: the value of lowest TPS is increased substantially,
> and the average and median values of TPS have also increased, by
> ~21% and ~22% respectively. When only sorting is enabled, there is a
> significant gain in average and median TPS values, but then there is also
> an increase in the number of times when TPS is below 10, which is bad.
> When only flush is enabled, there is significant degradation in both average
> and median value of TPS, by ~82% and ~97% respectively; now I am not
> sure if such a big degradation could be expected for this case or it's just
> a problem in this run, I have not repeated this test.

Yes, I agree that it is strange that sorting without flushing on its own both improves performance (+20% tps) and seems to degrade availability at the same time. A rerun would have helped to check whether it is a fluke or it is reproducible.

> Test - 3 (Data doesn't fit in shared_buffers, but fits in RAM)
> --------------------------------------------------------------
> Same configuration and settings as above, but this time I have enforced
> flush to use posix_fadvise() rather than sync_file_range() (basically
> changed the code to comment out sync_file_range() and enable
> posix_fadvise()).
> On using posix_fadvise(), the results for the best case (both flush and
> sort on) show significant degradation in average and median TPS values,
> by ~48% and ~43%, which indicates that probably using posix_fadvise()
> with the current options might not be the best way to achieve flush.

Yes, indeed. The way posix_fadvise is implemented on Linux is somewhere between no effect and a bad effect (the buffer is evicted). You hit the latter quite strongly... As you are doing a "does not fit in shared_buffers" test, it is essential that buffers are kept in RAM, but posix_fadvise on Linux just instructs the kernel to evict the buffer from memory once it has been passed to the I/O subsystem, which, given the probably large I/O device cache on your host, should be done pretty quickly, so that later reads must fetch it back from the device (either cache or disk), which means a drop in performance.

Note that the FreeBSD implementation seems more convincing, although less good than the Linux sync_file_range function. I have no idea about other systems.

> Overall, I think this patch (sort+flush) brings a lot of value to the table
> in terms of stabilizing the TPS during checkpoints; however, some of the
> cases, like the use of posix_fadvise() and the case (all data fits in
> shared_buffers) where the value of median TPS regressed, could be
> investigated to see what can be done to improve them. I think more
> tests could be done to ensure the benefit or regression of this patch,
> but for now this is the best I can do.

Thanks a lot, again, for these tests! I think that we may conclude, from these runs:

(1) sorting seems not to harm performance, and may help a lot.

(2) Linux flushing with sync_file_range may degrade the raw tps average a little in some cases, but definitely improves performance stability (always 100% availability when on!).

(3) posix_fadvise on Linux is a bad idea... the good news is that it is not needed there:-) How good or bad an idea it is on other systems is an open question...
These results are consistent with the current default values in the patch: sorting is on by default, and flushing is on under Linux and off otherwise (posix_fadvise). Also, as the effect on other systems is unclear, I think it is best to keep both settings as GUCs for now.

--
Fabien.