Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing
Msg-id: xiwwevyjw336dgpqpqwvcuzh7riurucjvdaywsysbmfq5qfmzv@w2jfhutsvs66
In response to: Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses: Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing
List: pgsql-hackers
Hi,

On 2025-02-24 10:54:54 +0000, Bertrand Drouvot wrote:
> a051e71e28a added the "timing tracking" in the WAL code path with "only"
> track_io_timing enabled (and track_wal_io_timing still false). Here, in this
> thread, we remove unnecessary INSTR_TIME_SET_CURRENT()/INSTR_TIME_ACCUM_DIFF()
> calls when both track_io_timing and track_wal_io_timing were enabled.
>
> So I think that your main concern comes from the fact that a051e71e28a added
> the "timing tracking" in the WAL code path with "only" track_io_timing enabled.

Correct.

> That's a concern we also had and discussed in [1]. The question was: should
> we track the WAL timing stats in pg_stat_io only when track_wal_io_timing is
> enabled, or would relying only on track_io_timing be enough?
>
> We ran several benchmarks ([2], [3] and [4]) and based on the results we
> reached the conclusion that relying only on track_io_timing to display the
> WAL timing stats in pg_stat_io was "good" enough.

Sorry to be blunt, but those benchmarks miss the mark. The benchmark

- emits WAL in a transactional fashion, with fairly small transactions, where
  the disk is fairly slow. It's absolutely guaranteed that the bottleneck is
  just the WAL flush disk time. You're doing ~1k-2k TPS - the timing overhead
  would have to be *ginormous* to show up.

- emits WAL at low concurrency, so lock contention isn't really an issue.

- is likely executed on a system with very cheap timing functions - but
  production workloads unfortunately often run without them. While it's not
  quite as common as it used to be, virtualized systems often have
  considerably slower clocks.

On this system it'll likely be fine overhead-wise:

$ echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
$ src/bin/pg_test_timing/pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 19.17 ns
Histogram of timing durations:
  < us   % of total      count
     1     98.08521  153484414
     2      1.91421    2995376
     4      0.00040        619
     8      0.00014        225
    16      0.00002         35
    32      0.00001          9
    64      0.00001         10
   128      0.00000          3
   256      0.00000          2
   512      0.00000          1

On this system, however, I'd not want to bet on it:

$ echo hpet | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
$ src/bin/pg_test_timing/pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 1228.91 ns
Histogram of timing durations:
  < us   % of total      count
     1      0.15574       3802
     2     78.51065    1916598
     4     21.26778     519188
     8      0.02069        505
    16      0.03957        966
    32      0.00356         87
    64      0.00193         47
   128      0.00008          2

Greetings,

Andres Freund