On 08/25/2018 12:11 AM, Jerry Jelinek wrote:
> Alvaro,
>
> I have previously posted ZFS numbers for SmartOS and FreeBSD to this
> thread, although not with the exact same benchmark runs that Tomas did.
>
> I think the main purpose of running the benchmarks is to demonstrate
> that there is no significant performance regression with wal recycling
> disabled on a COW filesystem such as ZFS (which might just be intuitive
> for a COW filesystem). I've tried to be sure it is clear in the doc
> change with this patch that this tunable is only applicable to COW
> filesystems. I do not think the benchmarks will be able to recreate the
> problematic performance state that was originally described in Dave's
> email thread here:
>
> https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com#CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w@mail.gmail.com
I agree, the benchmarks are valuable both to show improvement and to show the lack of regression. I do have some numbers from LVM/ext4 (with a snapshot recreated every minute to trigger COW-like behavior, and without snapshots), and from ZFS on Linux (zfsonlinux 0.7.9 on kernel 4.17.17).
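For context, the runs were of this general shape (a sketch only: the boolean GUC name "wal_recycle" is how I understand the patch names the tunable, and the database name, scale, client count and duration below are illustrative, not the exact parameters I used):

    # toggle the patch's tunable (name assumed to be wal_recycle) and restart
    psql -c "ALTER SYSTEM SET wal_recycle = off;"
    pg_ctl -D "$PGDATA" restart

    # initialize and run pgbench; scale/clients/duration are placeholders
    createdb bench
    pgbench -i -s 2000 bench
    pgbench -c 16 -j 8 -M prepared -T 3600 bench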
Attached are PDFs with summary charts; more detailed results are available at
lvm/ext4 (no snapshots)
-----------------------

This pretty much behaves like plain ext4, at least for scales 200 and 2000. I don't have results for scale 8000, because the test ran out of disk space (I've used part of the device for snapshots, and that was enough to trigger the disk space issue).
lvm/ext4 (snapshots)
--------------------

On the smallest scale (200), there's no visible difference. On scale 2000, disabling WAL reuse gives about a 10% improvement (21468 vs. 23517 tps), although it's not obvious from the chart. On the largest scale (6000, chosen to avoid the disk space issues) the improvement is again about 10%, and it's much clearer.
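For the snapshot runs, the snapshot was dropped and recreated in a simple loop, roughly like this (the VG/LV names and the snapshot size are made up for the example):

    # keep ext4 behaving in a COW-like way by recreating the snapshot
    # every minute (volume names and size are placeholders)
    while true; do
        lvremove -f /dev/vg_data/pg_snap 2>/dev/null
        lvcreate -s -n pg_snap -L 20G /dev/vg_data/pg_data
        sleep 60
    done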
zfs (Linux)
-----------

On scale 200, there's pretty much no difference. On scale 2000, the throughput actually decreased a bit, by about 5%; from the chart it seems disabling the WAL reuse somewhat amplifies the impact of checkpoints, for some reason.
I have no idea what happened at the largest scale (8000). On master there's a huge drop after ~120 minutes, which somewhat recovers at ~220 minutes (but not fully). Without WAL reuse there's no such drop, although there seems to be some degradation after ~220 minutes (i.e. at about the same time master partially recovers). I'm not sure what to think about this; I wonder if it might be caused by almost filling the disk space, or something like that. I'm rerunning this with scale 600.
I'm also not sure how much we can extrapolate this to other ZFS configs (I mean, this is ZFS on a single SSD device, while I'd generally expect ZFS on multiple devices, etc.).
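For completeness, the pool here is just a single-device one, created along these lines (the device name and dataset properties are placeholders, not a record of the exact settings):

    # single-SSD pool with one dataset for the PostgreSQL data directory
    zpool create -o ashift=12 tank /dev/nvme0n1
    zfs create -o recordsize=8k tank/pgdata
    # PGDATA then points at /tank/pgdata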