Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | bc1d4ebc-3778-4836-ace5-e1daa7223d36@vondra.me Whole thread Raw |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: BitmapHeapScan streaming read user and prelim refactoring
Re: BitmapHeapScan streaming read user and prelim refactoring |
List | pgsql-hackers |
On 1/30/25 21:36, Melanie Plageman wrote: > > ... > > Attached v28 is rebased and has a few updates/cleanup. All patches in > the set need review and I need to do some benchmarking of v28-0003. > > - Melanie Hi, I've been re-running the benchmarks on v28 on Melanie's request. The tests are still running, but here's what I have so far. The results are pretty large, I've pushed them here: https://github.com/tvondra/bitmapscan-tests I've decommissioned the small "i5" machine and replaced it with a new ryzen 9900x one. I kept the "old" storage (SSD SATA RAID), but other than that it's much more capable. That might explain some of the differences in benchmark results. So now I have these two machines: xeon: 2x E5-2699v4 - nvme: WD Ultrastar DC SN640 960GB ryzen: Ryzen 9 9900X - data: Samsung SSD 990 PRO 1TB (NVME) - raid-nvme: 4x Samsung SSD 990 PRO 1TB (RAID0, NVME) - raid-sata: 4x Intel DC S3700 (RAID0) See the PDFs for the usual "colored pivot table" visualization, or the raw CSV files. The pivot table shows results for various combinations of parameters, the columns are for different patches and values of the effective_io_concurrency GUC. The columns on the right compare results to the previous patch, i.e. it shows how the query duration changes after applying the patch. I didn't have time to do a particularly detailed analysis yet, or try to investigate the regressions. But some basic observations. 1) xeon ------- This machine only has a single type of storage (NVMe SSD), and the results looks really good. Most of the chart is "green" i.e. the patch makes queries faster, and the regressions are random, for the very short queries, and it seems random / no pattern. Which means this is likely random noise, nothing to worry about. There's one minor issue that I failed to test the 0004 patch due to a bug in the script, so the PDF only compares master, 0001, 0002 and 0003. But that should not matter too much, because 0003 is the main change. There are "gaps" for tests on the 100M data set that are still running, but I think I'll abort those, it'd take many days to complete and it doesn't seem we'd learn very much. 2) ryzen -------- This "new" machine has multiple types of storage. The cached results (be it in shared buffers or in page cache) are not very interesting. 0003 helps a bit (~15%), but other than that it's just random noise. The "uncached" results starting on page 23 are much more interesting. In general 0001, 0002 and 0004 have little impact, it seems just random noise. So in the rest I'll focus on 0003. For the single nvme device (device: data), it seems mostly fine. It's green, even though there are a couple "localized regressions" for eic=0. I haven't looked into those yet. For the nvme RAID (device: raid-nvme), it's looks almost exactly the same, except that with parallel query (page 27) there's a clear area of regression with eic=1 (look for "column" of red cells). That's a bit unfortunate, because eic=1 is the default value. For the SATA SSD RAID (device: raid-sata), it's similar - a couple regressions for eic=0, just like for NVMe. But then there's also a couple regressions for higher eic values, usually for queries with selective conditions. Overall, I think this looks good. It might be a good idea to look into some of the regressions, but ultimately the read stream relies on a heuristic to determine how far ahead to read etc. And each heuristics has counter examples. So the presence of regressions is guaranteed, it should not be a reason to reject a patch on its own. There's one more thing to consider when looking at those results - the script *forces* the use of bitmap index scan, even if the planner would not pick it on it's own. The ryzen-filtered.pdf shows only results when that would be the case (i.e. the optimizer would pick bitmapscan). Most of the regressions are gone, simply because it'd do index scan, seqscan. Of course, this is not perfect - planning mistakes happen, and it would be nice to not regress more than before. The plan choice depends also on the cost parameters, and maybe with different values we'd pick bitmap scans more often. Anyway, the results look sensible. It might be good to investigate some of the regressions, and I'll try doing that if I find the time. But I don't think that's necessarily a blocker - every patch of this type will have a hardware where the heuristics doesn't quite do the right thing by default. Which is why we have GUCs to tune it if appropriate. regards -- Tomas Vondra
pgsql-hackers by date: