Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers
From:           Peter Geoghegan
Subject:        Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Msg-id:         CAH2-WzkSZEOy_RaqSiBKnVH=8rv-6dhFWfbHaRPfOxjyMAcZXA@mail.gmail.com
In response to: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses:      Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
List:           pgsql-hackers
On Tue, Nov 28, 2023 at 4:29 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> I haven't looked at the code, but I decided to do a bit of blackbox perf
> and stress testing, to get some feeling of what to expect in terms of
> performance improvements, and see if there happen to be some unexpected
> regressions. Attached is a couple simple bash scripts doing a
> brute-force test with tables of different size / data distribution,
> number of values in the SAOP expression, etc.

My own stress-testing has focussed on the two obvious extremes for this
patch, using variants of pgbench with SAOP SELECTs on pgbench_accounts:

1. The case where there is almost no chance of finding any two index
tuples together on the same page, because the constants are completely
random. This workload makes the patch's attempts at "coalescing
together" index page reads pure overhead, with no possible benefit.
Obviously that's a cost that needs to be kept under control.

2. The case where there are 255 tuples with distinct values that are
clustered together (both in the key space and in physical index pages).
Usually they'll span two index pages, but they might all fit together
on one index page, allowing us to descend to it directly and read it
only once.

With 32 clients, I typically see a regression of about 1.5% for the
first case relative to master, measured in throughput/TPS. The second
case typically sees throughput that's ~4.8x master (i.e. a ~380%
increase).

I consider both of these extremes to be fairly unrealistic. With fewer
array constants, the speed-ups you'll see in sympathetic cases are
still very significant, but nothing like 4.8x. They're more like the
30% numbers that you saw.

As you know, I'm not actually all that excited about cases like 2 --
it's not where users are likely to benefit from the patch. The truly
interesting cases are those where we can completely avoid heap accesses
in the first place (not just *repeat* accesses to the same index
pages), due to the patch's ability to consistently use index quals
rather than filter quals. It's not that hard to show cases where there
are 100x+ fewer pages accessed -- often with cases that have very few
array constants. It's just that these cases aren't that interesting
from a performance validation point of view -- it's obvious that filter
quals are terrible.

Another thing that the patch does particularly well on is cases where
the array keys don't have any matches at all, but there is significant
clustering (relatively common when multiple SAOPs are used as index
quals, which becomes far more likely due to the planner changes). We
don't just skip over parts of the index that aren't relevant -- we also
skip over parts of the arrays that aren't relevant. Some of my
adversarial test cases that take ~1 millisecond for the patch to
execute will practically take forever on master (I had to have my test
suite not run those tests against master). You just need lots of array
keys. What master does on those adversarial cases with billions of
distinct combinations of array keys (not that unlikely if there are 4
or 5 SAOPs with mere hundreds or thousands of array keys each) is so
inefficient that we might as well call it infinitely slower than the
patch. (A rough sketch of the kind of query I mean is below.)

This is interesting to me from a performance robustness/stability point
of view. The slowest kind of SAOP index scan with the patch becomes a
full index scan -- just as it would be if we were using any type of
non-SAOP qual before now. The worst case is a lot easier to reason
about.
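To make that adversarial shape concrete, here is roughly the sort of
thing I have in mind. The table, index, and constants are made up
purely for illustration (they aren't taken from my actual test suite);
the point is just several SAOPs on a multi-column index, each with a
large number of clustered constants, and no row satisfying all of the
quals together:

    -- Hypothetical schema, just to show the multi-SAOP shape.
    -- Every row has a = b = c = d.
    CREATE TABLE saop_adversarial (a int, b int, c int, d int);
    INSERT INTO saop_adversarial
    SELECT i, i, i, i FROM generate_series(1, 1000000) i;
    CREATE INDEX ON saop_adversarial (a, b, c, d);
    VACUUM ANALYZE saop_adversarial;

    -- Four SAOPs with 1,000 constants each: 1000^4 distinct
    -- combinations of array keys.  The even/odd split means no row can
    -- pass all four quals, even though each individual array matches
    -- plenty of clustered rows.
    SELECT count(*)
    FROM saop_adversarial
    WHERE a = ANY (ARRAY(SELECT g * 2 FROM generate_series(1, 1000) g))
      AND b = ANY (ARRAY(SELECT g * 2 + 1 FROM generate_series(1, 1000) g))
      AND c = ANY (ARRAY(SELECT g * 2 FROM generate_series(1, 1000) g))
      AND d = ANY (ARRAY(SELECT g * 2 + 1 FROM generate_series(1, 1000) g));

The product of the array lengths is what blows up on master; with the
patch, skipping over irrelevant parts of both the index and the arrays
keeps it cheap.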
> I'm not convinced this is a problem we have to solve. It's possible it
> only affects cases that are implausible in practice (the script forces a
> particular scan type, and maybe it would not be picked in practice). But
> maybe it's fixable ...

I would expect the patch to do quite well (relative to what is actually
possible) on cases like the two extremes that I've focussed on so far.
It seems possible that it will do less well on cases that are somewhere
in the middle (that also have lots of distinct values on each page).

We effectively do a linear search on a page that we know has at least
one more match (following a precheck that uses the high key). We hope
that the next match (for the next array value) closely follows an
initial match. But what if there are only 2 or 3 matches on each leaf
page, spaced relatively far apart? You're going to have to grovel
through the whole page.

It's not obvious that that's a problem to be fixed -- we're still only
descending the index once and still only locking the leaf page once, so
we'll probably still win relative to master. And it's not that easy to
imagine beating a linear search -- it's not like there is just one
"next value" to search for in these cases. But it's something that
deserves further consideration. A rough sketch of the kind of
middle-ground distribution I mean is below.
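Something along these lines (made up for illustration, not taken from
your scripts; the exact tuples-per-page math will vary with page size
and fillfactor):

    -- Middle-ground case: a leaf page in an int4 index holds a few
    -- hundred unique index tuples, so constants spaced 150 apart
    -- should leave only about 2 or 3 matches on each leaf page the
    -- scan visits, spread far apart within the page.
    CREATE TABLE sparse_matches (a int);
    INSERT INTO sparse_matches SELECT i FROM generate_series(1, 1000000) i;
    CREATE INDEX ON sparse_matches (a);
    VACUUM ANALYZE sparse_matches;

    SELECT count(*)
    FROM sparse_matches
    WHERE a = ANY (ARRAY(SELECT g * 150 FROM generate_series(1, 5000) g));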
--
Peter Geoghegan