Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays - Mailing list pgsql-hackers
| From | David Rowley |
| --- | --- |
| Subject | Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays |
| Date | |
| Msg-id | CAApHDvr_b3VGY3LEcdjJutCjFS3S7hOCTCE1ddg8x3Qf12f19g@mail.gmail.com |
| In response to | Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| Responses | Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays |
| List | pgsql-hackers |
On Fri, 9 Apr 2021 at 09:32, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> I ran the same set of benchmarks on the v6, which I think should be
> mostly identical to what was committed. I extended the test a bit to
> use tables with 0, 1, 5 and 1000 rows, and also with int and text
> values, to see how it works with more expensive comparators.
>
> I built binaries with gcc 9.2.0 and clang 11.0.0; the full results are
> attached. There's a bit of difference between gcc and clang, but the
> general behavior is about the same, so I'll only present the gcc
> results to keep this simple. I'll also only show the throughput
> comparison to master, so >1.0 means good, <1.0 means bad. If you're
> interested in actual tps, see the full results.
>
> For the v5 patch (actually v4-0001 + v5-0002) and v6, the results are:
>
> integer column / v5
> ===================
>
> rows    10/p   100/p    16/p    10/s   100/s    16/s
> -----------------------------------------------------------
>    0     97%     97%     97%    107%    126%    108%
>    1     95%     82%     94%    108%    132%    110%
>    5     95%     83%     95%    108%    132%    110%
> 1000    129%    481%    171%    131%    382%    165%

I think we should likely ignore the v5 patch now. The reason is that it
was pretty much unfinished, and there were many places where I'd not yet
added support for the HashedScalarArrayOpExpr node type. That could
cause those nodes to be skipped during node mutations or node walking,
which would certainly make planning faster, just not in a way that's
correct.

> integer column / v6
> ===================
>
> rows    10/p   100/p    16/p    10/s   100/s    16/s
> -----------------------------------------------------------
>    0     97%     97%     97%     98%     98%     98%
>    1     96%     84%     95%     97%     97%     98%
>    5     97%     85%     96%     98%     97%     97%
> 1000    129%    489%    172%    128%    330%    162%

This is a really informative set of results. I can only guess that the
slowdown of the 100/prepared query is down to building the hash table.
I think that's because the 0-row test does not show the slowdown, and
we only build the table when evaluating for the first time. There's a
slightly larger hit on 1 row than on 5 rows, which makes sense since
the rewards of the hash lookup start paying off more with more rows.

Looking at your tps numbers, I think I can see why we get the drop in
performance with prepared statements but not with simple statements.
It seems to just be down to the fact that planning time dominates in
the simple-statement case. For example, the "1 row" test at 100/s for
v6 is 10023.3 tps, whereas the 100/p result is 44093.8 tps. With
master, prepared gets 52400.0 tps. So we could say the hash table build
costs us 8306.2 tps, or 3.59 microseconds per execution, per:

postgres=# select 1000000 / 52400.0 - 1000000 / 44093.8;
      ?column?
---------------------
 -3.5949559161508538
(1 row)

Now look at the tps for the simple-query version of the same test:
master did 10309.6 tps and v6 did 10023.3 tps. If we apply that 3.59
microsecond slowdown to master's tps, then we get to within about 1% of
the v6 tps:

postgres=# select 1000000 / (1000000 / 10309.6 + 3.59);
       ?column?
-----------------------
 9941.6451581291294165
(1 row)

> text column / v6
> ================
>
> rows    10/p   100/p    16/p    10/s   100/s    16/s
> -----------------------------------------------------------
>    0    101%    101%    101%     98%     99%     99%
>    1     98%     82%     96%     98%     96%     97%
>    5    100%     84%     98%     98%     96%     98%
> 1000    297%   1645%    408%    255%   1000%    336%
>
> Overall, the behavior for integer and text columns is the same, for
> both patches.
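[To see the build-cost effect David describes in isolation, here is a
toy, self-contained micro-benchmark. It is not PostgreSQL code: the hash
scheme, table sizes, and repetition counts are all invented, and the
clock() timings are coarse and only illustrative. It merely mimics the
shape of the hashed ScalarArrayOpExpr path: build a table of the IN-list
constants once per execution, then probe it once per row, versus a plain
linear search of the array.]

/*
 * Toy model (not PostgreSQL source) of hashed vs linear IN-list lookup.
 * The hash table is rebuilt for every simulated execution, so with very
 * few rows the build cost dominates; with many rows the O(1) probes win.
 */
#include <stdio.h>
#include <time.h>

#define NCONSTS 100             /* elements in the IN (...) list */
#define TABSIZE 256             /* power of two, > 2 * NCONSTS */
#define REPS    10000           /* simulated query executions */

static int consts[NCONSTS];
static int htab[TABSIZE];       /* -1 marks an empty slot */

static void build_hash(void)
{
    for (int i = 0; i < TABSIZE; i++)
        htab[i] = -1;
    for (int i = 0; i < NCONSTS; i++)
    {
        unsigned h = (unsigned) consts[i] * 2654435761u % TABSIZE;

        while (htab[h] != -1)   /* linear probing */
            h = (h + 1) % TABSIZE;
        htab[h] = consts[i];
    }
}

static int hash_lookup(int key)
{
    unsigned h = (unsigned) key * 2654435761u % TABSIZE;

    while (htab[h] != -1)
    {
        if (htab[h] == key)
            return 1;
        h = (h + 1) % TABSIZE;
    }
    return 0;
}

static int linear_lookup(int key)
{
    for (int i = 0; i < NCONSTS; i++)
        if (consts[i] == key)
            return 1;
    return 0;
}

int main(void)
{
    int sink = 0;

    for (int i = 0; i < NCONSTS; i++)
        consts[i] = i * 7;      /* arbitrary constant values */

    for (int nrows = 1; nrows <= 1000; nrows *= 10)
    {
        clock_t t0 = clock();

        for (int rep = 0; rep < REPS; rep++)
        {
            build_hash();       /* paid once per execution */
            for (int r = 0; r < nrows; r++)
                sink += hash_lookup(r);
        }

        clock_t t1 = clock();

        for (int rep = 0; rep < REPS; rep++)
            for (int r = 0; r < nrows; r++)
                sink += linear_lookup(r);

        clock_t t2 = clock();

        printf("nrows=%4d  hashed=%6ld ticks  linear=%6ld ticks\n",
               nrows, (long) (t1 - t0), (long) (t2 - t1));
    }
    return sink == 0;           /* keep the loops from being optimized away */
}

[The exact crossover depends on the machine and the comparator cost, but
the shape matches the tables above: at 1-5 rows the hashed side pays the
build cost for almost nothing, while at 1000 rows it pulls far ahead,
and a more expensive comparator (the text column) widens the gap.]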
> There are a couple of interesting observations:
>
> 1) For the "simple" query mode, v5 helped quite a bit (20-30% speedup),
> but v6 does not seem to help at all - it's either the same as or slower
> than unpatched master.

I think that's related to the fact that I didn't finish adding
HashedScalarArrayOpExpr processing to all the places that needed it.

> I wonder why that is, and whether we could get some of that speedup
> with v6? At first I thought that maybe v5 is not building the hash
> table in cases where v6 does, but that shouldn't be faster than master.

I don't think v5 and v6 really do anything much differently in the
executor. The only real difference is during ExecInitExprRec(), when we
initialize the expression. With v5 we had a "case
T_HashedScalarArrayOpExpr:" to handle the new node type, but in v6 we
have "if (OidIsValid(opexpr->hashfuncid))".

Oh, wait. I did add the missing permissions check on the hash function,
so that will account for something. As far as I can see, that's
required.

> 2) For the "prepared" mode, there's a clear performance hit the longer
> the array is (for both v5 and v6). For 100 elements it's about 15%,
> which is not great.
>
> I think the reason is fairly simple - building the hash table is not
> free, and with few rows it's not worth it - it'd be faster to just
> search the array directly. Unfortunately, the logic that makes the
> decision to switch to hashing looks only at the array length, and
> ignores the number of rows entirely. So I think if we want to address
> this, convert_saop_to_hashed_saop needs to compare
>
>     hash_build_cost + nrows * hash_lookup_cost
>
> and
>
>     nrows * linear_lookup_cost
>
> to make a reasonable decision.

I thought about that, but I was really worried that the performance of
ScalarArrayOpExpr would just become too annoyingly unpredictable. You
know fairly well that we can often get massive row underestimations in
the planner. (I guess you worked on extended stats mainly because of
that.) The problem I want to avoid is the one where we get a big row
underestimation but don't really get a bad plan as a result. For
example, a query like:

SELECT *
FROM big_table
WHERE col1 = ...
  AND col2 = ...
  AND col3 = ...
  AND col4 IN ( ... big list of values ... );

If col1, col2 and col3 are highly correlated but individually fairly
selective, then we could massively underestimate how many rows the IN
clause will see (assuming no rearranging was done here).

I'm not completely opposed to the idea of taking the estimated rows
into account during planning. It might just mean having to move the
convert_saop_to_hashed_saop() call somewhere else. I imagine that's
fairly trivial to do. I just have concerns about doing so.

> I was thinking that maybe we can ignore this, because people probably
> have much larger tables in practice. But I'm not sure that's really
> true, because there may be other quals, and it's possible the preceding
> ones are quite selective, filtering out most of the rows.
>
> I'm not sure how much of the necessary information we have available
> in convert_saop_to_hashed_saop_walker, though :-( I suppose we know
> the number of input rows for that plan node, though I'm not sure about
> the selectivity of the other quals.
>
> It's also a bit strange that we get a speedup with the "simple"
> protocol, while "prepared" gets slower. That seems counter-intuitive,
> because why should we see opposite outcomes in those cases? I'd assume
> we'd see either a speedup or a slowdown in both cases, with the
> relative change being more significant in the "prepared" mode.
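[To make the suggested hash_build_cost + nrows * hash_lookup_cost versus
nrows * linear_lookup_cost comparison concrete, here is a toy sketch of
such a costing check. This is not what convert_saop_to_hashed_saop()
actually does - as noted above, the committed code looks only at the
array length - and all the cost constants below are invented.]

/*
 * Toy sketch of the suggested costing rule; the constants are invented
 * and this is not PostgreSQL source.
 */
#include <stdbool.h>
#include <stdio.h>

typedef double Cost;

/* invented per-operation costs, in arbitrary units */
static const Cost hash_build_cost_per_elem = 2.0;
static const Cost hash_lookup_cost = 1.0;
static const Cost linear_cost_per_elem = 0.5;

static bool
saop_hash_is_cheaper(int array_len, double nrows)
{
    Cost hashed = hash_build_cost_per_elem * array_len +
                  nrows * hash_lookup_cost;
    /* a linear search scans the whole array for non-matching rows */
    Cost linear = nrows * linear_cost_per_elem * array_len;

    return hashed < linear;
}

int
main(void)
{
    for (double nrows = 1; nrows <= 1000; nrows *= 10)
        printf("len=100 nrows=%4.0f -> %s\n", nrows,
               saop_hash_is_cheaper(100, nrows) ? "hash" : "linear");
    return 0;
}

[With these made-up numbers the 100-element array only starts paying off
somewhere between 1 and 10 rows, which is exactly the regime the 1-row
and 5-row benchmark results sit in. The hard part, as David says, is
that nrows here would be a planner estimate that can be badly wrong.]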
On the simple-versus-prepared question: I hope my theory above about
the planning time dominating the overall time shows why that is.

Another way to look at these results is to take your tps values and
calculate how long it takes to do some number of transactions, then
total up the time. If I do that to calculate how long it took each test
to perform 1000 transactions, and sum each test grouping by mode and
version with rollup on the mode, I get the following (time values are
in seconds):

pg-master     28.07
  prepared    11.28
  simple      16.79
pg-v6         15.86
  prepared     5.23
  simple      10.63

So, comparing the total times, the overall result is 177%. (A small
worked check of this arithmetic follows below.)

David
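[For reference, the arithmetic in this message can be checked with a few
lines of C, using only the numbers quoted above; nothing here comes from
the full attached results.]

/* Worked check of the tps arithmetic quoted in this message. */
#include <stdio.h>

int
main(void)
{
    /* "1 row" 100/p test: master 52400.0 tps, v6 44093.8 tps */
    printf("hash build cost: %.4f us/execution\n",
           1e6 / 44093.8 - 1e6 / 52400.0);          /* ~3.5950 */

    /* apply that slowdown to master's simple-query 10309.6 tps */
    printf("predicted v6 simple tps: %.1f\n",
           1.0 / (1.0 / 10309.6 + 3.59e-6));        /* ~9941.6 */

    /* time for 1000 transactions at a given tps is 1000 / tps */
    printf("master 100/p, 1000 xacts: %.4f s\n", 1000 / 52400.0);

    /* rollup totals quoted above: 28.07 s vs 15.86 s */
    printf("overall time ratio: %.0f%%\n", 100.0 * 28.07 / 15.86);
    return 0;
}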