Thread: Incorrect assumptions with low LIMITs
I have a query where the LIMIT clause is incorrectly reducing the cost of the query, so queries with LIMIT m run much longer when m is small (e.g. 1-3) than when m is larger. (PG 9.0) (And yes, ANALYZE was run and, no, increasing stats_target for the columns doesn't help.)

Notice that the total predicted cost is a small fraction of the cost of the IndexScan. It's a small cost because we assume that the LIMIT reduces the cost of the query in a linear manner, which just ain't so.

Limit  (cost=0.00..8320.49 rows=25 width=28) (actual time=0.024..239.044 rows=25 loops=1)
  ->  Index Scan Backward using blah_pkey on blah  (cost=0.00..4112652.70 rows=12357 width=28) (actual time=0.022..239.009 rows=25 loops=1)
        Index Cond: (blah_id <= 72572020)
        Filter: (user_id = ANY ('{....list....}'::integer[]))

What we are assuming is that if doing the whole scan would reveal N rows, then we only need to do an m/N fraction of the scan to return m rows. So if the LIMIT is very small, the cost of this plan looks really low. Low enough that it beats the plan you might expect to see, which is an IndexScan or a BitmapIndexScan on user_id.

This inappropriate assumption is causing bad plans for the two worst query types on this large database, so it's not just an idle annoyance. I've also seen this before in other cases at other release levels.

I'm not reporting this because I need help finding a solution, but because it's a clear bug. So it's not a topic for the performance list.

Other examples:

Limit 1-3
Total runtime: 3394.080 ms

example plan:

Limit  (cost=0.00..1719.89 rows=2 width=1757) (actual time=3394.000..3394.000 rows=0 loops=1)
  ->  Nested Loop  (cost=0.00..769649.42 rows=895 width=1757) (actual time=3393.998..3393.998 rows=0 loops=1)
        ->  Seq Scan on user  (cost=0.00..763051.97 rows=895 width=1607) (actual time=3393.996..3393.996 rows=0 loops=1)
              Filter: ((user_id <> ALL ('{...about 10 users...}'::integer[])) AND (thing_id = ANY ('{array of integers}'::bigint[])))
        ->  Index Scan using auth_user_pkey on auth_user  (cost=0.00..7.36 rows=1 width=150) (never executed)
              Index Cond: (id = user.user_id)

Limit 4+
Total runtime: 2.132 ms

Limit  (cost=3052.13..3100.82 rows=4 width=1757) (actual time=2.053..2.053 rows=0 loops=1)
  ->  Nested Loop  (cost=3052.13..13946.44 rows=895 width=1757) (actual time=2.050..2.050 rows=0 loops=1)
        ->  Bitmap Heap Scan on user  (cost=3052.13..7348.98 rows=895 width=1607) (actual time=2.048..2.048 rows=0 loops=1)
              Recheck Cond: (thing_id = ANY ('{...lots...}'::bigint[]))
              Filter: (user_id <> ALL ('{...~10 users...}'::integer[]))
              ->  Bitmap Index Scan on user_thing_id  (cost=0.00..3051.90 rows=895 width=0) (actual time=2.026..2.026 rows=6 loops=1)
                    Index Cond: (thing_id = ANY ('{...lots...}'::bigint[]))
        ->  Index Scan using auth_user_pkey on auth_user  (cost=0.00..7.36 rows=1 width=150) (never executed)
              Index Cond: (id = user.user_id)

Roughly 1600x slower (3394 ms vs 2.1 ms) is enough for me to call this a bug, rather than euphemistically saying "this is sub-optimal".

So there are a few problems:

1. We assume that all values exist within the table. There is no measure of sparsity. Any value between min and max is assumed to exist and is assigned the averaged selectivity for a non-MFV.

2. We assume that if values do exist, they have rows uniformly distributed across the whole table like rungs on a ladder, so the fewer rows you want, the cheaper it is to access them. This mistake is made any time we propagate the LIMIT clause onto a SeqScan.

3. We take no account of the potential for error in the plan. The plan chosen is very risky, but has a lower cost. It would be better to favour less risky plans for most cases.
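For concreteness, the linear scaling described for (2) is enough to reproduce the Limit node's cost in the first plan above. A minimal sketch of the arithmetic, using only the numbers shown in that plan (this mirrors the behaviour being described, not the actual planner code):

    -- LIMIT cost under the linear assumption:
    --   startup_cost + (total_cost - startup_cost) * m / N
    -- with startup_cost = 0.00, total_cost = 4112652.70 (the index scan),
    -- m = 25 (the LIMIT) and N = 12357 (the estimated row count).
    SELECT 0.00 + (4112652.70 - 0.00) * 25 / 12357 AS limit_cost;
    -- returns ~8320.49, the figure shown on the Limit node, even though the
    -- index scan underneath it is costed at over 4 million.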
So (1) leads us to make an overestimate of the selectivity, and then we use (2) to deduce that the overall cost is therefore incredibly low. So the "worst case" selectivity actually leads us to making a best-case assumption: that rows are frequent and also uniformly distributed.

In terms of proposed solutions, (2) is the only one that seems easy to alter. If we have a reasonably large selectivity, it's OK to assume that we don't have to scan much of a table before we find a matching row. But taking that assumption to extremes causes these bad plans.

Any time we apply a LIMIT clause to a plan with a SeqScan or unqualified IndexScan, we shouldn't assume the scan will do less than, say, 10% of the table. It might, but it's an unsafe assumption, because as the selectivity decreases so does the safety of the assumption that rows are uniformly distributed. So basically put a clamp on the ability of this assumption to scale downwards when m/N is very small.

We could also improve on (1) for some data types. Some measure of sparsity could easily be gained by taking NDistinct/(Max - Min) for integer types. That is useful because integers are commonly used as keys, so some special attention there is not unwarranted.

Other ideas welcome.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
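The sparsity measure proposed for (1) can be stated concretely. A sketch of the quantity for the table in the first plan (this only illustrates what NDistinct/(Max - Min) would capture; in the planner it would presumably come from the existing statistics rather than a scan of the table):

    -- Key density for an integer column: values near 1 mean nearly every id
    -- in the range exists; small values mean the range is sparse and the
    -- averaged non-MFV selectivity is an overestimate.
    -- (table and column names are taken from the plan in the first post)
    SELECT count(DISTINCT blah_id)::float8
           / NULLIF(max(blah_id) - min(blah_id), 0) AS key_density
    FROM blah;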
Simon Riggs <simon@2ndQuadrant.com> writes:
> 2. We assume that if values do exist, they have rows uniformly
> distributed across the whole table like rungs on a ladder.

Well, yeah. That's sometimes wrong, but not always. In the absence of evidence to the contrary, I think it's a better assumption than most others.

> Any time we apply a LIMIT clause to a plan with a SeqScan or
> unqualified IndexScan, we shouldn't assume the scan will do less than,
> say, 10% of the table.

This is a horrid idea, because it will stop the planner from using fast-start plans in many situations where they are wins.

> Other ideas welcome.

You are not the first person to have run into this type of problem. If it were easily solved by some minor hack, we would have done that long since. The problem is to not throw the baby out with the bathwater.

I can see an argument for downgrading the assumed effectiveness of restrictions that are applied as qpquals (ie, not indexquals), but the LIMIT node coster doesn't have easy access to the information needed for that, especially not in situations more complicated than a single-table scan.

			regards, tom lane
On Fri, 2012-03-16 at 18:25 +0000, Simon Riggs wrote:
> Any time we apply a LIMIT clause to a plan with a SeqScan or
> unqualified IndexScan, we shouldn't assume the scan will do less than,
> say, 10% of the table. It might, but it's an unsafe assumption, because
> as the selectivity decreases so does the safety of the assumption that
> rows are uniformly distributed.

Just trying to follow along. You mean "as the selectivity _increases_ the safety of the assumption that the rows are uniformly distributed decreases", right?

Regards,
	Jeff Davis
On Fri, Mar 16, 2012 at 9:39 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2012-03-16 at 18:25 +0000, Simon Riggs wrote:
>> Any time we apply a LIMIT clause to a plan with a SeqScan or
>> unqualified IndexScan, we shouldn't assume the scan will do less than,
>> say, 10% of the table. It might, but it's an unsafe assumption, because
>> as the selectivity decreases so does the safety of the assumption that
>> rows are uniformly distributed.
>
> Just trying to follow along. You mean "as the selectivity _increases_
> the safety of the assumption that the rows are uniformly distributed
> decreases", right?

Selectivity here means the value between 0 and 1 that describes, in the planner, the fraction of rows we will get back from a scan: 1.0 means 100% of the rows. When selectivity is low, very few rows will come back. I think you are using "high selectivity" to mean "returns few of the rows", so you understand me; just flip the meaning of the words.

When you have lots of rows, it's a good assumption that they are spread out and a scan will find some of them quickly. When you have very few rows, assuming they are evenly spaced is just weird; clumpiness of some kind seems much more likely. This is much more easily understood if the values are dates, for example.

Given that the estimated number of rows is deliberately a worst case (i.e. high), it sounds as though the scan will work. Yet the reality is that doing the scan is incredibly costly when those assumptions break, which they do, often. Especially when the values don't exist at all because the table is sparse.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Mar 16, 2012 at 9:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> 2. We assume that if values do exist, they have rows uniformly
>> distributed across the whole table like rungs on a ladder.
>
> Well, yeah. That's sometimes wrong, but not always. In the absence
> of evidence to the contrary, I think it's a better assumption than
> most others.

The evidence I showed proves it is 1000x worse. I'm not sure how you can say it is "better than most" or that there is no evidence.

> You are not the first person to have run into this type of problem.

Yes; if I thought it was an isolated problem I would not have brought it up. Also, I don't bring it up because I need help; the actual problem has workarounds. But rewriting SQL because of a planner problem is not something that highlights our strengths.

> If it were easily solved by some minor hack, we would have done that
> long since. The problem is to not throw the baby out with the
> bathwater.
>
> I can see an argument for downgrading the assumed effectiveness of
> restrictions that are applied as qpquals (ie, not indexquals), but
> the LIMIT node coster doesn't have easy access to the information
> needed for that, especially not in situations more complicated than
> a single-table scan.

My wish was to register this as both a common and significant bug, whatever the solution is. The bug appears in very simple queries, so we can't hide behind some doubt as to the exact cause. Few planning errors are wrong by more than 1000 times; this plan gets worse every time the client gains new customers, so it's not just bad, it's getting worse. Multiple different queries show the same symptoms, so it's common also.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Mar 17, 2012 at 9:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> My wish was to register this as both a common and significant bug,

It has definitely come up here before many times.

However, at root the problem is part of the general class of not understanding how two different columns are related. Postgres is assuming they're entirely independent, and therefore that all the values of x are uniformly distributed over values of y.

To plan this better in your case it would have to know that blah_id <= 72572020 is not equally likely for user_id = ANY ('{....list....}'::integer[]) as it is for the table as a whole.

-- 
greg
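The independence assumption Greg describes can be checked directly on the table from the first plan. A sketch, with a made-up two-element user_id list standing in for the real one (which was elided in the original post):

    -- Compare the observed joint fraction of rows with the product of the
    -- single-column fractions. The planner effectively assumes
    --   frac_joint = frac_blah_id * frac_user_id,
    -- i.e. that the two conditions are independent.
    SELECT avg((blah_id <= 72572020)::int)                          AS frac_blah_id,
           avg((user_id = ANY ('{1001,1002}'::integer[]))::int)     AS frac_user_id,
           avg((blah_id <= 72572020
                AND user_id = ANY ('{1001,1002}'::integer[]))::int) AS frac_joint
    FROM blah;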
On Sat, Mar 17, 2012 at 11:33 AM, Greg Stark <stark@mit.edu> wrote:
> On Sat, Mar 17, 2012 at 9:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> My wish was to register this as both a common and significant bug,
>
> It has definitely come up here before many times.
>
> However, at root the problem is part of the general class of not
> understanding how two different columns are related. Postgres is
> assuming they're entirely independent, and therefore that all the
> values of x are uniformly distributed over values of y.
>
> To plan this better in your case it would have to know that blah_id <=
> 72572020 is not equally likely for user_id = ANY
> ('{....list....}'::integer[]) as it is for the table as a whole.

That is not the root cause in this case, though I agree that is also a problem.

The first plan is more complex because of an ORDER BY that favours the index scan, but that's actually irrelevant to the case; the use of blah_id is the user relying on the fact that ids are allocated in date order to avoid doing a date sort.

The problems are as I described them:

(1) no account made for sparsity, and other factors leading to an overestimate of rows (N)

(2) inappropriate assumption of the effect of LIMIT m, which causes a costly SeqScan to appear better than an IndexScan for low m/N, when in fact that is seldom the case.

Overestimating N in (1) inverts the problem, so that an overestimate isn't the safe thing at all.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Sat, 2012-03-17 at 12:48 +0000, Simon Riggs wrote:
> The problems are as I described them:
>
> (1) no account made for sparsity, and other factors leading to an
> overestimate of rows (N)
>
> (2) inappropriate assumption of the effect of LIMIT m, which causes a
> costly SeqScan to appear better than an IndexScan for low m/N, when in
> fact that is seldom the case.
>
> Overestimating N in (1) inverts the problem, so that an overestimate
> isn't the safe thing at all.

I think the actual problem has more to do with risk. The planner doesn't know how uniform the distribution of the table is, which introduces risk for the table scan.

I would tend to agree that for a low selectivity fraction and a very low limit (e.g. 1-3 in your example) and a large table, it doesn't seem like a good risk to use a table scan. I don't know how that should be modeled or implemented though.

Regards,
	Jeff Davis
On Mon, Mar 19, 2012 at 6:19 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Sat, 2012-03-17 at 12:48 +0000, Simon Riggs wrote:
>> The problems are as I described them:
>>
>> (1) no account made for sparsity, and other factors leading to an
>> overestimate of rows (N)
>>
>> (2) inappropriate assumption of the effect of LIMIT m, which causes a
>> costly SeqScan to appear better than an IndexScan for low m/N, when in
>> fact that is seldom the case.
>>
>> Overestimating N in (1) inverts the problem, so that an overestimate
>> isn't the safe thing at all.
>
> I think the actual problem has more to do with risk. The planner doesn't
> know how uniform the distribution of the table is, which introduces risk
> for the table scan.
>
> I would tend to agree that for a low selectivity fraction and a very low
> limit (e.g. 1-3 in your example) and a large table, it doesn't seem like
> a good risk to use a table scan. I don't know how that should be modeled
> or implemented though.

FWIW, we have been bitten by the uniformity assumption a number of times, but we kind of know what to look for. It also has an interesting interpretation as applied to garbage (insomuch as queries all carry a conceptual "WHERE xmin...xmax..." predicate). Sometimes a table is mostly garbage for a particular value, and the index scan constantly has to seek the next tuple in its search to find non-garbage.

I don't know how many times the uniformity assumption has held up fine (guess: many), but cases where an index cond has that little "Filter:" footnote have been a ticking time bomb for a number of people I've interacted with. It can be subtle because the amount of scanned data may not be enormous, so upon re-running the query it is not nearly as pathological as the original run, given cache effects. The only way to make it very visible is to look at EXPLAIN with the BUFFERS option and note how many pages are being discarded by the filter, i.e. how many pages of the underlying relation and indexes are being processed only to be discarded.

Were I writing an OLTP-query performance linting tool, queries with partially-matched index conditions might even be on my list of things to warn about. That is unfortunate for cases where the unmatched part of the index condition may only qualify, say, five records in all situations, but the database has no construct to know and enforce this AFAIK.

-- 
fdr
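The check fdr describes looks roughly like the following, applied here to a query shaped like the one in the first post (the user_id list is again hypothetical): compare the buffer counts on the scan node against the handful of rows that actually survive the Filter.

    -- Requires PostgreSQL 9.0 or later for the BUFFERS option.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM blah
    WHERE blah_id <= 72572020
      AND user_id = ANY ('{1001,1002}'::integer[])
    ORDER BY blah_id DESC
    LIMIT 25;
    -- A large "Buffers: shared hit=... read=..." figure on the index scan,
    -- next to only a few returned rows, is the symptom described above.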