Thread: BETWEEN optimizer problems with single-value range
Attached is a simplified example of a performance problem we have seen, with a workaround and a suggestion for enhancement (hence both the performance and hackers lists). Our software is allowing users to specify the start and end dates for a query. When they enter the same date for both, the optimizer makes a very bad choice. We can work around it in application code by using an equality test if both dates match. I think the planner should be able to make a better choice here. (One obvious way to fix it would be to rewrite "BETWEEN a AND b" as "= a" when a is equal to b, but it seems like there is some underlying problem which should be fixed instead (or in addition to) this. The first query uses BETWEEN with the same date for both min and max values. The second query uses an equality test for the same date. The third query uses BETWEEN with a two-day range. In all queries, there are less than 4,600 rows for the specified cotfcNo value out of over 18 million rows in the table. We tried boosting the statistics samples for the columns in the selection, which made the estimates of rows more accurate, but didn't change the choice of plans. -Kevin
Attachment
Kevin Grittner <Kevin.Grittner@wicourts.gov> schrieb: > Attached is a simplified example of a performance problem we have seen, Odd. Can you tell us your PG-Version? Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
>>> On Wed, Mar 15, 2006 at 12:17 pm, in message <20060315181735.GA22240@KanotixBox>, Andreas Kretschmer <akretschmer@spamfence.net> wrote: > Kevin Grittner <Kevin.Grittner@wicourts.gov> schrieb: > >> Attached is a simplified example of a performance problem we have seen, > > Odd. Can you tell us your PG- Version? I know we really should move to 8.1.3, but I haven't gotten to it yet. We're on a build from the 8.1 stable branch as of February 10th, with a patch to allow ANSI standard interpretation of string literals. (So this is 8.1.2 with some 8.1.3 changes plus the string literal patch.) If there are any changes in that time frame which might affect this issue, I could deploy a standard release and make sure that I see the same behavior. Let me know. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Odd. Can you tell us your PG- Version? > this is 8.1.2 with some 8.1.3 changes plus the string literal patch.) 8.1 is certainly capable of devising the plan you want, for example in the regression database: regression=# explain select * from tenk1 where thousand = 10 and tenthous between 42 and 144; QUERY PLAN ------------------------------------------------------------------------------------ Index Scan using tenk1_thous_tenthous on tenk1 (cost=0.00..6.01 rows=1 width=244) Index Cond: ((thousand = 10) AND (tenthous >= 42) AND (tenthous <= 144)) (2 rows) It looks to me like this is a matter of bad cost estimation, ie, it's thinking the other index is cheaper to use. Why that is is not clear. Can we see the pg_stats rows for ctofcNo and calDate? Also, try to force it to generate the plan you want, so we can see what it thinks the cost is for that. If you temporarily drop the wrong index you should be able to get there: begin; drop index "Cal_CalDate"; explain analyze select ... ; -- repeat as needed if it chooses some other wrong index rollback; I hope you have a play copy of the database to do this in --- although it would be safe to do the above in a live DB, the DROP would exclusive-lock the table until you finish the experiment and rollback, which probably is not good for response time ... regards, tom lane
On 3/15/06, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Attached is a simplified example of a performance problem we have seen, > with a workaround and a suggestion for enhancement (hence both the > performance and hackers lists). Hi Kevin. In postgres 8.2 you will be able to use the row-wise comparison for your query which should guarantee good worst case performance without having to maintain two separate query forms. it is also a more elegant syntax as you will see. SELECT "CA"."calDate", "CA"."startTime" FROM "Cal" "CA" WHERE ("CA"."ctofcNo", "CA"."calDate") BETWEEN (2192, '2006-03-15') and (2192, '2006-03-15') ORDER BY "ctofcNo", "calDate", "startTime"; Be warned this will not work properly in pg < 8.2. IMO, row-wise is the best way to write this type of a query. Please note the row constructor and the addition of ctofcNo into the order by clause to force use of the index. Merlin
>>> On Wed, Mar 15, 2006 at 1:17 pm, in message <28798.1142450270@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > 8.1 is certainly capable of devising the plan you want, for example > in the regression database: > > regression=# explain select * from tenk1 where thousand = 10 and tenthous > between 42 and 144; > QUERY PLAN > ------------------------------------------------------------------------------------ > Index Scan using tenk1_thous_tenthous on tenk1 (cost=0.00..6.01 rows=1 > width=244) > Index Cond: ((thousand = 10) AND (tenthous >= 42) AND (tenthous <= 144)) > (2 rows) That matches one of the examples where it optimized well. I only saw the bad plan when low and high ends of the BETWEEN range were equal. > It looks to me like this is a matter of bad cost estimation, ie, it's > thinking the other index is cheaper to use. Why that is is not clear. > Can we see the pg_stats rows for ctofcNo and calDate? schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ------------+-----------+---------+-----------+-----------+------------+-----------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------+------------- public | Cal | calDate | 0 | 4 | 2114 | {2003-06-02,2000-06-20,2001-04-16,2003-06-17,2003-12-01,2004-10-12,2001-04-23,2001-10-15,2002-03-06,2002-05-03} | {0.00333333,0.00233333,0.00233333,0.00233333,0.00233333,0.00233333,0.002,0.002,0.002,0.002} | {1986-03-14,1999-06-11,2000-07-14,2001-05-18,2002-03-21,2002-12-04,2003-08-12,2004-05-13,2005-02-01,2005-09-28,2080-12-31} | 0.0545768 public | Cal | ctofcNo | 0 | 8 | 669 | {0793,1252,1571,0964,0894,1310,"DA ",0944,1668,0400} | {0.024,0.019,0.015,0.0123333,0.012,0.011,0.0106667,0.01,0.00966667,0.00866667} | {0000,0507,0733,0878,1203,1336,14AG,1633,1971,3705,YVJO} | -0.0179665 (2 rows) > Also, try to force it to generate the plan you want, so we can see what > it thinks the cost is for that. If you temporarily drop the wrong index > you should be able to get there: > > begin; > drop index "Cal_CalDate"; > explain analyze select ... ; > -- repeat as needed if it chooses some other wrong index > rollback; Sort (cost=4.03..4.03 rows=1 width=12) (actual time=48.484..48.486 rows=4 loops=1) Sort Key: "calDate", "startTime" -> Index Scan using "Cal_CtofcNo" on "Cal" "CA" (cost=0.00..4.02 rows=1 width=12) (actual time=36.750..48.228 rows=4 loops=1) Index Cond: ((("ctofcNo")::bpchar = '2192'::bpchar) AND (("calDate")::date >= '2006-03-15'::date) AND (("calDate")::date <= '2006-03-15'::date)) Total runtime: 56.616 ms
Merlin Moncure <mmoncure@gmail.com> schrieb: > On 3/15/06, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > > Attached is a simplified example of a performance problem we have seen, > > with a workaround and a suggestion for enhancement (hence both the > > performance and hackers lists). > > > Hi Kevin. In postgres 8.2 you will be able to use the row-wise 8.2? AFAIK, Feature freeze in juni/juli this year... Release august/september. > comparison for your query which should guarantee good worst case > performance without having to maintain two separate query forms. it Perhaps, a bitmap index scan (since 8.1) are useful for such querys. Thats why i asked which version. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On 3/15/06, Andreas Kretschmer <akretschmer@spamfence.net> wrote: > Merlin Moncure <mmoncure@gmail.com> schrieb: > > > On 3/15/06, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > > > Attached is a simplified example of a performance problem we have seen, > > > with a workaround and a suggestion for enhancement (hence both the > > > performance and hackers lists). > > > > > > Hi Kevin. In postgres 8.2 you will be able to use the row-wise > > 8.2? AFAIK, Feature freeze in juni/juli this year... > Release august/september. yes, but I was addressing kevin's point about enhancing the server... > > comparison for your query which should guarantee good worst case > > performance without having to maintain two separate query forms. it > > Perhaps, a bitmap index scan (since 8.1) are useful for such querys. > Thats why i asked which version. I think you will find that reading a range of records from a table ordered by an index utilizing the 8.2 comparison feature is much faster than a bitmap index scan. Merlin
On Wed, 2006-03-15 at 11:56 -0600, Kevin Grittner wrote: > Attached is a simplified example of a performance problem we have seen, > with a workaround and a suggestion for enhancement (hence both the > performance and hackers lists). > > Our software is allowing users to specify the start and end dates for a > query. When they enter the same date for both, the optimizer makes a > very bad choice. We can work around it in application code by using an > equality test if both dates match. I think the planner should be able > to make a better choice here. > (One obvious way to fix it would be to > rewrite "BETWEEN a AND b" as "= a" when a is equal to b, but it seems > like there is some underlying problem which should be fixed instead (or > in addition to) this. That might work, but I'm not sure if that is in itself the problem and it would be mostly wasted overhead in 99% of cases. The main issue appears to be that the planner chooses "Cal_CalDate" index rather than "Cal_CtofcNo" index when the BETWEEN values match. It seems that the cost of the first and third EXPLAINs is equal, yet for some reason it chooses different indexes in each case. My understanding was that it would pick the first index created if plan costs were equal. Is that behaviour repeatable with each query? ISTM that if we have equal plan costs then we should be choosing the index for which we have more leading columns, since that is more likely to lead to a more selective answer. But the plan selection is a simple "pick the best, or if they're equal pick the best sort order". > The first query uses BETWEEN with the same date for both min and max > values. The second query uses an equality test for the same date. The > third query uses BETWEEN with a two-day range. In all queries, there > are less than 4,600 rows for the specified cotfcNo value out of over 18 > million rows in the table. We tried boosting the statistics samples for > the columns in the selection, which made the estimates of rows more > accurate, but didn't change the choice of plans. The selectivity seems the same in both - clamped to a minimum of 1 row, so changing that doesn't look like it would help. Best Regards, Simon Riggs
>>> On Wed, Mar 15, 2006 at 5:05 pm, in message <1142463908.3859.188.camel@localhost.localdomain>, Simon Riggs <simon@2ndquadrant.com> wrote: > On Wed, 2006- 03- 15 at 11:56 - 0600, Kevin Grittner wrote: > >> (One obvious way to fix it would be to >> rewrite "BETWEEN a AND b" as "= a" when a is equal to b, but it seems >> like there is some underlying problem which should be fixed instead (or >> in addition to) this. > > That might work, but I'm not sure if that is in itself the problem and > it would be mostly wasted overhead in 99% of cases. It sounds like we agree. > The main issue appears to be that the planner chooses "Cal_CalDate" > index rather than "Cal_CtofcNo" index when the BETWEEN values match. Agreed. > It seems that the cost of the first and third EXPLAINs is equal, yet for > some reason it chooses different indexes in each case. My understanding > was that it would pick the first index created if plan costs were equal. > Is that behaviour repeatable with each query? It seems to be a consistent pattern, although strictly speaking our evidence is anecdotal. We've got hundreds of known failures with the BETWEEN variant on equal dates and no known successes. We have a few dozen tests of the equality variant with 100% success in those tests. > ISTM that if we have equal plan costs then we should be choosing the > index for which we have more leading columns, since that is more likely > to lead to a more selective answer. But the plan selection is a simple > "pick the best, or if they're equal pick the best sort order". > The selectivity seems the same in both - clamped to a minimum of 1 row, > so changing that doesn't look like it would help. The fact that it costs these as equivalent is surprising in itself, and might be worth examining. This might be an example of something I suggested a while ago -- that the rounding a row estimate to an integer on the basis that "you can't read half a row" is not necessarily wise, because you can have a 50% chance of reading a row versus a higher or lower percentage. -Kevin
On Wed, 2006-03-15 at 14:17 -0500, Tom Lane wrote: > It looks to me like this is a matter of bad cost estimation, ie, it's > thinking the other index is cheaper to use. Why that is is not clear. > Can we see the pg_stats rows for ctofcNo and calDate? ISTM that when the BETWEEN constants match we end up in this part of clauselist_selectivity()...
On Thu, 2006-03-16 at 00:07 +0000, Simon Riggs wrote: > On Wed, 2006-03-15 at 14:17 -0500, Tom Lane wrote: > > > It looks to me like this is a matter of bad cost estimation, ie, it's > > thinking the other index is cheaper to use. Why that is is not clear. > > Can we see the pg_stats rows for ctofcNo and calDate? > > ISTM that when the BETWEEN constants match we end up in this part of > clauselist_selectivity()... (and now for the whole email...) /* * It's just roundoff error; use a small positive * value */ s2 = 1.0e-10; so that the planner underestimates the cost of using "Cal_CalDate" so that it ends up the same as "Cal_CtofcNo", and then we pick "Cal_CalDate" because it was created first. Using 1.0e-10 isn't very useful... the selectivity for a range should never be less than the selectivity for an equality, so we should simply put in a test against one of the pseudo constants and use that as the minimal value. That should lead to raising the apparent cost of Cal_CalDate so that Cal_CtofcNo can take precedence. Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: >> ISTM that when the BETWEEN constants match we end up in this part of >> clauselist_selectivity()... Yeah, I think you are right. > so that the planner underestimates the cost of using "Cal_CalDate" so > that it ends up the same as "Cal_CtofcNo", and then we pick > "Cal_CalDate" because it was created first. No, it doesn't end up the same --- but the difference is small enough to be in the roundoff-error regime. The real issue here is that we're effectively assuming that one row will be fetched from the index in both cases, and this is clearly not the case for the Cal_CalDate index. So we need a more accurate estimate for the boundary case. > Using 1.0e-10 isn't very useful... the selectivity for a range should > never be less than the selectivity for an equality, so we should simply > put in a test against one of the pseudo constants and use that as the > minimal value. That's easier said than done, because you'd first have to find the appropriate equality operator to use (ie, one having semantics that agree with the inequality operators). Another point is that the above statement is simply wrong, consider calDate BETWEEN '2006-03-15' AND '2006-03-14' for which an estimate of zero really is correct. Possibly we could drop this code's reliance on seeing SCALARLTSEL/SCALARGTSEL as the estimators, and instead try to locate a common btree opclass for the operators --- which would then let us identify the right equality operator to use, and also let us distinguish > from >= etc. If we're trying to get the boundary cases right I suspect we have to account for that. I could see such an approach being tremendously slow though :-(, because we'd go looking for btree opclasses even for operators that have nothing to do with < or >. regards, tom lane
On Wed, 2006-03-15 at 21:05 -0500, Tom Lane wrote: > So we need a more accurate estimate for the boundary case. Agreed. > > Using 1.0e-10 isn't very useful... the selectivity for a range should > > never be less than the selectivity for an equality, so we should simply > > put in a test against one of the pseudo constants and use that as the > > minimal value. > > That's easier said than done, because you'd first have to find the > appropriate equality operator to use (ie, one having semantics that > agree with the inequality operators). ... Kevin: this is also the reason we can't simply transform the WHERE clause into a more appropriate form... > Possibly we could drop this code's reliance on seeing > SCALARLTSEL/SCALARGTSEL as the estimators, and instead try to locate a > common btree opclass for the operators --- which would then let us > identify the right equality operator to use, and also let us distinguish > > from >= etc. If we're trying to get the boundary cases right I > suspect we have to account for that. I could see such an approach being > tremendously slow though :-(, because we'd go looking for btree > opclasses even for operators that have nothing to do with < or >. Trying to get the information in the wrong place would be very expensive, I agree. But preparing that information when we have access to it and passing it through the plan would be much cheaper. Relating op->opclass will be very useful in other places in planning, even if any one case seems not to justify the work to record it. (This case feels like deja vu, all over again.) The operator and the opclass are only connected via an index access method, but for a particular index each column has only one opclass. So the opclass will have a 1-1 correspondence with the operator for *that* plan only, realising that other plans might have different correspondences. find_usable_indexes() or thereabouts could annotate a restriction OpExpr with the opclass it will use. Once we have the link, clauselist_selectivity() can trivially compare opclasses for both OpExprs, then retrieve other information for that opclass for various purposes. Seems lots of work for such a corner case, but would be worth it if this solves other problems as well. Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > Trying to get the information in the wrong place would be very > expensive, I agree. But preparing that information when we have access > to it and passing it through the plan would be much cheaper. Where would that be? > The operator and the opclass are only connected via an index access > method, but for a particular index each column has only one opclass. If you're proposing making clauselist_selectivity depend on what indexes exist, I think that's very much the wrong approach. In the first place, it still has to give usable answers for unindexed columns, and in the second place there might be multiple indexes with different opclasses for the same column, so the ambiguity problem still exists. I have been wondering if we shouldn't add some more indexes on pg_amop or something to make it easier to do this sort of lookup --- we definitely seem to be finding multiple reasons to want to look up which opclasses contain a given operator. regards, tom lane
On Thu, 2006-03-16 at 10:57 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Trying to get the information in the wrong place would be very > > expensive, I agree. But preparing that information when we have access > > to it and passing it through the plan would be much cheaper. > > Where would that be? > > > The operator and the opclass are only connected via an index access > > method, but for a particular index each column has only one opclass. > > If you're proposing making clauselist_selectivity depend on what indexes > exist, I think that's very much the wrong approach. Using available information sounds OK to me. Guess you're thinking of the lack of plan invalidation? > In the first place, > it still has to give usable answers for unindexed columns, and in the > second place there might be multiple indexes with different opclasses > for the same column, so the ambiguity problem still exists. I was thinking that we would fill out the OpExpr with different opclasses for each plan, so each one sees a different story. (I was thinking there was a clauselist for each plan; if not, there could be.) So the multiple index problem shouldn't exist. Non-indexed cases still cause the problem, true. > I have been wondering if we shouldn't add some more indexes on pg_amop > or something to make it easier to do this sort of lookup --- we > definitely seem to be finding multiple reasons to want to look up > which opclasses contain a given operator. Agreed, but still looking for better way than that. [BTW how do you add new indexes to system tables? I want to add one to pg_inherits but not sure where to look.] Best Regards, Simon Riggs
Simon Riggs wrote: > [BTW how do you add new indexes to system tables? I want to add one to > pg_inherits but not sure where to look.] See src/include/catalog/indexing.h -- I don't remember if there's anything else that needs modification. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Thu, 2006-03-16 at 15:41 -0400, Alvaro Herrera wrote: > Simon Riggs wrote: > > > [BTW how do you add new indexes to system tables? I want to add one to > > pg_inherits but not sure where to look.] > > See src/include/catalog/indexing.h -- I don't remember if there's > anything else that needs modification. That was easy: many thanks! Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > I was thinking that we would fill out the OpExpr with different > opclasses for each plan, so each one sees a different story. (I was > thinking there was a clauselist for each plan; if not, there could be.) This is backwards: there isn't a plan yet. If there were, having clauselist_selectivity return different answers depending on what index the plan was thinking of using would still be wrong. > [BTW how do you add new indexes to system tables? I want to add one to > pg_inherits but not sure where to look.] src/include/catalog/indexing.h Offhand I think adding a new entry is all you have to do. You may also want a syscache to go with it, which'll take a bit more work. regards, tom lane
On Thu, 2006-03-16 at 14:45 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > [BTW how do you add new indexes to system tables? I want to add one to > > pg_inherits but not sure where to look.] > > src/include/catalog/indexing.h > > Offhand I think adding a new entry is all you have to do. You may also > want a syscache to go with it, which'll take a bit more work. I see its actually postgres.bki... I never scrolled to the bottom before now. I'll have a go. Best Regards, Simon Riggs