Thread: BUG #5645: Query Optimizer fails when it encounters an unsatisfiable part of a query
BUG #5645: Query Optimizer fails when it encounters an unsatisfiable part of a query
From
"Daniel Wagner"
Date:
The following bug has been logged online: Bug reference: 5645 Logged by: Daniel Wagner Email address: dtw30@cam.ac.uk PostgreSQL version: 8.4.4 Operating system: tested under Linux (Ubuntu 10.4) and Windows (7) Description: Query Optimizer fails when it encounters an unsatisfiable part of a query Details: I posted this earlier on the general mailing list and received no reply. I assume I found a genuine bug: I am using Postgres 8.4.4 on a large-ish amount of data and recently noticed that my application got very slow at times. I quickly discovered that a specific query was triggering a sequential scan despite suitable indices being available. The query in question looks like this: "select * from kvstore where deviceid = 7 AND (locid >= 1410929 AND locid <= 1690468) OR (locid = 1690469 and locid <= 1690468)" Note that the last condition (locid = 2 AND locid <= 1) can never be satisfied. Now, the Postgres optimizer seems to believe that a sequential scan of 16 million rows is the right way of approaching this query, despite having accurate statistics (I ran VACUUM ANALYZE before to ensure everything is up-to-date). However, if I remove the last part and query for "select * from kvstore where deviceid = 7 AND (locid >= 1410929 AND locid <= 1690468)", indices are used and everything works nicely. And I believe that the optimizer should remove an invalid query, or at least handle it gracefully (e.g. use it as a parameter for a range query). Since it doesn't do that, I am a little stumped as to what the correct course of action for me is. I could try to manually remove "invalid" parts of my query, but then again I don't want to be patching queries to accommodate a stubborn optimizer if I don't have to... maybe I stumbled upon a bug? If you have any further questions please do not hesitate to ask! I'd love to resolve this issue soon!
Re: BUG #5645: Query Optimizer fails when it encounters an unsatisfiable part of a query
From
Tom Lane
Date:
"Daniel Wagner" <dtw30@cam.ac.uk> writes: > I posted this earlier on the general mailing list and received no reply. I > assume I found a genuine bug: > "select * from kvstore where deviceid = 7 AND (locid >= 1410929 AND locid <= > 1690468) OR (locid = 1690469 and locid <= 1690468)" No, you're just expecting the optimizer to spend much more effort than it actually does on recognizing poorly-written queries. While there is code in there that could prove that (locid = 1690469 and locid <= 1690468) is constant false, we don't apply that code to sub-branches of OR conditions; it's too expensive and the probability of a win is too small. > Now, the Postgres optimizer seems to believe that a sequential scan of 16 > million rows is the right way of approaching this query, despite having > accurate statistics (I ran VACUUM ANALYZE before to ensure everything is > up-to-date). Well, you haven't shown us what alternatives it might have, nor what rowcount estimates it's deriving, so it's hard to comment about that. > However, if I remove the last part and query for "select * from kvstore > where deviceid = 7 AND (locid >= 1410929 AND locid <= 1690468)", indices > are used and everything works nicely. Um ... I wonder whether you've been careless about whether OR binds tighter or looser than AND. Are you expecting the deviceid condition to apply to both locid ranges? Because it doesn't, with the query written like that. Maybe you just need more parentheses. regards, tom lane