Re: [HACKERS] Index not used on select (Is this more OR + LIKE?) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Index not used on select (Is this more OR + LIKE?) |
Date | |
Msg-id | 1475.932828001@sss.pgh.pa.us Whole thread Raw |
In response to | Index not used on select (Is this more OR + LIKE?) (Mike Mascari <mascarim@yahoo.com>) |
List | pgsql-hackers |
Mike Mascari <mascarim@yahoo.com> writes: > ... However, if an OR clause is introduced as below: > WHERE > permitbuy.webuser = 'mascarj' AND > (locations.company,locations.costcntr) = > (permitbuy.company, permitbuy.costcntr) AND > supplychains.target = locations.target AND > reserves.target = supplychains.supplysource AND > supplies.supply = reserves.supply AND > supplies.inventory = '1' AND > ((upper(supplies.supply) LIKE '%SEQ%') OR > (upper(supplies.vendoritem) LIKE '%SEQ%') OR > (upper(supplies.vendorname) LIKE '%SEQ%') OR > (upper(supplies.description) LIKE '%SEQ%')) > OR <-- This is built by our search engine to allow > -- our users to enter: [SEQ or SCD]... > ((upper(supplies.supply) LIKE '%SCD%') OR > (upper(supplies.vendoritem) LIKE '%SCD%') OR > (upper(supplies.vendorname) LIKE '%SCD%') OR > (upper(supplies.description) LIKE '%SCD%')) > The plan shows that it will have to perform a > sequential scan on the supplies table, which I > obviously expected because of the use of LIKE, in > both plans. Not necessarily --- since you have a restriction clause on supplies.inventory, an index on that field could be used for an index scan. This would only be worthwhile if "supplies.inventory = '1'" eliminates a goodly fraction of the supplies records, of course. Another possibility is using an index on supplies.supply to implement the join on supplies.supply = reserves.supply. The LIKE clauses will certainly have to be done the hard way, but they don't necessarily have to be done the hard way on every single record. > However, why is it, that, when an OR clause which exclusively > references the supplies table is appended to the query, the > planner/optimizer (which already must perform a sequential scan on > supplies) now totally ignores all the indices built on the other > tables? I think the problem is that the OR appears at top level in the WHERE clause (assuming the above is a verbatim transcript of your query). OR groups less tightly than AND, so what this really means is(other-conditions AND (LIKEs-for-SEQ)) OR (LIKEs-for-SCD) which is undoubtedly not what you had in mind, and will certainly produce a lot of unwanted records if the query manages to complete. Every supplies tuple matching SCD will appear joined to every possible combination of records from the other tables... Per recent discussions, the query optimizer is currently doing a really bad job of optimizing OR-of-ANDs conditions, but I think that you didn't mean to ask for that anyway. > The result is an execution plan which consumes all RAM on the machine, > and, at 410M, I killed it, because it was about to consume all swap > space as well... You're confusing two different problems --- the efficiency of the query plan has a lot to do with speed, but relatively little to do with memory usage. I think that the memory usage problem here stems from the use of upper() in the WHERE condition. Each evaluation of upper() generates a temporary result string, which is not reclaimed until end of statement in the current code. (I hope to see that fixed in the next release or two, but for now you gotta work around it.) You would be better advised to use a case-insensitive match operator instead of LIKE. For example, the above conditions could be writtensupplies.supply ~* 'SEQ' Dunno how inconvenient it is for you to use regular expression patterns instead of LIKE-style patterns, but the memory savings would be considerable. Even after we fix the memory leakage problem, I expect this would be faster than the LIKE version. BTW, the reason that your correctly-phrased query doesn't run out of memory is that the LIKE conditions don't get evaluated for the tuples that don't make it past the other qual conditions. In the mistaken version, they get evaluated for every possible combination of joined tuples... regards, tom lane
pgsql-hackers by date: