Thread: Understanding how partial indexes work?
I have a query on a table:- X between k1 and k2 or X < k1 and Y <> k3 where k1, k2, k3 are constants. How would this query work, if I created an index on X and a partial index on X where Y <> k3? Chris -- Chris Velevitch Manager - Sydney Flash Platform Developers Group m: 0415 469 095 www.flashdev.org.au
"Chris Velevitch" <chris.velevitch@gmail.com> writes: > I have a query on a table:- > X between k1 and k2 or X < k1 and Y <> k3 > where k1, k2, k3 are constants. > How would this query work, if I created an index on X and a partial > index on X where Y <> k3? Is it worth the trouble? You didn't mention the statistics involved, but ordinarily I'd think a non-equal condition is too nonselective to justify the cost of maintaining an extra index. The real problem here is that "X < k1" is probably too nonselective as well, which will likely lead the planner to decide that a plain seqscan is the cheapest solution. In principle the above could be done with a BitmapOr of two indexscans on X, but unless the constants are such that only a small fraction of rows are going to be selected, it's likely that a seqscan is the way to go. regards, tom lane
On Dec 6, 2007 1:44 AM, Chris Velevitch <chris.velevitch@gmail.com> wrote: > I have a query on a table:- > > X between k1 and k2 or X < k1 and Y <> k3 > > where k1, k2, k3 are constants. > > How would this query work, if I created an index on X and a partial > index on X where Y <> k3? Ummm. Using AND and OR in the same where clause without parenthesis is a bad idea, as the logic you might think you're expressing isn't the exact logic you're actually expressing. Do you mean: X between k1 and k2 or (X < k1 and Y <> k3) OR (X between k1 and k2 or X < k1) and Y <> k3 Those are two different questions. That plan chosen by the query planner depends on what the data distribution looks like, and what decisions the planner makes based on the stats it has about that distribution. Why not make a table, fill it with test data, analyze it, and see what you get with your indexes with explain analyze select ...
On Dec 7, 2007 2:39 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Dec 6, 2007 1:44 AM, Chris Velevitch <chris.velevitch@gmail.com> wrote: > > I have a query on a table:- > > > > X between k1 and k2 or X < k1 and Y <> k3 > > > > where k1, k2, k3 are constants. > > > > How would this query work, if I created an index on X and a partial > > index on X where Y <> k3? > > Ummm. Using AND and OR in the same where clause without parenthesis > is a bad idea, as the logic you might think you're expressing isn't > the exact logic you're actually expressing. Do you mean: Are you saying that operator precedence doesn't apply here? When else does operator precedence not apply and where is this documented? According to the documentation (http://www.postgresql.org/docs/7.4/interactive/sql-syntax.html#SQL-PRECEDENCE) X between k1 and k2 or X < k1 and Y <> k3 is the same as:- (X between k1 and k2) or ((X < k1) and (Y <> k3)) which is what I want. Chris -- Chris Velevitch Manager - Sydney Flash Platform Developers Group m: 0415 469 095 www.flashdev.org.au
On Dec 7, 2007 2:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Chris Velevitch" <chris.velevitch@gmail.com> writes: > > I have a query on a table:- > > X between k1 and k2 or X < k1 and Y <> k3 > > > where k1, k2, k3 are constants. > > > How would this query work, if I created an index on X and a partial > > index on X where Y <> k3? > > Is it worth the trouble? You didn't mention the statistics involved, > but ordinarily I'd think a non-equal condition is too nonselective > to justify the cost of maintaining an extra index. Yes, I did neglect to mention the stats. Y is the status of the record which represents the life cycle of the record starting at zero and ending up at k3. So basically the query would retrieving and the partial index will indexing records which aren't at their end of life. So the number of records where Y <> k3 will be low and the number of records where Y = k3 will be growing over time. So the ratio of Y <> k3 to Y = k3 will be getting smaller over time. > The real problem here is that "X < k1" is probably too nonselective > as well, which will likely lead the planner to decide that a plain > seqscan is the cheapest solution. In principle the above could be > done with a BitmapOr of two indexscans on X, but unless the constants > are such that only a small fraction of rows are going to be selected, > it's likely that a seqscan is the way to go. How would that compare doing a union the results of two queries:- X between k1 and k2 UNION X where X < k1 and Y <> k3 where there is index on X and a partial index on X where Y <> k3 Chris -- Chris Velevitch Manager - Sydney Flash Platform Developers Group m: 0415 469 095 www.flashdev.org.au
"Chris Velevitch" <chris.velevitch@gmail.com> writes: > On Dec 7, 2007 2:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Is it worth the trouble? You didn't mention the statistics involved, >> but ordinarily I'd think a non-equal condition is too nonselective >> to justify the cost of maintaining an extra index. > Yes, I did neglect to mention the stats. Y is the status of the record > which represents the life cycle of the record starting at zero and > ending up at k3. So basically the query would retrieving and the > partial index will indexing records which aren't at their end of life. > So the number of records where Y <> k3 will be low and the number of > records where Y = k3 will be growing over time. OK, in that case the partial index might be worth trying. regards, tom lane