Thread: Correct syntax to create partial index on a boolean column
For the boolean column Foo in Table1, if I want to index all values of TRUE, is this syntax correct? CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo; The query: SELECT * FROM Table1 WHERE Foo; should use the index, and: SELECT * FROM Table1 WHERE NOT Foo; should not, correct? I just want to make sure I don't need an operator on the WHERE clause. Thanks! Mike
> For the boolean column Foo in Table1, if I want to index all values of > TRUE, is this syntax correct? > > CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo; > > The query: > > SELECT * FROM Table1 WHERE Foo; > > should use the index, and: > > SELECT * FROM Table1 WHERE NOT Foo; > > should not, correct? > > I just want to make sure I don't need an operator on the WHERE clause. Thanks! FYI, I've posted this on StackOverflow too in case anyone wants to score some points.. http://stackoverflow.com/questions/8514923/postgres-is-this-the-right-way-to-create-a-partial-index-on-a-boolean-column I'm 90% sure this is the right way to do it though. Mike
On Wed, Dec 14, 2011 at 9:54 PM, Mike Christensen <mike@kitchenpc.com> wrote: >> For the boolean column Foo in Table1, if I want to index all values of >> TRUE, is this syntax correct? >> >> CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo; >> >> The query: >> >> SELECT * FROM Table1 WHERE Foo; >> >> should use the index, and: >> >> SELECT * FROM Table1 WHERE NOT Foo; >> >> should not, correct? >> >> I just want to make sure I don't need an operator on the WHERE clause. Thanks! > > FYI, I've posted this on StackOverflow too in case anyone wants to > score some points.. > > http://stackoverflow.com/questions/8514923/postgres-is-this-the-right-way-to-create-a-partial-index-on-a-boolean-column > > I'm 90% sure this is the right way to do it though. > > Mike I've confirmed the index works as expected. I created 10,000 rows of random data, and set `diet_glutenfree` to `random() > 0.9` so there's only a 10% chance of an `on` bit. I then re-created the indexes and tried the query again. SELECT RecipeId from RecipeMetadata where diet_glutenfree; Returns: 'Index Scan using idx_recipemetadata_glutenfree on recipemetadata (cost=0.00..135.15 rows=1030 width=16)' ' Index Cond: (diet_glutenfree = true)' And: SELECT RecipeId from RecipeMetadata where NOT diet_glutenfree; Returns: 'Seq Scan on recipemetadata (cost=0.00..214.26 rows=8996 width=16)' ' Filter: (NOT diet_glutenfree)' So, it will definitely use the index when I query for ON values. Just out of curiosity, is there a way to verify the number of rows that are indexed on a partial query? Mike
On 15 Dec 2011, at 5:43, Mike Christensen wrote: > For the boolean column Foo in Table1, if I want to index all values of > TRUE, is this syntax correct? > > CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo; > > The query: > > SELECT * FROM Table1 WHERE Foo; > > should use the index, and: > > SELECT * FROM Table1 WHERE NOT Foo; > > should not, correct? Correct, but... That's not a particularly useful index to create. That index just contains values of true where the associated column equalstrue - you're storing the same information twice. It's generally more useful to index a column with values that you're likely to be interested in for limiting the result setfurther or for sorting or some-such, as long as the operation performed benefits from using an index. From your later example, for instance: SELECT RecipeId from RecipeMetadata where diet_glutenfree; If you plan to use this query in a join, an index like this would be more useful: CREATE INDEX recipemetadata_recipeid_glutenfree_idx ON RecipeMetadata(RecipeId) WHERE diet_glutenfree; That's a bit similar to creating an index on (RecipeId, diet_glutenfree), except that the latter also contains entries thatare not gluten-free of course. Alban Hertroys -- The scale of a problem often equals the size of an ego.
Mike Christensen wrote: > For the boolean column Foo in Table1, if I want to index all values of > TRUE, is this syntax correct? > > CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo; Yes, that is correct. Yours, Laurenz Albe
On 12/15/2011 03:53 PM, Alban Hertroys wrote: > Correct, but... > That's not a particularly useful index to create. That index just contains values of true where the associated column equalstrue - you're storing the same information twice. It could be very handy if you have an extremely high selectivity index (say 1:1000 or more) where you want to keep the index tiny, fast, and very quick to scan. I guess ideally Pg would be able to deduce that the index value is always the same and just store a page list rather than a b-tree, but it's a bit of a tiny use case. -- Craig Ringer
On Thu, Dec 15, 2011 at 6:00 AM, Craig Ringer <ringerc@ringerc.id.au> wrote: > On 12/15/2011 03:53 PM, Alban Hertroys wrote: >> >> Correct, but... >> That's not a particularly useful index to create. That index just contains >> values of true where the associated column equals true - you're storing the >> same information twice. > > It could be very handy if you have an extremely high selectivity index (say > 1:1000 or more) where you want to keep the index tiny, fast, and very quick > to scan. > > I guess ideally Pg would be able to deduce that the index value is always > the same and just store a page list rather than a b-tree, but it's a bit of > a tiny use case. The partial index is definitely a lot smaller. BTW, this table (RecipeMetadata) will only ever be used in a join. I will never query it directly. But I'll query Recipes and join in RecipeMetadata.
Mike Christensen <mike@kitchenpc.com> writes: > BTW, this table (RecipeMetadata) will only ever be used in a join. I > will never query it directly. But I'll query Recipes and join in > RecipeMetadata. In that case possibly you want the join key to be the index payload. regards, tom lane
Craig Ringer wrote: > it's a *bit* of a tiny use case. It certainly is. Jay