Correct syntax to create partial index on a boolean column
For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?
CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;
The query:
SELECT * FROM Table1 WHERE Foo;
should use the index, and:
SELECT * FROM Table1 WHERE NOT Foo;
should not, correct?
I just want to make sure I don't need an operator on the WHERE clause. Thanks!
Mike
For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;
The query:
SELECT * FROM Table1 WHERE Foo;
should use the index, and:
SELECT * FROM Table1 WHERE NOT Foo;
should not, correct?
I just want to make sure I don't need an operator on the WHERE clause. Thanks!
FYI, I've posted this on StackOverflow too in case anyone wants to
score some points..
I'm 90% sure this is the right way to do it though.
Mike
On Wed, Dec 14, 2011 at 9:54 PM, Mike Christensen <mike@kitchenpc.com> wrote:
For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;
The query:
SELECT * FROM Table1 WHERE Foo;
should use the index, and:
SELECT * FROM Table1 WHERE NOT Foo;
should not, correct?
I just want to make sure I don't need an operator on the WHERE clause. Thanks!
FYI, I've posted this on StackOverflow too in case anyone wants to
score some points..I'm 90% sure this is the right way to do it though.
Mike
I've confirmed the index works as expected.
I created 10,000 rows of random data, and set `diet_glutenfree` to
`random() > 0.9` so there's only a 10% chance of an `on` bit.
I then re-created the indexes and tried the query again.
SELECT RecipeId from RecipeMetadata where diet_glutenfree;
Returns:
'Index Scan using idx_recipemetadata_glutenfree on recipemetadata
(cost=0.00..135.15 rows=1030 width=16)'
' Index Cond: (diet_glutenfree = true)'
And:
SELECT RecipeId from RecipeMetadata where NOT diet_glutenfree;
Returns:
'Seq Scan on recipemetadata (cost=0.00..214.26 rows=8996 width=16)'
' Filter: (NOT diet_glutenfree)'
So, it will definitely use the index when I query for ON values.
Just out of curiosity, is there a way to verify the number of rows
that are indexed on a partial query?
Mike
On 15 Dec 2011, at 5:43, Mike Christensen wrote:
For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;
The query:
SELECT * FROM Table1 WHERE Foo;
should use the index, and:
SELECT * FROM Table1 WHERE NOT Foo;
should not, correct?
Correct, but...
That's not a particularly useful index to create. That index just contains values of true where the associated column equals true - you're storing the same information twice.
It's generally more useful to index a column with values that you're likely to be interested in for limiting the result set further or for sorting or some-such, as long as the operation performed benefits from using an index.
From your later example, for instance:
SELECT RecipeId from RecipeMetadata where diet_glutenfree;
If you plan to use this query in a join, an index like this would be more useful:
CREATE INDEX recipemetadata_recipeid_glutenfree_idx ON RecipeMetadata(RecipeId) WHERE diet_glutenfree;
That's a bit similar to creating an index on (RecipeId, diet_glutenfree), except that the latter also contains entries that are not gluten-free of course.
Alban Hertroys
--
The scale of a problem often equals the size of an ego.
Mike Christensen wrote:
For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;
Yes, that is correct.
Yours,
Laurenz Albe
On 12/15/2011 03:53 PM, Alban Hertroys wrote:
Correct, but...
That's not a particularly useful index to create. That index just contains values of true where the associated column equals true - you're storing the same information twice.
It could be very handy if you have an extremely high selectivity index
(say 1:1000 or more) where you want to keep the index tiny, fast, and
very quick to scan.
I guess ideally Pg would be able to deduce that the index value is
always the same and just store a page list rather than a b-tree, but
it's a bit of a tiny use case.
--
Craig Ringer
On Thu, Dec 15, 2011 at 6:00 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
On 12/15/2011 03:53 PM, Alban Hertroys wrote:
Correct, but...
That's not a particularly useful index to create. That index just contains
values of true where the associated column equals true - you're storing the
same information twice.It could be very handy if you have an extremely high selectivity index (say
1:1000 or more) where you want to keep the index tiny, fast, and very quick
to scan.I guess ideally Pg would be able to deduce that the index value is always
the same and just store a page list rather than a b-tree, but it's a bit of
a tiny use case.
The partial index is definitely a lot smaller.
BTW, this table (RecipeMetadata) will only ever be used in a join. I
will never query it directly. But I'll query Recipes and join in
RecipeMetadata.
Mike Christensen <mike@kitchenpc.com> writes:
BTW, this table (RecipeMetadata) will only ever be used in a join. I
will never query it directly. But I'll query Recipes and join in
RecipeMetadata.
In that case possibly you want the join key to be the index payload.
regards, tom lane