Correct syntax to create partial index on a boolean column

Started by Mike Christensenover 14 years ago9 messagesgeneral

Jump to latest

Mike Christensen

mike@kitchenpc.com

over 14 years ago

For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?

CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;

The query:

SELECT * FROM Table1 WHERE Foo;

should use the index, and:

SELECT * FROM Table1 WHERE NOT Foo;

should not, correct?

I just want to make sure I don't need an operator on the WHERE clause. Thanks!

Mike

Mike Christensen

mike@kitchenpc.com

over 14 years ago

In reply to: Mike Christensen (#1)

Re: Correct syntax to create partial index on a boolean column

For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?

CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;

The query:

SELECT * FROM Table1 WHERE Foo;

should use the index, and:

SELECT * FROM Table1 WHERE NOT Foo;

should not, correct?

I just want to make sure I don't need an operator on the WHERE clause. Thanks!

FYI, I've posted this on StackOverflow too in case anyone wants to
score some points..

http://stackoverflow.com/questions/8514923/postgres-is-this-the-right-way-to-create-a-partial-index-on-a-boolean-column

I'm 90% sure this is the right way to do it though.

Mike

Mike Christensen

mike@kitchenpc.com

over 14 years ago

In reply to: Mike Christensen (#2)

Re: Correct syntax to create partial index on a boolean column

On Wed, Dec 14, 2011 at 9:54 PM, Mike Christensen <mike@kitchenpc.com> wrote:

For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?

CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;

The query:

SELECT * FROM Table1 WHERE Foo;

should use the index, and:

SELECT * FROM Table1 WHERE NOT Foo;

should not, correct?

I just want to make sure I don't need an operator on the WHERE clause. Thanks!

FYI, I've posted this on StackOverflow too in case anyone wants to
score some points..

http://stackoverflow.com/questions/8514923/postgres-is-this-the-right-way-to-create-a-partial-index-on-a-boolean-column

I'm 90% sure this is the right way to do it though.

Mike

I've confirmed the index works as expected.

I created 10,000 rows of random data, and set `diet_glutenfree` to
`random() > 0.9` so there's only a 10% chance of an `on` bit.

I then re-created the indexes and tried the query again.

SELECT RecipeId from RecipeMetadata where diet_glutenfree;

Returns:

'Index Scan using idx_recipemetadata_glutenfree on recipemetadata
(cost=0.00..135.15 rows=1030 width=16)'
' Index Cond: (diet_glutenfree = true)'

And:

SELECT RecipeId from RecipeMetadata where NOT diet_glutenfree;

Returns:

'Seq Scan on recipemetadata (cost=0.00..214.26 rows=8996 width=16)'
' Filter: (NOT diet_glutenfree)'

So, it will definitely use the index when I query for ON values.

Just out of curiosity, is there a way to verify the number of rows
that are indexed on a partial query?

Mike

Alban Hertroys

haramrae@gmail.com

over 14 years ago

In reply to: Mike Christensen (#1)

Re: Correct syntax to create partial index on a boolean column

On 15 Dec 2011, at 5:43, Mike Christensen wrote:

For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?

CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;

The query:

SELECT * FROM Table1 WHERE Foo;

should use the index, and:

SELECT * FROM Table1 WHERE NOT Foo;

should not, correct?

Correct, but...
That's not a particularly useful index to create. That index just contains values of true where the associated column equals true - you're storing the same information twice.

It's generally more useful to index a column with values that you're likely to be interested in for limiting the result set further or for sorting or some-such, as long as the operation performed benefits from using an index.

From your later example, for instance:

SELECT RecipeId from RecipeMetadata where diet_glutenfree;

If you plan to use this query in a join, an index like this would be more useful:

CREATE INDEX recipemetadata_recipeid_glutenfree_idx ON RecipeMetadata(RecipeId) WHERE diet_glutenfree;

That's a bit similar to creating an index on (RecipeId, diet_glutenfree), except that the latter also contains entries that are not gluten-free of course.

Alban Hertroys

--
The scale of a problem often equals the size of an ego.

Laurenz Albe

laurenz.albe@cybertec.at

over 14 years ago

In reply to: Mike Christensen (#1)

Re: Correct syntax to create partial index on a boolean column

Mike Christensen wrote:

For the boolean column Foo in Table1, if I want to index all values of
TRUE, is this syntax correct?

CREATE INDEX IDX_MyIndex ON Table1(Foo) WHERE Foo;

Yes, that is correct.

Yours,
Laurenz Albe

Craig Ringer

craig@2ndquadrant.com

over 14 years ago

In reply to: Alban Hertroys (#4)

Re: Correct syntax to create partial index on a boolean column

On 12/15/2011 03:53 PM, Alban Hertroys wrote:

Correct, but...
That's not a particularly useful index to create. That index just contains values of true where the associated column equals true - you're storing the same information twice.

It could be very handy if you have an extremely high selectivity index
(say 1:1000 or more) where you want to keep the index tiny, fast, and
very quick to scan.

I guess ideally Pg would be able to deduce that the index value is
always the same and just store a page list rather than a b-tree, but
it's a bit of a tiny use case.

--
Craig Ringer

Mike Christensen

mike@kitchenpc.com

over 14 years ago

In reply to: Craig Ringer (#6)

Re: Correct syntax to create partial index on a boolean column

On Thu, Dec 15, 2011 at 6:00 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:

On 12/15/2011 03:53 PM, Alban Hertroys wrote:

Correct, but...
That's not a particularly useful index to create. That index just contains
values of true where the associated column equals true - you're storing the
same information twice.

It could be very handy if you have an extremely high selectivity index (say
1:1000 or more) where you want to keep the index tiny, fast, and very quick
to scan.

I guess ideally Pg would be able to deduce that the index value is always
the same and just store a page list rather than a b-tree, but it's a bit of
a tiny use case.

The partial index is definitely a lot smaller.

BTW, this table (RecipeMetadata) will only ever be used in a join. I
will never query it directly. But I'll query Recipes and join in
RecipeMetadata.