Any "guide to indexes" exists?

Started by Jan Bilekalmost 19 years ago9 messagesgeneral
Jump to latest
#1Jan Bilek
bilekj@gmail.com

Hello,
I would need more info about index types in postgre (btree, hash, gin and
gist) - is there any guide that explains in detail when to use which index
type? These index types have different performance with certain collumn
types and data characteristics store in them. There's not much info about it
pg docs. So, does any document describing detailed index usage or do you
have any personal recomendations when to use which index?

Thanks for you answers.

Regards,
Jan

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Jan Bilek (#1)
Re: Any "guide to indexes" exists?

Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek:

I would need more info about index types in postgre (btree, hash, gin and
gist) - is there any guide that explains in detail when to use which index
type?

You use btree unless you have a specific, particular reason to use one of the
other ones.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#3Jan Bilek
bilekj@gmail.com
In reply to: Jan Bilek (#1)
Re: Any "guide to indexes" exists?

I was asking for these "specific/particular reasons". Im not the database
developer, to be expert on indexes (i know whats btree and hash - gin and
gist are rather mysterious for me).
Ok - btree is fine, but sometimes could be better to use gist - my question
is: when is that "sometimes"? Unless i know how to use the indexes, then
they are useless for me - am i right?

Note: We are using gin with tsearch2 vectors, but here we had no choice -
tsearch2 works only with gin and gist.

Regards,
Jan

Show quoted text

Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek:

I would need more info about index types in postgre (btree, hash, gin and
gist) - is there any guide that explains in detail when to use which
index
type?

You use btree unless you have a specific, particular reason to use one of
the
other ones.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jan Bilek (#1)
Re: Any "guide to indexes" exists?

Hello,
I would need more info about index types in postgre (btree, hash, gin and
gist) - is there any guide that explains in detail when to use which index
type? These index types have different performance with certain collumn
types and data characteristics store in them. There's not much info about it
pg docs. So, does any document describing detailed index usage or do you
have any personal recomendations when to use which index?

http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN-performance-postgresql.org-websearch-part-2.html
http://people.planetpostgresql.org/xzilla/index.php?/archives/278-PostgreSQL-full-text-search-testing.html

Summary:
* fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but slow update)
* others .. btree index (I don't know anybody who use hash index)

Regards
Pavel Stehule

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Pavel Stehule (#4)
Re: Any "guide to indexes" exists?

On May 7, 2007, at 9:44 AM, Pavel Stehule wrote:

Hello,
I would need more info about index types in postgre (btree, hash,
gin and
gist) - is there any guide that explains in detail when to use
which index
type? These index types have different performance with certain
collumn
types and data characteristics store in them. There's not much
info about it
pg docs. So, does any document describing detailed index usage or
do you
have any personal recomendations when to use which index?

http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN-
performance-postgresql.org-websearch-part-2.html
http://people.planetpostgresql.org/xzilla/index.php?/archives/278-
PostgreSQL-full-text-search-testing.html

Summary:
* fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but
slow update)
* others .. btree index (I don't know anybody who use hash index)

GiST can also be useful if you have to query in multiple dimensions,
which can occur outside the normal case of geometry. Best example I
know of is a table containing duration information in the form of
start_time and end_time. Trying to query for what events happened on
5/28/2005 will generally be much cheaper with a GiST index than a b-
tree.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#6Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Jim Nasby (#5)
Re: Any "guide to indexes" exists?

On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote:

GiST can also be useful if you have to query in multiple dimensions,
which can occur outside the normal case of geometry. Best example I
know of is a table containing duration information in the form of
start_time and end_time. Trying to query for what events happened on
5/28/2005 will generally be much cheaper with a GiST index than a b-
tree.

Are you referring to queries with

... where some_timestamp between some_start and some_end ...

or

... where some_timestamp > some_start and some_timestamp < some_end ...

?

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Bilek (#3)
Re: Any "guide to indexes" exists?

"Jan Bilek" <bilekj@gmail.com> writes:

Ok - btree is fine, but sometimes could be better to use gist - my question
is: when is that "sometimes"? Unless i know how to use the indexes, then
they are useless for me - am i right?

gist and gin are for indexing queries that btree is not capable of
dealing with, ie, the WHERE clauses are not simple scalar equality
or range checks. All you need to pay attention to is whether the
index has an operator class that includes the WHERE operator you
want to use.

regards, tom lane

#8Chris Browne
cbbrowne@acm.org
In reply to: Jan Bilek (#1)
Re: Any "guide to indexes" exists?

bilekj@gmail.com ("Jan Bilek") writes:

I was asking for these "specific/particular reasons". Im not the
database developer, to be expert on indexes (i know whats btree and
hash - gin and gist are rather mysterious for me).
Ok - btree is fine, but sometimes could be better to use gist - my
question is: when is that "sometimes"? Unless i know how to use the
indexes, then they are useless for me - am i right?

Note: We are using gin with tsearch2 vectors, but here we had no
choice -
tsearch2 works only with gin and gist.

You use btree unless you have a specific, particular reason to use
one of the
other ones.

Peter's comments are pretty germane.

You use btree if you can. btree is fine for anything where values are
reasonable "scalar," and compare to one another either in a
straightforward scalar fashion, or piece-wise scalar, as with compound
keys where you start by comparing the first component, then, if it's
equal, considering the second, and so on.

You use GIST if you *need to*, if you have a data structure where
values *aren't* scalar, *can't* be compared (e.g. - where you can't
simply say A > B, B > C).
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/wp.html
"Access to a COFF symbol table via ldtbread is even less abstract,
really sucks in general, and should be banned from earth."
-- SCSH 0.5.1 unix.c

#9Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Karsten Hilbert (#6)
Re: Any "guide to indexes" exists?

On May 7, 2007, at 11:01 AM, Karsten Hilbert wrote:

On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote:

GiST can also be useful if you have to query in multiple dimensions,
which can occur outside the normal case of geometry. Best example I
know of is a table containing duration information in the form of
start_time and end_time. Trying to query for what events happened on
5/28/2005 will generally be much cheaper with a GiST index than a b-
tree.

Are you referring to queries with

... where some_timestamp between some_start and some_end ...

or

... where some_timestamp > some_start and some_timestamp <
some_end ...

Yes. :)

Say the table spans 10 years of records. You're looking for a
specific set of records that affect a day 5 years ago. Your where
clause will look something like this:

WHERE start_time < now() - 5 years AND end_time > now() - 5 years 1 day

B-tree indexes will be nearly useless in this case, because each one
is going to match on over half the table. But there are operators
that would let you treat this as a 2 dimensional problem and then use
GiST (or r-tree, but that's no longer recommended). For example, if
you create an index on:

box(point(start_time, start_time), point(end_time, end_time))

Then you just need to query for all rows that overlap the box defined
by:

box(point(now() - 5 years 1 day, now() - 5 years 1 day), point(now()
- 5 years, now() - 5 years))

You'd have to double-check the exact logic and see which overlaps
operator you want, but the basic idea is the same: you're converting
2 one-dimensional timelines (start_time and end_time) into a two-
dimensional timeline, and then using geometric functions to index that.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)