Indexing columns with low cardinality: persistent bitmap indexes?
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes... Is that feature planned in next
releases of pg?
Thanks
Bruno Lavoie
Bruno Lavoie escribi�:
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes...
It doesn't.
Is that feature planned in next releases of pg?
There are some efforts to get it done, but don't hold your breath (it
won't be in 8.4 either, as it has major problems currently.)
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote:
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes... Is that feature planned in next
releases of pg?Thanks
Bruno Lavoie
I would try partial indexes, as many as the distinct values.
I'm not sure this would help, though.
--
Fahrbahn ist ein graues Band
weisse Streifen, grüner Rand
On Mon, Jan 12, 2009 at 4:16 PM, Reg Me Please <regmeplease@gmail.com> wrote:
On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote:
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes... Is that feature planned in next
releases of pg?Thanks
Bruno LavoieI would try partial indexes, as many as the distinct values.
I'm not sure this would help, though.
you should create partial indexes only on those values that are a
lower fraction on the table
ie: if you have
value "fraction of the table that has this value"
1 5%
2 3%
3 20%
4 25%
5 47%
then only partial indexes on values 1 and 2 are of some value
--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
On Mon, Jan 12, 2009 at 1:38 PM, Bruno Lavoie <bruno.lavoie@gmail.com> wrote:
Hello,
I have a column with a small number of distinct values, indexing this one
with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the creation of
persistent bitmap indexes... Is that feature planned in next releases of pg?
You have a few options based on your access patterns. If you tend to
access just one of these an get them all at once, then either clusting
on this value, or partitioning your table will help.
If you access your data using these values and other column values at
the same time, then partial or multi-column indexes might help.
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Alvaro Herrera
Sent: Monday, January 12, 2009 12:41 PM
To: Bruno Lavoie
Cc: PostgreSQL
Subject: Re: [GENERAL] Indexing columns with low cardinality:
persistentbitmap indexes?Bruno Lavoie escribió:
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes...It doesn't.
Is that feature planned in next releases of pg?
There are some efforts to get it done, but don't hold your breath (it
won't be in 8.4 either, as it has major problems currently.)
Here is an interesting experiment:
Application of Bitmap Index to Information Retrieval. K. Fujioka, Y. Uematsu, and M. Onizuka. WWW 2008
Source: [ACM]
Synopsis: This paper proposes a hierarchical structure called HS-bitmap index to represent document-term matrix. The authors implemented their data structure on PostgreSQL and observed it to perform better than an inverted index. A short-coming might be that HS-bitmap index takes more space than the inverted index even after compression.
Note this work makes use of PostgreSQL but is unrelated to the on-going work of implementing bitmap index in PostgreSQL.
http://portal.acm.org/citation.cfm?doid=1367497.1367680
Here is the research page where I found the above:
http://www-users.cs.umn.edu/~kewu/annotated.html
Jaime-
Porque no utiliza Bitmap?
*Saludos Cordiales desde EEUU*
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
Date: Mon, 12 Jan 2009 16:20:40 -0500
From: jcasanov@systemguards.com.ec
To: regmeplease@gmail.com
Subject: Re: [GENERAL] Indexing columns with low cardinality: persistent bitmap indexes?
CC: pgsql-general@postgresql.org; bruno.lavoie@gmail.comOn Mon, Jan 12, 2009 at 4:16 PM, Reg Me Please <regmeplease@gmail.com> wrote:
On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote:
Hello,
I have a column with a small number of distinct values, indexing this
one with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of persistent bitmap indexes... Is that feature planned in next
releases of pg?Thanks
Bruno LavoieI would try partial indexes, as many as the distinct values.
I'm not sure this would help, though.you should create partial indexes only on those values that are a
lower fraction on the table
ie: if you havevalue "fraction of the table that has this value"
1 5%
2 3%
3 20%
4 25%
5 47%then only partial indexes on values 1 and 2 are of some value
--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
_________________________________________________________________
Windows Live™: Keep your life in sync.
http://windowslive.com/explore?ocid=TXT_TAGLM_WL_t1_allup_explore_012009
Hello,
I have a column with a small number of distinct values, indexing this
one
with a standard BTree is useless. How do I can index this column
efficiently? I searched and it seems that pg doesn't support the
creation of
persistent bitmap indexes... Is that feature planned in next releases of
pg?
You have a few options based on your access patterns. If you tend to
access just one of these an get them all at once, then either clusting
on this value, or partitioning your table will help.
How will clustering benefit this pattern? Won't a full table scan be
required
regardless of the table being clustered? And I thought the point of
clustering
was the organize the table by some indexed key, requiring fewer seeks and
increasing the likelihood of the pages being in the cache .... if the index
is never used in this case ( low cardinality ) would it still help?
Show quoted text
If you access your data using these values and other column values at
the same time, then partial or multi-column indexes might help.
Import Notes
Resolved by subject fallback