plans for bitmap indexes?
Hi,
I'd like to know if there are any plans on introducing bitmap indexes
into postgresql. I think this could mean a big performance improvement
especially for datawarehousing applications. I know that there is an
index type hash but I don't know how both types are comparable due to
they are both best usable for equality expressions.
Thanks in advance!
Regards,
Yann
Yann Michel schrieb:
I'd like to know if there are any plans on introducing bitmap indexes
into postgresql. I think this could mean a big performance improvement
especially for datawarehousing applications. I know that there is an
index type hash but I don't know how both types are comparable due to
they are both best usable for equality expressions.
Sure, and the next will come with musicbrainz.
Why not :)
If you don't want to code that in the app, the database backend is the
solution. The database is the golden hammer.
http://c2.com/cgi/wiki?GoldenHammer
--
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
Yann Michel wrote:
Hi,
I'd like to know if there are any plans on introducing bitmap indexes
into postgresql. I think this could mean a big performance improvement
especially for datawarehousing applications. I know that there is an
index type hash but I don't know how both types are comparable due to
they are both best usable for equality expressions.
Lots of people have talked about it but I don't know anyone coding it.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Hi,
On Thu, Oct 07, 2004 at 06:54:15PM -0400, Bruce Momjian wrote:
I'd like to know if there are any plans on introducing bitmap indexes
into postgresql. I think this could mean a big performance improvement
especially for datawarehousing applications. I know that there is an
index type hash but I don't know how both types are comparable due to
they are both best usable for equality expressions.Lots of people have talked about it but I don't know anyone coding it.
have you ever discussed if bitmap indexes lead to better query
performance than hash indexes will do?
Regards,
Yann
Hi,
On Thu, Oct 07, 2004 at 10:41:04PM +0200, Reini Urban wrote:
I'd like to know if there are any plans on introducing bitmap indexes
into postgresql. I think this could mean a big performance improvement
especially for datawarehousing applications. I know that there is an
index type hash but I don't know how both types are comparable due to
they are both best usable for equality expressions.Sure, and the next will come with musicbrainz.
Why not :)
I don't know what this means to my questin, but anyway...
If you don't want to code that in the app, the database backend is the
solution. The database is the golden hammer.
Therefore I asked wheather there are any thoughts of implementing them.
Regards,
Yann
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Yann Michel
Sent: 08 October 2004 08:28
To: Reini Urban
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] plans for bitmap indexes?I don't know what this means to my questin, but anyway...
If you don't want to code that in the app, the database
backend is the
solution. The database is the golden hammer.
Therefore I asked wheather there are any thoughts of
implementing them.
I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?
Regards, Dave.
Import Notes
Resolved by subject fallback
Hi,
On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?
due to I'm developing a datawarehousing application we have lots of
fact-data in our central fact-table. As I know how to improve
performance on Oracle based datawarehouses, I'm used to add bitmap
indexes for atributes having only a few distinct values.
So I was looking for any comparable indexing technology but didn't find
any so far.
Regards,
Yann
On R, 2004-10-08 at 12:45, Yann Michel wrote:
Hi,
On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?due to I'm developing a datawarehousing application we have lots of
fact-data in our central fact-table. As I know how to improve
performance on Oracle based datawarehouses, I'm used to add bitmap
indexes for atributes having only a few distinct values.
So I was looking for any comparable indexing technology but didn't find
any so far.
There is currently no suitable index type for this type of queries (huge
tables with a few distinct values).
You may try to optimise performance by partitioning your fact tables on
these few dimension values by using table inheritance or union all
views.
There was a discussion on partitioning postgres tables on
pgsql-performance list a few days ago.
-------------
Hannu
On Fri, 8 Oct 2004, Yann Michel wrote:
Hi,
On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?due to I'm developing a datawarehousing application we have lots of
fact-data in our central fact-table. As I know how to improve
performance on Oracle based datawarehouses, I'm used to add bitmap
indexes for atributes having only a few distinct values.
So I was looking for any comparable indexing technology but didn't find
any so far.
Yann, you may try our contrib/intarray module
Regards,
Yann---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Yann,
Lots of people have talked about it but I don't know anyone coding it.
I would love to have bitmap indexes in Postgres, as would a lot of other
community members. However, they are far from trivial to code. Are you
offering to help?
--
Josh Berkus
Aglio Database Solutions
San Francisco
Import Notes
Reply to msg id not found: 20041008132832.1F3CA32A7A3@svr1.postgresql.orgReference msg id not found: 20041008132832.1F3CA32A7A3@svr1.postgresql.org | Resolved by subject fallback
Hi Josh,
On Fri, Oct 08, 2004 at 09:59:41AM -0700, Josh Berkus wrote:
Lots of people have talked about it but I don't know anyone coding it.
I would love to have bitmap indexes in Postgres, as would a lot of other
community members. However, they are far from trivial to code. Are you
offering to help?
I'd like to help you, but I think, that my C-Experience is not good
enough for beeing able to. I mean, I coded some C-stuff and I know how
bitmap indexes (should) work but I guess that this won't be enough. In
addidtion I never took a look into postgresql's sources.
Regards,
Yann
Yann,
I'd like to help you, but I think, that my C-Experience is not good
enough for beeing able to. I mean, I coded some C-stuff and I know how
bitmap indexes (should) work but I guess that this won't be enough. In
addidtion I never took a look into postgresql's sources.
Well, there's no time like the present to get started! ;-)
--
Josh Berkus
Aglio Database Solutions
San Francisco
Hi Josh,
On Fri, Oct 08, 2004 at 10:18:27AM -0700, Josh Berkus wrote:
I'd like to help you, but I think, that my C-Experience is not good
enough for beeing able to. I mean, I coded some C-stuff and I know how
bitmap indexes (should) work but I guess that this won't be enough. In
addidtion I never took a look into postgresql's sources.Well, there's no time like the present to get started! ;-)
O.K. I downloaded it :-)
We will see if and how I can help....
Regards,
Yann
"Yann" == Yann Michel <yann-postgresql@spline.de> writes:
Yann> O.K. I downloaded it :-) We will see if and how I can
Yann> help....
FYI .. in case you aren't aware already:
http://portal.acm.org/citation.cfm?id=98720
--
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh
yann-postgresql@spline.de (Yann Michel) writes:
On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?due to I'm developing a datawarehousing application we have lots of
fact-data in our central fact-table. As I know how to improve
performance on Oracle based datawarehouses, I'm used to add bitmap
indexes for atributes having only a few distinct values. So I was
looking for any comparable indexing technology but didn't find any
so far.
The most nearly comparable thing is be the notion of "partial
indexes," where, supposing you had 60 region codes (e.g. - 50 US
states, 10 Canadian provinces), you might set up indices thus:
TABLE=my_table
FIELD=stateprov
FILE=$HOME/regionlist.txt
for region in `cat $FILE`; do
query="create index ${TABLE}_partial_on_${region} on $TABLE($FIELD) where $FIELD = '$region';"
echo $query | psql -d datawarehouse
done
That would set up 60 (or whatever $HOME/regionlist.txt indicated)
partial indices on the table on that field.
By the way, I thought ahead a little, in that; doing the same thing
for country codes might be as easy as replacing:
FIELD=country
FILE=$HOME/countrylist.txt
The partial indexes will not ALWAYS be useful; in cases where they
aren't, it is not inconceivable that there are improvements to be made
in the query optimizer...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.
josh@agliodbs.com (Josh Berkus) writes:
Lots of people have talked about it but I don't know anyone coding it.
I would love to have bitmap indexes in Postgres, as would a lot of other
community members. However, they are far from trivial to code. Are you
offering to help?
I'm curious as to whether partial indexes might suffice as an
alternative.
There are doubtless cases where the optimizer won't use them where it
would be plausible to do so; that suggests, to me, possibilities for
enhancing the optimizer.
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.
Import Notes
Reference msg id not found: 20041008132832.1F3CA32A7A3@svr1.postgresql.org
Chris,
The most nearly comparable thing is be the notion of "partial
indexes," where, supposing you had 60 region codes (e.g. - 50 US
states, 10 Canadian provinces), you might set up indices thus:
I'm afraid that you're mistaken about the functionality of bitmap indexes.
The purpose of a bitmap index is not to partition an index, but to allow
multiple indexes to be used in the same operation.
For example, imagine you have a table on a dating website with 18 columns
representing 18 different characteristics for matching. Imagine that you
index each of those columns seperately. If you do:
SELECT * FROM people WHERE orientation = 'gay' AND gender = 'male' AND city =
'San Francisco';
... then the planner can use an index on orientation OR on gender OR on city,
but not all three. Multicolumn indexes are no solution for this use case
because you'd have to create a multicolumn index for each possible combo of
two or three columns ( 18! ).
The Bitmap index allows the query executor to use several indexes on the same
operation, comparing them and selecting rows where they "overlap" like a Venn
diagram.
...
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
Import Notes
Reply to msg id not found: 20041012192756.20B6332AE6F@svr1.postgresql.orgReference msg id not found: 20041012192756.20B6332AE6F@svr1.postgresql.org | Resolved by subject fallback
Josh Berkus <josh@agliodbs.com> writes:
SELECT * FROM people WHERE orientation = 'gay' AND gender = 'male' AND city =
'San Francisco';
There are actually two TODOs here.
1) a "bitmap scan" that would be usable with any type of index. The tuple
locations can be read in for each criteria and sorted by location and built
into bitmaps. The results can be combined using bitmap operations and the
tuples scanned in physical order.
2) A persistent bitmap index that would enable skipping the first step of the
above.
In the case if all the columns were btree indexes it might still make sense to
scan through all the indexes and combine the results before reading in the
actual tuples. Especially if the tuples are very wide and each column
individually very unselective, but the combination very selective.
However it would work even better if gender and orientation could be stored on
disk in a bitmap representation. They're very low cardinality and could be
stored quite compactly. The result would read the data faster, skip the sort,
and be able to start returning tuples even before it finished reading the
entire index.
--
greg
Josh Berkus <josh@agliodbs.com> writes:
The Bitmap index allows the query executor to use several indexes on
the same operation, comparing them and selecting rows where they
"overlap" like a Venn diagram.
Note that what Josh is describing is not really a distinct index type,
but a different way of using an index: that is, you pull candidate tuple
locations from several indexes and intersect or union those sets before
you go to the heap. In principle this works whatever the index access
methods are.
I believe that the term "bitmap index" is also used with a different
meaning wherein it actually does describe a particular kind of on-disk
index structure, with one bit per table row.
IMHO building in-memory bitmaps (the first idea) is a very good idea to
pursue for Postgres. I'm not at all sold on on-disk bitmap indexes,
though ... those I suspect *are* sufficiently replaced by partial
indexes.
regards, tom lane
Josh Berkus wrote:
Chris,
The most nearly comparable thing is be the notion of "partial
indexes," where, supposing you had 60 region codes (e.g. - 50 US
states, 10 Canadian provinces), you might set up indices thus:I'm afraid that you're mistaken about the functionality of bitmap indexes.
The purpose of a bitmap index is not to partition an index, but to allow
multiple indexes to be used in the same operation.For example, imagine you have a table on a dating website with 18 columns
representing 18 different characteristics for matching. Imagine that you
index each of those columns seperately. If you do:SELECT * FROM people WHERE orientation = 'gay' AND gender = 'male' AND city =
'San Francisco';
... then the planner can use an index on orientation OR on gender OR on city,
but not all three. Multicolumn indexes are no solution for this use case
because you'd have to create a multicolumn index for each possible combo of
two or three columns ( 18! ).
I'm wondering if in some cases the following query could be more efficient
select * from people where orientation = 'gay'
intersect
select * from people where gender = 'male'
intersect
select * from people where city =
Regards
Gaetano Mendola