Bitmap Indexes: request for feedback
Hi everybody,
me and Gabriele Bartolini have been working on Bitmap Indexes (BMI) in
the last weeks, with advice and guidance from Simon Riggs. We feel
that we are about to approach the point where it is appropriate to ask
for feedback from this list.
Thank you,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.ciolli@2ndquadrant.it | www.2ndquadrant.it
---8<------8<------8<------8<------8<------8<------8<------8<------8<---
First of all, let us describe what we have done so far, what we found
and how we think to proceed now.
== 1. Bringing the patch up to date ==
We started from Gavin Sherry's patch dating May 2007
http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php
As far as we could see, there has been no further activity on this
subject.
First, we brought the patch up to date. The latest version of the
patch was anterior to the new Index Access Method Interface
http://developer.postgresql.org/pgdocs/postgres/indexam.html
so we adapted the patch to that interface.
Then we added a few BMI page inspection functions to the pageinspect
contrib module, and we used them to examine the code. In addition to
finding and fixing a minor bug, we diagnosed an effect of HOT tuples
on the BMI patch, described below in greater detail. This also helped
us to produce extended descriptive documentation of how these indexes
work, and suggested us how to produce some more tests to verify that
(a newer version of) the BMI patch works; we are going to add some
regression tests especially targeted to HOT tuples.
After message
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00855.php
maybe it is appropriate to mention that backwards scan would not be
supported at all by BMI indexes.
== 2. The effect of HOT tuples on BMI creation ==
The latest BMI patch mentioned above was also prior to the
introduction of HOT tuples.
Some parts of that patch rely on the assumption that
IndexBuildHeapScan scans tuples in increasing TID order. It is easy to
verify that this property is no longer valid after the introduction of
HOT tuples; however, a similar but weaker property still holds (the
scan is always done in non-decreasing block order).
This breaks some low-level bitmap vector build routines, which have to
be rewritten from scratch because they expect TIDs to came in
increasing order; but it does not harm block-level locking used in
that patch.
== 3. What we would do after ==
We understand that BMI development was suspended because of lack of
time from the last developer, during the improvement of the VACUUM
phase. The main obstacle was that the physical size of a compressed
bitmap vector can either grow or shrink, possibly creating new BMV
pages, which can mean bad performance.
The current VACUUM algorithm is unfinished; we are going to examine
it, looking for some improvements, and to measure the current status
with some ad-hoc benchmarks.
== 4. Timeline ==
Up to now, we spent many days to isolate, describe and partially fix
the incompatibilies described above; now we feel that points 1. and
2. can be cleared in a couple of days, bringing the patch up to date
with current HEAD.
As for the remaining part, we expect to finish the patch before the
deadline for the latest CommitFest.
We will re-post the patch as soon as the HOT tuples will be working;
then we will post a new version the patch when also VACUUM will be
done.
Does anybody have any comments and/or additional requests?
Gianni,
me and Gabriele Bartolini have been working on Bitmap Indexes (BMI) in
the last weeks, with advice and guidance from Simon Riggs. We feel
that we are about to approach the point where it is appropriate to ask
for feedback from this list.
The other major issue with the Bitmap index patch as it stood in 2007 was
that performance just wasn't that much faster than a btree, except for
specific corner cases. Otherwise, someone else would have been interested
enough to pick it up and finish it.
So performance testing of the patch is absolutely essential.
--
--Josh
Josh Berkus
PostgreSQL
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
Gianni,
me and Gabriele Bartolini have been working on Bitmap Indexes (BMI) in
the last weeks, with advice and guidance from Simon Riggs. We feel
that we are about to approach the point where it is appropriate to ask
for feedback from this list.The other major issue with the Bitmap index patch as it stood in 2007 was
that performance just wasn't that much faster than a btree, except for
specific corner cases. Otherwise, someone else would have been interested
enough to pick it up and finish it.
Actually as I recall the immediate issue was that the patch was more complex
than necessary. In particular it reimplemented parts of the executor
internally rather than figuring out what api was necessary to integrate it
fully into the executor.
When we last left our heros they were proposing ways to refactor the index api
to allow index ams to stream results to the executor in bitmap form. That
would allow a scan of a bitmap index to return bitmap elements wholesale and
have the executor apply bitmap operations to them along with the elements
returned by a btree bitmap scan or other index ams.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
Gregory Stark <stark@enterprisedb.com> writes:
Josh Berkus <josh@agliodbs.com> writes:
The other major issue with the Bitmap index patch as it stood in 2007 was
that performance just wasn't that much faster than a btree, except for
specific corner cases. Otherwise, someone else would have been interested
enough to pick it up and finish it.
Actually as I recall the immediate issue was that the patch was more complex
than necessary.
Well, yeah, but if the performance isn't there then who's going to spend
time refactoring the code?
regards, tom lane
On Wed, 2008-10-22 at 00:00 +0100, Gregory Stark wrote:
Actually as I recall the immediate issue was that the patch was more complex
than necessary. In particular it reimplemented parts of the executor
internally rather than figuring out what api was necessary to integrate it
fully into the executor.When we last left our heros they were proposing ways to refactor the index api
to allow index ams to stream results to the executor in bitmap form. That
would allow a scan of a bitmap index to return bitmap elements wholesale and
have the executor apply bitmap operations to them along with the elements
returned by a btree bitmap scan or other index ams.
The indexAM API has now been changed, so that is a simple matter now.
amgetbitmap() was committed on 10 April 2008.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Tue, 2008-10-21 at 14:26 -0700, Josh Berkus wrote:
Gianni,
me and Gabriele Bartolini have been working on Bitmap Indexes (BMI) in
the last weeks, with advice and guidance from Simon Riggs. We feel
that we are about to approach the point where it is appropriate to ask
for feedback from this list.The other major issue with the Bitmap index patch as it stood in 2007 was
that performance just wasn't that much faster than a btree, except for
specific corner cases. Otherwise, someone else would have been interested
enough to pick it up and finish it.
That seems a strange comment - are you thinking of hash indexes? Do you
have references to these poor performance tests?
BMIs will be a win for indexes with high numbers of duplicate keys -
where btrees don't operate very well. OTOH I expect btrees to continue
to be very useful in places where many keys are unique or nearly unique.
Index creation was much faster and produced much smaller indexes than
btrees, when used in the right situations. Better use of memory and
reduction of I/O alone are important factors. Significantly smaller
indexes also allow more indexes to be built on a table, leading to
overall gains in performance.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Josh Berkus wrote:
The other major issue with the Bitmap index patch as it stood in 2007 was
that performance just wasn't that much faster than a btree, except for
specific corner cases. Otherwise, someone else would have been interested
enough to pick it up and finish it.So performance testing of the patch is absolutely essential.
As Simon mentioned - index creation time and size was certainly improved
considerably.
There were certainly cases when row retrieval performance was not
improved much (as compared to using a comparable btree index) - my
analysis was that these were typically when heap fetch time dominated
index scan time i.e it didn't matter how good your index access was, you
were mired in heap seeks. ISTM that this situation will change
dramatically when index only access (via dead space map? or similar)
arrives.
Note that even if only for the on disk size savings, these are worth
having for data warehousing situations.
regards
Mark
Simon Riggs <simon@2ndQuadrant.com> writes:
On Wed, 2008-10-22 at 00:00 +0100, Gregory Stark wrote:
When we last left our heros they were proposing ways to refactor the index api
to allow index ams to stream results to the executor in bitmap form.
The indexAM API has now been changed, so that is a simple matter now.
No, that was merely one component of the problem. The APIs for
tidbitmaps need revision too. You can't "stream" a bitmap yet.
regards, tom lane
On Wed, 2008-10-22 at 08:31 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
On Wed, 2008-10-22 at 00:00 +0100, Gregory Stark wrote:
When we last left our heros they were proposing ways to refactor the index api
to allow index ams to stream results to the executor in bitmap form.The indexAM API has now been changed, so that is a simple matter now.
No, that was merely one component of the problem. The APIs for
tidbitmaps need revision too. You can't "stream" a bitmap yet.
Please explain further.
Which calls? Why do we need to stream them?
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndQuadrant.com> writes:
On Wed, 2008-10-22 at 08:31 -0400, Tom Lane wrote:
No, that was merely one component of the problem. The APIs for
tidbitmaps need revision too. You can't "stream" a bitmap yet.
Please explain further.
Which calls? Why do we need to stream them?
Well, I guess we don't absolutely *have* to --- we could insist that a
bitmap scan on a bitmap index proceed by first sucking the whole bitmap
into memory, the same as is done for other index types. It seems pretty
silly though, especially for the expected use-case of low cardinality;
the bitmaps would get big.
The idea that was being kicked around was to make it possible for
TIDBitmap to be an alias representing an indexscan in progress. So
you'd pull an index page or so's worth of TIDs from the index, hand
them back to nodeBitmapHeapscan.c to read those tuples, repeat till
done. With judicious use of a few "method" function pointers,
nodeBitmapHeapscan wouldn't need to know whether it was doing this or
using an in-memory bitmap. ANDing and ORing of bitmaps could be made
to work streamably too.
As Greg said, the major complaint against the original patch was that
they'd made it do this (interleave the heap and index access) by hack
slash and burn instead of extending the existing APIs appropriately.
regards, tom lane
On Wed, 2008-10-22 at 09:13 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
On Wed, 2008-10-22 at 08:31 -0400, Tom Lane wrote:
No, that was merely one component of the problem. The APIs for
tidbitmaps need revision too. You can't "stream" a bitmap yet.Please explain further.
Which calls? Why do we need to stream them?Well, I guess we don't absolutely *have* to --- we could insist that a
bitmap scan on a bitmap index proceed by first sucking the whole bitmap
into memory, the same as is done for other index types. It seems pretty
silly though, especially for the expected use-case of low cardinality;
the bitmaps would get big.
OK, now I understand.
Realistically, introducing a new index type is non-trivial and getting
the index working at all will be quite some feat, considering the
requirements of HOT and Vacuum.
I agree we could do the stream bitmap as well, but I'd suggest we should
save that until the rest of the patch has been checked out. GIN wasn't
all written in one release, and I think it likely that it may be another
couple of releases before we do all the tuning on BMIs.
Please let's not set the bar too high for the first implementation,
especially before we have test results that indicate where our tuning
should be focused.
The idea that was being kicked around was to make it possible for
TIDBitmap to be an alias representing an indexscan in progress. So
you'd pull an index page or so's worth of TIDs from the index, hand
them back to nodeBitmapHeapscan.c to read those tuples, repeat till
done. With judicious use of a few "method" function pointers,
nodeBitmapHeapscan wouldn't need to know whether it was doing this or
using an in-memory bitmap. ANDing and ORing of bitmaps could be made
to work streamably too.
Yes, completely agree that would be the better way.
As Greg said, the major complaint against the original patch was that
they'd made it do this (interleave the heap and index access) by hack
slash and burn instead of extending the existing APIs appropriately.
Yeh, I never liked that original approach.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndQuadrant.com> writes:
I agree we could do the stream bitmap as well, but I'd suggest we should
save that until the rest of the patch has been checked out.
Well, that would be a reasonable implementation path, but it's got
nothing to do with the patch as presented. The concern was getting
rid of some massive and ugly hacks in the executor ...
regards, tom lane
On Wed, 2008-10-22 at 10:00 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
I agree we could do the stream bitmap as well, but I'd suggest we should
save that until the rest of the patch has been checked out.Well, that would be a reasonable implementation path, but it's got
nothing to do with the patch as presented. The concern was getting
rid of some massive and ugly hacks in the executor ...
Agreed, no ugly hacks in executor allowed.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Hi to everybody,
following the useful feedback that we received from this list, we
would like to submit the patch for Bitmap Indexes for the november
CommitFest (joint work of me with Gabriele Bartolini, starting from
Gavin Sherry's patch).
There are two open issues (a bug that we are catching and something
missing in the catalog), but we think that it is worth to submit the
patch, for two reasons: because we are confident to fix the open
issues soon, and moreover because bitmap index seem to have their
non-trivial use cases (index creation is significantly faster than in
other types, and they occupy far less disk space, while queries are
only slightly faster).
What we have done:
* We refactored Gavin Sherry's patch
http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php
using the new access method interface; in particular, we eliminate
a large part of the code, which has been obsoleted by the new
access method API.
Now the interaction of the patch with the existing code is
significantly smaller, and essentially limited to the catalog
entries and the access method API.
* We fixed some wrong behaviours due to the fact that the patch
relies on some assumptions that are no longer true because of the
evolution of PostgreSQL (HOT tuples break the assumption that
during index scan tuples are scanned in TID increasing order).
* We added a chapter to the manual, plus some references to bitmap
index in the remaining test;
* In a separate archive we enclose some performance tests (time and
disk size), which show that bitmap index behave slightly better
than btree in terms of speed, especially in certain cases specific
of the index type (low-cardinality); the main advantage (according
to some feedback that we received from this list) is that bitmap
indexes are significantly smaller than any other type.
Current issues:
* Our workaround for HOT tuples has still one bug; we are currently
working on it and we expect to fix it soon. This bug can be
reproduced by looking at the "rows" column of the performance test.
* Two regression tests are failing (oidjoins and opr_sanity), due to
incomplete catalog entries. We expect to fix it very soon.
Now I am going to add the patch to the wiki page.
Best regards,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.ciolli@2ndquadrant.it | www.2ndquadrant.it