Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

Started by Pavan Deolaseeabout 13 years ago4 messages
#1Pavan Deolasee
pavan.deolasee@gmail.com

On Wed, Oct 31, 2012 at 11:41 AM, Amit Kapila <amit.kapila@huawei.com>wrote:

According to me, the problem happens at Step-4. As at Step-4, it does the
HOT update due to which validate_index() is not able to put an entry for
C2=5

Thanks for the report. I can reproduce this issue. As you rightly pointed
out, step-4 should not have been a HOT update because that would create
broken HOT chains. We used to guard against this by consulting
not-yet-ready or not-yet-valid indexes while deciding whether to do HOT
update or not.

ISTM that the following commit broke things:

commit 8cb53654dbdb4c386369eb988062d0bbb6de725e
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Fri Apr 6 10:21:40 2012 +0100

Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock

There is a comment in src/backend/commands/indexcmds.c:694

/*
* At this moment we are sure that there are no transactions with the
* table open for write that don't have this new index in their list of
* indexes. We have waited out all the existing transactions and any
new
* transaction will have the new index in its list, but the index is
still
* marked as "not-ready-for-inserts". The index is consulted while
* deciding HOT-safety though. This arrangement ensures that no new HOT
* chains can be created where the new tuple and the old tuple in the
* chain have different index keys.
*/

So we mark the index not-ready-for-inserts, but still consult it while
deciding whether an UPDATE could be HOT or not.

Somewhere else in heapam.c:2730, we do the following to get information
about index columns for all ready/not-yet-ready indexes:

hot_attrs = RelationGetIndexAttrBitmap(relation);

This internally calls RelationGetIndexList() to get the list of indexes on
the relation.

The offending commit made certain changes to this function and we stopped
returning indexes which has indisready and indisvalid set to false. This
has caused the said regression. Since there are a bunch of callers to this
function, I wouldn't be surprised if we see other regressions because of
the same change.

diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index a59950e..9cadb3f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3355,6 +3355,12 @@ RelationGetIndexList(Relation relation)
                oidvector  *indclass;
                bool            isnull;
+               /*
+                * Ignore any indexes that are currently being dropped
+                */
+               if (!index->indisvalid && !index->indisready)
+                       continue;
+
                /* Add index's OID to result list in the proper order */
                result = insert_ordered_oid(result, index->indexrelid);

Thanks,
Pavan

#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Pavan Deolasee (#1)

On 31 October 2012 08:59, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:

On Wed, Oct 31, 2012 at 11:41 AM, Amit Kapila <amit.kapila@huawei.com>
wrote:

According to me, the problem happens at Step-4. As at Step-4, it does the
HOT update due to which validate_index() is not able to put an entry for
C2=5

Thanks for the report. I can reproduce this issue. As you rightly pointed
out, step-4 should not have been a HOT update because that would create
broken HOT chains. We used to guard against this by consulting not-yet-ready
or not-yet-valid indexes while deciding whether to do HOT update or not.

ISTM that the following commit broke things:

commit 8cb53654dbdb4c386369eb988062d0bbb6de725e
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Fri Apr 6 10:21:40 2012 +0100

Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock

There is a comment in src/backend/commands/indexcmds.c:694

/*
* At this moment we are sure that there are no transactions with the
* table open for write that don't have this new index in their list of
* indexes. We have waited out all the existing transactions and any
new
* transaction will have the new index in its list, but the index is
still
* marked as "not-ready-for-inserts". The index is consulted while
* deciding HOT-safety though. This arrangement ensures that no new HOT
* chains can be created where the new tuple and the old tuple in the
* chain have different index keys.
*/

So we mark the index not-ready-for-inserts, but still consult it while
deciding whether an UPDATE could be HOT or not.

Somewhere else in heapam.c:2730, we do the following to get information
about index columns for all ready/not-yet-ready indexes:

hot_attrs = RelationGetIndexAttrBitmap(relation);

This internally calls RelationGetIndexList() to get the list of indexes on
the relation.

The offending commit made certain changes to this function and we stopped
returning indexes which has indisready and indisvalid set to false. This has
caused the said regression. Since there are a bunch of callers to this
function, I wouldn't be surprised if we see other regressions because of the
same change.

diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index a59950e..9cadb3f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3355,6 +3355,12 @@ RelationGetIndexList(Relation relation)
oidvector  *indclass;
bool            isnull;
+               /*
+                * Ignore any indexes that are currently being dropped
+                */
+               if (!index->indisvalid && !index->indisready)
+                       continue;
+
/* Add index's OID to result list in the proper order */
result = insert_ordered_oid(result, index->indexrelid);

Agreed, will fix.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#3Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Simon Riggs (#2)

On Wed, Oct 31, 2012 at 2:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index a59950e..9cadb3f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3355,6 +3355,12 @@ RelationGetIndexList(Relation relation)
oidvector  *indclass;
bool            isnull;
+               /*
+                * Ignore any indexes that are currently being dropped
+                */
+               if (!index->indisvalid && !index->indisready)
+                       continue;
+
/* Add index's OID to result list in the proper order */
result = insert_ordered_oid(result, index->indexrelid);

Agreed, will fix.

Thanks Simon.

Looks like a severe data corruption issue. Is there a minor release planned
in near future or would this need one ?

Please let me know if you need any help with this. I investigated this a
bit before posting my analysis (just to ensure that its not a HOT's bug),
but since it involved DROP INDEX CONCURRENTLY, thought it will best be
addressed by you.

Thanks,
Pavan

#4Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Simon Riggs (#2)

On Wed, Oct 31, 2012 at 2:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index a59950e..9cadb3f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3355,6 +3355,12 @@ RelationGetIndexList(Relation relation)
oidvector  *indclass;
bool            isnull;
+               /*
+                * Ignore any indexes that are currently being dropped
+                */
+               if (!index->indisvalid && !index->indisready)
+                       continue;
+
/* Add index's OID to result list in the proper order */
result = insert_ordered_oid(result, index->indexrelid);

Agreed, will fix.

Simon,

I hope this hasn't fallen through the cracks.

Thanks,
Pavan