REINDEX checking of index constraints

Started by Noah Mischover 12 years ago7 messages
#1Noah Misch
noah@leadboat.com
1 attachment(s)

Historically, REINDEX would always revalidate any uniqueness enforced by the
index. An EDB customer reported that this is not happening, and indeed I
broke it way back in commit 8ceb24568054232696dddc1166a8563bc78c900a.
Specifically, REINDEX TABLE and REINDEX DATABASE no longer revalidate
constraints, but REINDEX INDEX still does so. As a consequence, REINDEX INDEX
is the only form of REINDEX that fixes a failed CREATE INDEX CONCURRENTLY.

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

Attachments:

reindex-constraints-v1.patchtext/plain; charset=us-asciiDownload
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
***************
*** 1768,1774 **** ReindexTable(RangeVar *relation)
  	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
  									   RangeVarCallbackOwnsTable, NULL);
  
! 	if (!reindex_relation(heapOid, REINDEX_REL_PROCESS_TOAST))
  		ereport(NOTICE,
  				(errmsg("table \"%s\" has no indexes",
  						relation->relname)));
--- 1768,1776 ----
  	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
  									   RangeVarCallbackOwnsTable, NULL);
  
! 	if (!reindex_relation(heapOid,
! 						  REINDEX_REL_PROCESS_TOAST |
! 						  REINDEX_REL_CHECK_CONSTRAINTS))
  		ereport(NOTICE,
  				(errmsg("table \"%s\" has no indexes",
  						relation->relname)));
***************
*** 1884,1890 **** ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
  		StartTransactionCommand();
  		/* functions in indexes may want a snapshot set */
  		PushActiveSnapshot(GetTransactionSnapshot());
! 		if (reindex_relation(relid, REINDEX_REL_PROCESS_TOAST))
  			ereport(NOTICE,
  					(errmsg("table \"%s.%s\" was reindexed",
  							get_namespace_name(get_rel_namespace(relid)),
--- 1886,1894 ----
  		StartTransactionCommand();
  		/* functions in indexes may want a snapshot set */
  		PushActiveSnapshot(GetTransactionSnapshot());
! 		if (reindex_relation(relid,
! 							 REINDEX_REL_PROCESS_TOAST |
! 							 REINDEX_REL_CHECK_CONSTRAINTS))
  			ereport(NOTICE,
  					(errmsg("table \"%s.%s\" was reindexed",
  							get_namespace_name(get_rel_namespace(relid)),
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
***************
*** 2298,2306 **** COMMIT;
  BEGIN;
  CREATE INDEX std_index on concur_heap(f2);
  COMMIT;
! -- check to make sure that the failed indexes were cleaned up properly and the
! -- successful indexes are created properly. Notably that they do NOT have the
! -- "invalid" flag set.
  \d concur_heap
  Table "public.concur_heap"
   Column | Type | Modifiers 
--- 2298,2310 ----
  BEGIN;
  CREATE INDEX std_index on concur_heap(f2);
  COMMIT;
! -- Failed builds are left invalid by VACUUM FULL, fixed by REINDEX
! VACUUM FULL concur_heap;
! REINDEX TABLE concur_heap;
! ERROR:  could not create unique index "concur_index3"
! DETAIL:  Key (f2)=(b) is duplicated.
! DELETE FROM concur_heap WHERE f1 = 'b';
! VACUUM FULL concur_heap;
  \d concur_heap
  Table "public.concur_heap"
   Column | Type | Modifiers 
***************
*** 2316,2321 **** Indexes:
--- 2320,2341 ----
      "concur_index5" btree (f2) WHERE f1 = 'x'::text
      "std_index" btree (f2)
  
+ REINDEX TABLE concur_heap;
+ \d concur_heap
+ Table "public.concur_heap"
+  Column | Type | Modifiers 
+ --------+------+-----------
+  f1     | text | 
+  f2     | text | 
+ Indexes:
+     "concur_index2" UNIQUE, btree (f1)
+     "concur_index3" UNIQUE, btree (f2)
+     "concur_heap_expr_idx" btree ((f2 || f1))
+     "concur_index1" btree (f2, f1)
+     "concur_index4" btree (f2) WHERE f1 = 'a'::text
+     "concur_index5" btree (f2) WHERE f1 = 'x'::text
+     "std_index" btree (f2)
+ 
  --
  -- Try some concurrent index drops
  --
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
***************
*** 721,730 **** BEGIN;
  CREATE INDEX std_index on concur_heap(f2);
  COMMIT;
  
! -- check to make sure that the failed indexes were cleaned up properly and the
! -- successful indexes are created properly. Notably that they do NOT have the
! -- "invalid" flag set.
! 
  \d concur_heap
  
  --
--- 721,733 ----
  CREATE INDEX std_index on concur_heap(f2);
  COMMIT;
  
! -- Failed builds are left invalid by VACUUM FULL, fixed by REINDEX
! VACUUM FULL concur_heap;
! REINDEX TABLE concur_heap;
! DELETE FROM concur_heap WHERE f1 = 'b';
! VACUUM FULL concur_heap;
! \d concur_heap
! REINDEX TABLE concur_heap;
  \d concur_heap
  
  --
#2Josh Berkus
josh@agliodbs.com
In reply to: Noah Misch (#1)
Re: REINDEX checking of index constraints

Noah,

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

I would love to have two versions of REINDEX, one which validated and
one which didn't. Maybe a ( validate off ) type check?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Josh Berkus
josh@agliodbs.com
In reply to: Noah Misch (#1)
Re: REINDEX checking of index constraints

On 07/21/2013 11:30 AM, Josh Berkus wrote:

Noah,

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

I would love to have two versions of REINDEX, one which validated and
one which didn't. Maybe a ( validate off ) type check?

Cancel this. I just did some tests, and there amount of time required
for the validation (at least, in simple two-column table test) is < 10%
of the time required to reindex in general. At that difference, we
don't need two options.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Josh Berkus
josh@agliodbs.com
In reply to: Noah Misch (#1)
Re: REINDEX checking of index constraints

On 07/21/2013 11:30 AM, Josh Berkus wrote:

Noah,

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

I would love to have two versions of REINDEX, one which validated and
one which didn't. Maybe a ( validate off ) type check?

Cancel this. I just did some tests, and there amount of time required
for the validation (at least, in simple two-column table test) is < 10%
of the time required to reindex in general. At that difference, we
don't need two options.

Unless you're asking if we want a command to check the index validity
without rebuilding it? That might be more valuable ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5ktm@rice.edu
ktm@rice.edu
In reply to: Josh Berkus (#2)
Re: REINDEX checking of index constraints

On Sun, Jul 21, 2013 at 11:30:54AM -0700, Josh Berkus wrote:

Noah,

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

I would love to have two versions of REINDEX, one which validated and
one which didn't. Maybe a ( validate off ) type check?

+1 There are reasons to reindex that do not involve its validity and it would
be great to not need to visit the heap for that.

Regards,
Ken

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Noah Misch
noah@leadboat.com
In reply to: Josh Berkus (#4)
Re: REINDEX checking of index constraints

On Sun, Jul 21, 2013 at 01:47:00PM -0700, Josh Berkus wrote:

On 07/21/2013 11:30 AM, Josh Berkus wrote:

Attached patch just restores the old behavior. Would it be worth preserving
the ability to fix an index consistency problem with a REINDEX independent
from related heap consistency problems such as duplicate keys?

I would love to have two versions of REINDEX, one which validated and
one which didn't. Maybe a ( validate off ) type check?

Cancel this. I just did some tests, and there amount of time required
for the validation (at least, in simple two-column table test) is < 10%
of the time required to reindex in general. At that difference, we
don't need two options.

Unless you're asking if we want a command to check the index validity
without rebuilding it? That might be more valuable ...

I meant to ask whether, instead of reverting the accidental behavior change,
we should do something like leave the behavior and change the documentation
instead. I personally vote "no", but that alternative seemed credible enough
to justify mentioning it. Something more radical, like a new UI, would be a
separate patch.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Noah Misch (#6)
Re: REINDEX checking of index constraints

Noah Misch wrote:

I meant to ask whether, instead of reverting the accidental behavior change,
we should do something like leave the behavior and change the documentation
instead. I personally vote "no", but that alternative seemed credible enough
to justify mentioning it. Something more radical, like a new UI, would be a
separate patch.

separate patch++. I agree some use cases probably justify new UI.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers