ALTER TYPE 1: recheck index-based constraints
When ALTER TABLE rewrites a table, it reindexes, but the reindex does not
revalidate UNIQUE/EXCLUDE constraints. This behaves badly in cases like this,
neglecting to throw an error on the new UNIQUE violation:
CREATE TABLE t (c numeric UNIQUE);
INSERT INTO t VALUES (1.1),(1.2);
ALTER TABLE t ALTER c TYPE int;
The comment gave a reason for skipping the checks: it would cause deadlocks when
we rewrite a system catalog. So, this patch changes things to only skip the
check for system catalogs.
Attachments:
at1-check-unique.patchtext/plain; charset=us-asciiDownload
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 2544,2552 **** reindex_index(Oid indexId, bool skip_constraint_checks)
* catalog indexes while collecting the list.)
*
* We also skip rechecking uniqueness/exclusion constraint properties if
! * heap_rebuilt is true. This avoids likely deadlock conditions when doing
! * VACUUM FULL or CLUSTER on system catalogs. REINDEX should be used to
! * rebuild an index if constraint inconsistency is suspected.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
--- 2544,2554 ----
* catalog indexes while collecting the list.)
*
* We also skip rechecking uniqueness/exclusion constraint properties if
! * heap_rebuilt is true and the relation is a system catalog. This avoids
! * likely deadlock conditions when doing VACUUM FULL or CLUSTER. We trust that
! * constraints will remain consistent across a user-issued ALTER TABLE against a
! * system catalog. REINDEX should be used to rebuild an index if constraint
! * inconsistency is suspected.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
***************
*** 2629,2635 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, heap_rebuilt);
CommandCounterIncrement();
--- 2631,2637 ----
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, heap_rebuilt && IsSystemRelation(rel));
CommandCounterIncrement();
*** a/src/test/regress/expected/alter_table.out
--- b/src/test/regress/expected/alter_table.out
***************
*** 1774,1779 **** DEBUG: Rebuilding index "t_strarr_idx"
--- 1774,1781 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_expr_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
+ ERROR: could not create unique index "t_touchy_f_idx"
+ DETAIL: Key (touchy_f(constraint1))=(100) is duplicated.
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
DEBUG: Rewriting table "t"
DEBUG: Rebuilding index "t_constraint4_key"
***************
*** 1792,1816 **** DEBUG: Rebuilding index "t_strarr_idx"
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! -- Temporary fixup until behavior of the previous two improves.
! ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
! DEBUG: Rewriting table "t"
! DEBUG: Rebuilding index "t_constraint4_key"
! DEBUG: Rebuilding index "t_integral_key"
! DEBUG: Rebuilding index "t_rational_key"
! DEBUG: Rebuilding index "t_daytimetz_key"
! DEBUG: Rebuilding index "t_daytime_key"
! DEBUG: Rebuilding index "t_stamptz_key"
! DEBUG: Rebuilding index "t_stamp_key"
! DEBUG: Rebuilding index "t_timegap_key"
! DEBUG: Rebuilding index "t_bits_key"
! DEBUG: Rebuilding index "t_network_key"
! DEBUG: Rebuilding index "t_string_idx"
! DEBUG: Rebuilding index "t_string_idx1"
! DEBUG: Rebuilding index "t_strarr_idx"
! DEBUG: Rebuilding index "t_square_idx"
! DEBUG: Rebuilding index "t_touchy_f_idx"
! DEBUG: Rebuilding index "t_expr_idx"
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
--- 1794,1801 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! ERROR: could not create unique index "t_expr_idx"
! DETAIL: Key ((1))=(1) is duplicated.
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
*** a/src/test/regress/sql/alter_table.sql
--- b/src/test/regress/sql/alter_table.sql
***************
*** 1246,1253 **** FROM pg_trigger WHERE tgrelid = 't'::regclass ORDER BY tgname;
ALTER TABLE t ALTER constraint0 TYPE trickint; -- verify-e
ALTER TABLE t ALTER constraint1 TYPE trickint; -- noop-e
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
- -- Temporary fixup until behavior of the previous two improves.
- ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
--- 1246,1251 ----
On Sun, Jan 9, 2011 at 5:00 PM, Noah Misch <noah@leadboat.com> wrote:
When ALTER TABLE rewrites a table, it reindexes, but the reindex does not
revalidate UNIQUE/EXCLUDE constraints. This behaves badly in cases like this,
neglecting to throw an error on the new UNIQUE violation:CREATE TABLE t (c numeric UNIQUE);
INSERT INTO t VALUES (1.1),(1.2);
ALTER TABLE t ALTER c TYPE int;The comment gave a reason for skipping the checks: it would cause deadlocks when
we rewrite a system catalog. So, this patch changes things to only skip the
check for system catalogs.
It looks like this behavior was introduced by Tom's commit
1ddc2703a936d03953657f43345460b9242bbed1 on 2010-02-07, and it appears
to be quite broken. The behavior is reasonable in the case of VACUUM
FULL and CLUSTER, but your example demonstrates that it's completely
broken in the case of ALTER TABLE. This strikes me as something we
need to fix in REL9_0_STABLE as well as master, and my gut feeling is
that we ought to fix it not by excluding system relations but by
making ALTER TABLE work differently from VACUUM FULL and CLUSTER.
There's an efficiency benefit to skipping this even on normal
relations in cases where it isn't necessary, and it shouldn't be
necessary if we're rewriting the rows unchanged.
Also, you need to start adding these patches to the CF app.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jan 11, 2011 at 10:03:11PM -0500, Robert Haas wrote:
On Sun, Jan 9, 2011 at 5:00 PM, Noah Misch <noah@leadboat.com> wrote:
When ALTER TABLE rewrites a table, it reindexes, but the reindex does not
revalidate UNIQUE/EXCLUDE constraints. ?This behaves badly in cases like this,
neglecting to throw an error on the new UNIQUE violation:CREATE TABLE t (c numeric UNIQUE);
INSERT INTO t VALUES (1.1),(1.2);
ALTER TABLE t ALTER c TYPE int;The comment gave a reason for skipping the checks: it would cause deadlocks when
we rewrite a system catalog. ?So, this patch changes things to only skip the
check for system catalogs.It looks like this behavior was introduced by Tom's commit
1ddc2703a936d03953657f43345460b9242bbed1 on 2010-02-07, and it appears
to be quite broken. The behavior is reasonable in the case of VACUUM
FULL and CLUSTER, but your example demonstrates that it's completely
broken in the case of ALTER TABLE. This strikes me as something we
need to fix in REL9_0_STABLE as well as master, and my gut feeling is
that we ought to fix it not by excluding system relations but by
making ALTER TABLE work differently from VACUUM FULL and CLUSTER.
There's an efficiency benefit to skipping this even on normal
relations in cases where it isn't necessary, and it shouldn't be
necessary if we're rewriting the rows unchanged.
Something like the attached?
Also, you need to start adding these patches to the CF app.
Done for all.
Attachments:
at1-check-unique.patchtext/plain; charset=us-asciiDownload
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 2543,2558 **** reindex_index(Oid indexId, bool skip_constraint_checks)
* do CCI after having collected the index list. (This way we can still use
* catalog indexes while collecting the list.)
*
! * We also skip rechecking uniqueness/exclusion constraint properties if
! * heap_rebuilt is true. This avoids likely deadlock conditions when doing
! * VACUUM FULL or CLUSTER on system catalogs. REINDEX should be used to
! * rebuild an index if constraint inconsistency is suspected.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
*/
bool
! reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
{
Relation rel;
Oid toast_relid;
--- 2543,2559 ----
* do CCI after having collected the index list. (This way we can still use
* catalog indexes while collecting the list.)
*
! * If we rebuilt a heap without changing tuple values, we also skip rechecking
! * uniqueness/exclusion constraint properties. This avoids likely deadlock
! * conditions when doing VACUUM FULL or CLUSTER on system catalogs. REINDEX
! * should be used to rebuild an index if constraint inconsistency is suspected.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
*/
bool
! reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt,
! bool tuples_changed)
{
Relation rel;
Oid toast_relid;
***************
*** 2629,2635 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, heap_rebuilt);
CommandCounterIncrement();
--- 2630,2636 ----
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, heap_rebuilt && !tuples_changed);
CommandCounterIncrement();
***************
*** 2661,2670 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
/*
* If the relation has a secondary toast rel, reindex that too while we
! * still hold the lock on the master table.
*/
if (toast_too && OidIsValid(toast_relid))
! result |= reindex_relation(toast_relid, false, false);
return result;
}
--- 2662,2674 ----
/*
* If the relation has a secondary toast rel, reindex that too while we
! * still hold the lock on the master table. There's never a reason to
! * reindex the toast table if heap_rebuilt; it will always be fresh.
*/
+ Assert(!(toast_too && heap_rebuilt));
if (toast_too && OidIsValid(toast_relid))
! result |= reindex_relation(toast_relid, false, heap_rebuilt,
! tuples_changed);
return result;
}
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 565,571 **** rebuild_relation(Relation OldHeap, Oid indexOid,
* rebuild the target's indexes and throw away the transient table.
*/
finish_heap_swap(tableOid, OIDNewHeap, is_system_catalog,
! swap_toast_by_content, frozenXid);
}
--- 565,571 ----
* rebuild the target's indexes and throw away the transient table.
*/
finish_heap_swap(tableOid, OIDNewHeap, is_system_catalog,
! swap_toast_by_content, false, frozenXid);
}
***************
*** 1362,1367 **** void
--- 1362,1368 ----
finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
bool is_system_catalog,
bool swap_toast_by_content,
+ bool tuples_changed,
TransactionId frozenXid)
{
ObjectAddress object;
***************
*** 1395,1401 **** finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
* so no chance to reclaim disk space before commit. We do not need a
* final CommandCounterIncrement() because reindex_relation does it.
*/
! reindex_relation(OIDOldHeap, false, true);
/* Destroy new heap with old filenode */
object.classId = RelationRelationId;
--- 1396,1402 ----
* so no chance to reclaim disk space before commit. We do not need a
* final CommandCounterIncrement() because reindex_relation does it.
*/
! reindex_relation(OIDOldHeap, false, true, tuples_changed);
/* Destroy new heap with old filenode */
object.classId = RelationRelationId;
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
***************
*** 1629,1635 **** ReindexTable(RangeVar *relation)
ReleaseSysCache(tuple);
! if (!reindex_relation(heapOid, true, false))
ereport(NOTICE,
(errmsg("table \"%s\" has no indexes",
relation->relname)));
--- 1629,1635 ----
ReleaseSysCache(tuple);
! if (!reindex_relation(heapOid, true, false, false))
ereport(NOTICE,
(errmsg("table \"%s\" has no indexes",
relation->relname)));
***************
*** 1742,1748 **** ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
StartTransactionCommand();
/* functions in indexes may want a snapshot set */
PushActiveSnapshot(GetTransactionSnapshot());
! if (reindex_relation(relid, true, false))
ereport(NOTICE,
(errmsg("table \"%s.%s\" was reindexed",
get_namespace_name(get_rel_namespace(relid)),
--- 1742,1748 ----
StartTransactionCommand();
/* functions in indexes may want a snapshot set */
PushActiveSnapshot(GetTransactionSnapshot());
! if (reindex_relation(relid, true, false, false))
ereport(NOTICE,
(errmsg("table \"%s.%s\" was reindexed",
get_namespace_name(get_rel_namespace(relid)),
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 1076,1082 **** ExecuteTruncate(TruncateStmt *stmt)
/*
* Reconstruct the indexes to match, and we're done.
*/
! reindex_relation(heap_relid, true, false);
}
}
--- 1076,1082 ----
/*
* Reconstruct the indexes to match, and we're done.
*/
! reindex_relation(heap_relid, true, false, false);
}
}
***************
*** 3236,3248 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
/*
* Swap the physical files of the old and new heaps, then rebuild
! * indexes and discard the new heap. We can use RecentXmin for
* the table's new relfrozenxid because we rewrote all the tuples
* in ATRewriteTable, so no older Xid remains in the table. Also,
* we never try to swap toast tables by content, since we have no
* interest in letting this code work on system catalogs.
*/
! finish_heap_swap(tab->relid, OIDNewHeap, false, false, RecentXmin);
}
else
{
--- 3236,3249 ----
/*
* Swap the physical files of the old and new heaps, then rebuild
! * indexes and discard the old heap. We can use RecentXmin for
* the table's new relfrozenxid because we rewrote all the tuples
* in ATRewriteTable, so no older Xid remains in the table. Also,
* we never try to swap toast tables by content, since we have no
* interest in letting this code work on system catalogs.
*/
! finish_heap_swap(tab->relid, OIDNewHeap,
! false, false, true, RecentXmin);
}
else
{
*** a/src/include/catalog/index.h
--- b/src/include/catalog/index.h
***************
*** 71,77 **** extern double IndexBuildHeapScan(Relation heapRelation,
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
extern void reindex_index(Oid indexId, bool skip_constraint_checks);
! extern bool reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt);
extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
--- 71,78 ----
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
extern void reindex_index(Oid indexId, bool skip_constraint_checks);
! extern bool reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt,
! bool tuples_changed);
extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 29,34 **** extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
--- 29,35 ----
extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
bool is_system_catalog,
bool swap_toast_by_content,
+ bool tuples_changed,
TransactionId frozenXid);
#endif /* CLUSTER_H */
*** a/src/test/regress/expected/alter_table.out
--- b/src/test/regress/expected/alter_table.out
***************
*** 1774,1779 **** DEBUG: Rebuilding index "t_strarr_idx"
--- 1774,1781 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_expr_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
+ ERROR: could not create unique index "t_touchy_f_idx"
+ DETAIL: Key (touchy_f(constraint1))=(100) is duplicated.
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
DEBUG: Rewriting table "t"
DEBUG: Rebuilding index "t_constraint4_key"
***************
*** 1792,1816 **** DEBUG: Rebuilding index "t_strarr_idx"
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! -- Temporary fixup until behavior of the previous two improves.
! ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
! DEBUG: Rewriting table "t"
! DEBUG: Rebuilding index "t_constraint4_key"
! DEBUG: Rebuilding index "t_integral_key"
! DEBUG: Rebuilding index "t_rational_key"
! DEBUG: Rebuilding index "t_daytimetz_key"
! DEBUG: Rebuilding index "t_daytime_key"
! DEBUG: Rebuilding index "t_stamptz_key"
! DEBUG: Rebuilding index "t_stamp_key"
! DEBUG: Rebuilding index "t_timegap_key"
! DEBUG: Rebuilding index "t_bits_key"
! DEBUG: Rebuilding index "t_network_key"
! DEBUG: Rebuilding index "t_string_idx"
! DEBUG: Rebuilding index "t_string_idx1"
! DEBUG: Rebuilding index "t_strarr_idx"
! DEBUG: Rebuilding index "t_square_idx"
! DEBUG: Rebuilding index "t_touchy_f_idx"
! DEBUG: Rebuilding index "t_expr_idx"
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
--- 1794,1801 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! ERROR: could not create unique index "t_expr_idx"
! DETAIL: Key ((1))=(1) is duplicated.
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
*** a/src/test/regress/sql/alter_table.sql
--- b/src/test/regress/sql/alter_table.sql
***************
*** 1246,1253 **** FROM pg_trigger WHERE tgrelid = 't'::regclass ORDER BY tgname;
ALTER TABLE t ALTER constraint0 TYPE trickint; -- verify-e
ALTER TABLE t ALTER constraint1 TYPE trickint; -- noop-e
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
- -- Temporary fixup until behavior of the previous two improves.
- ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
--- 1246,1251 ----
On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch <noah@leadboat.com> wrote:
Something like the attached?
I haven't really analyzed in this detail, but yes, approximately
something sorta like that.
Also, you need to start adding these patches to the CF app.
Done for all.
Thanks.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 12, 2011 at 8:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch <noah@leadboat.com> wrote:
Something like the attached?
I haven't really analyzed in this detail, but yes, approximately
something sorta like that.
I looked this over some more tonight. The name "tuples_changed" is
giving me some angst, because if we rewrote the heap... the tuples
changed. I think what you intend this to indicate whether the tuples
have changed in a semantic sense, ignoring CTIDs and so on. But it's
not even quite that, either, because ExecuteTruncate() is passing
false, and the set of tuples has probably changed in that case. It
seems that the cases here are:
1. When ALTER TABLE causes a rewrite, we set heap_rebuilt = true and
tuples_changed = true. This causes constraints to be revalidated and
suppresses use of indexes while the rebuild is in progress.
2. When VACUUM FULL or CLUSTER causes a rewrite, we set heap_rebuilt =
true and tuples_changed = false. This avoids constraint revalidation
but still suppresses use of indexes while the rebuild is in progress.
3. When TRUNCATE or REINDEX is invoked, we set heap_rebuilt = false
and tuples_changed = false. Here we neither revalidate constraints
nor suppress use of indexes while the rebuild is in progress.
The first question I asked myself is whether the above is actually
correct, and whether it's possible to simplify back to two cases. So:
The REINDEX case should definitely avoid suppressing the use of the
old index while the new one is rebuilt; I'm not really sure it matters
what TRUNCATE does, since we're going to be operating on a non-system
catalog on which we hold AccessExclusiveLock; the VACUUM FULL/CLUSTER
case certainly needs to suppress uses of indexes, because it can be
used on system catalogs, which may need to be used during the rebuild
itself. Thus #2 and #3 must be distinct. #1 must be distinct from #2
both for performance reasons and to prevent deadlocks when using
VACUUM FULL or CLUSTER on a system catalog (ALTER TABLE can't be so
used, so it's safe for it to not worry about this problem). So I
think the logic is correct and not overly complex.
I think what might make sense is - instead of adding another Boolean
argument, change the heap_rebuilt argument to int flags, and define
REINDEX_CHECK_CONSTRAINTS and REINDEX_SUPPRESS_INDEX_USE as OR-able
flags. I think that's more clear about what's actually going on than
heap_rebuit/tuples_changed.
Thoughts?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 19, 2011 at 11:50:12PM -0500, Robert Haas wrote:
On Wed, Jan 12, 2011 at 8:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch <noah@leadboat.com> wrote:
Something like the attached?
I haven't really analyzed in this detail, but yes, approximately
something sorta like that.[assessment of current uses] So I think the logic is correct and not overly
complex.
Sounds correct to me.
I think what might make sense is - instead of adding another Boolean
argument, change the heap_rebuilt argument to int flags, and define
REINDEX_CHECK_CONSTRAINTS and REINDEX_SUPPRESS_INDEX_USE as OR-able
flags. I think that's more clear about what's actually going on than
heap_rebuit/tuples_changed.
There are two distinct questions here:
(1) Should reindex_relation receive boolean facts from its callers by way of two
boolean arguments, or by way of one flags vector?
The former seems best when you want every caller to definitely think about which
answer is right, and the latter when that's not so necessary. (For a very long
list of options, the flags might be chosen on other grounds.) As framed, I'd
lean toward keeping distinct arguments, as these are important questions.
However, suppose we inverted both flags, say REINDEX_SKIP_CONSTRAINT_CHECKS and
REINDEX_ALLOW_OLD_INDEX_USE. Then, flags = 0 can hurt performance but not
correctness. That's looking like a win.
(2) Should reindex_relation frame its boolean arguments in terms of what the
caller did (heap_rebuilt, tuples_changed), or in terms of what reindex_relation
will be doing (check_constraints, suppress_index_use)?
The former should be the default approach, but it requires that we be able to
frame good names that effectively convey an abstraction. Prospective callers
must know what to send just by looking at the name and reading the header
comment. When no prospective name is that expressive and you'll end up reading
the reindex_relation code to see how they play out, then it's better to go with
the latter instead. A bad abstraction is worse than none at all.
I agree that both heap_rebuilt and tuples_changed are bad abstractions.
TRUNCATE is lying about the former, and the latter, as you say, is never really
correct. column_values_changed, perhaps. tuples_could_now_violate_constraints
would be correct, but that's just a bad spelling for REINDEX_CHECK_CONSTRAINTS.
I guess the equivalent long-winded improvement for heap_rebuilt would be
indexes_still_valid_for_snapshotnow. Eh.
So yes, let's adopt callee-action-focused names like you propose.
Overall, I'd vote for a flags parameter with negative senses like I noted above.
My second preference would be for two boolean parameters, check_constraints and
suppress_index_use. Not really a big deal to me, though. (I feel a bit silly
writing this email.) What do you think?
Thanks,
nm
On Thu, Jan 20, 2011 at 12:57 AM, Noah Misch <noah@leadboat.com> wrote:
There are two distinct questions here:
Agreed.
(1) Should reindex_relation receive boolean facts from its callers by way of two
boolean arguments, or by way of one flags vector?The former seems best when you want every caller to definitely think about which
answer is right, and the latter when that's not so necessary. (For a very long
list of options, the flags might be chosen on other grounds.) As framed, I'd
lean toward keeping distinct arguments, as these are important questions.
My main beef with the Boolean flags is that this kind of thing is not too clear:
reindex_relation(myrel, false, false, true, true, false, true,
false, false, true);
Unless you have an excellent memory, you can't tell what the heck
that's doing without flipping back and forth between the function
definition and the call site. With a bit-field, it's a lot easier to
glance at the call site and have a clue what's going on. We're of
course not quite to the point of that exaggerated example yet.
However, suppose we inverted both flags, say REINDEX_SKIP_CONSTRAINT_CHECKS and
REINDEX_ALLOW_OLD_INDEX_USE. Then, flags = 0 can hurt performance but not
correctness. That's looking like a win.
I prefer the positive sense for those flags because I think it's more
clear. There aren't so many call sites or so many people using this
that we have to worry about what people are going to do in new calling
locations; getting it right in any new code shouldn't be a
consideration.
(2) Should reindex_relation frame its boolean arguments in terms of what the
caller did (heap_rebuilt, tuples_changed), or in terms of what reindex_relation
will be doing (check_constraints, suppress_index_use)?
Yeah, I know we're doing the former now, but I think it's just getting
confusing for exactly the reasons you state:
I agree that both heap_rebuilt and tuples_changed are bad abstractions.
TRUNCATE is lying about the former, and the latter, as you say, is never really
correct. column_values_changed, perhaps. tuples_could_now_violate_constraints
would be correct, but that's just a bad spelling for REINDEX_CHECK_CONSTRAINTS.
I guess the equivalent long-winded improvement for heap_rebuilt would be
indexes_still_valid_for_snapshotnow. Eh.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Jan 20, 2011 at 09:26:29AM -0500, Robert Haas wrote:
My main beef with the Boolean flags is that this kind of thing is not too clear:
reindex_relation(myrel, false, false, true, true, false, true,
false, false, true);Unless you have an excellent memory, you can't tell what the heck
that's doing without flipping back and forth between the function
definition and the call site. With a bit-field, it's a lot easier to
glance at the call site and have a clue what's going on. We're of
course not quite to the point of that exaggerated example yet.
Agreed.
However, suppose we inverted both flags, say REINDEX_SKIP_CONSTRAINT_CHECKS and
REINDEX_ALLOW_OLD_INDEX_USE. ?Then, flags = 0 can hurt performance but not
correctness. ?That's looking like a win.I prefer the positive sense for those flags because I think it's more
clear. There aren't so many call sites or so many people using this
that we have to worry about what people are going to do in new calling
locations; getting it right in any new code shouldn't be a
consideration.
Okay. I've attached a new patch version based on that strategy.
Attachments:
at1v3-check-unique.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1c9df98..75e7055 100644
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 2533,2558 **** reindex_index(Oid indexId, bool skip_constraint_checks)
* reindex_relation - This routine is used to recreate all indexes
* of a relation (and optionally its toast relation too, if any).
*
! * If heap_rebuilt is true, then the relation was just completely rebuilt by
! * an operation such as VACUUM FULL or CLUSTER, and therefore its indexes are
! * inconsistent with it. This makes things tricky if the relation is a system
! * catalog that we might consult during the reindexing. To deal with that
! * case, we mark all of the indexes as pending rebuild so that they won't be
! * trusted until rebuilt. The caller is required to call us *without* having
! * made the rebuilt versions visible by doing CommandCounterIncrement; we'll
! * do CCI after having collected the index list. (This way we can still use
! * catalog indexes while collecting the list.)
*
! * We also skip rechecking uniqueness/exclusion constraint properties if
! * heap_rebuilt is true. This avoids likely deadlock conditions when doing
! * VACUUM FULL or CLUSTER on system catalogs. REINDEX should be used to
! * rebuild an index if constraint inconsistency is suspected.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
*/
bool
! reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
{
Relation rel;
Oid toast_relid;
--- 2533,2561 ----
* reindex_relation - This routine is used to recreate all indexes
* of a relation (and optionally its toast relation too, if any).
*
! * "flags" can include REINDEX_SUPPRESS_INDEX_USE and REINDEX_CHECK_CONSTRAINTS.
*
! * If flags has REINDEX_SUPPRESS_INDEX_USE, the relation was just completely
! * rebuilt by an operation such as VACUUM FULL or CLUSTER, and therefore its
! * indexes are inconsistent with it. This makes things tricky if the relation
! * is a system catalog that we might consult during the reindexing. To deal
! * with that case, we mark all of the indexes as pending rebuild so that they
! * won't be trusted until rebuilt. The caller is required to call us *without*
! * having made the rebuilt versions visible by doing CommandCounterIncrement;
! * we'll do CCI after having collected the index list. (This way we can still
! * use catalog indexes while collecting the list.)
! *
! * To avoid deadlocks, VACUUM FULL or CLUSTER on a system catalog must omit the
! * REINDEX_CHECK_CONSTRAINTS flag. REINDEX should be used to rebuild an index
! * if constraint inconsistency is suspected. For optimal performance, other
! * callers should include the flag only after transforming the data in a manner
! * that risks a change in constraint validity.
*
* Returns true if any indexes were rebuilt. Note that a
* CommandCounterIncrement will occur after each index rebuild.
*/
bool
! reindex_relation(Oid relid, bool toast_too, int flags)
{
Relation rel;
Oid toast_relid;
***************
*** 2608,2614 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
List *doneIndexes;
ListCell *indexId;
! if (heap_rebuilt)
{
/* Suppress use of all the indexes until they are rebuilt */
SetReindexPending(indexIds);
--- 2611,2617 ----
List *doneIndexes;
ListCell *indexId;
! if (flags & REINDEX_SUPPRESS_INDEX_USE)
{
/* Suppress use of all the indexes until they are rebuilt */
SetReindexPending(indexIds);
***************
*** 2629,2639 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, heap_rebuilt);
CommandCounterIncrement();
! if (heap_rebuilt)
RemoveReindexPending(indexOid);
if (is_pg_class)
--- 2632,2642 ----
if (is_pg_class)
RelationSetIndexList(rel, doneIndexes, InvalidOid);
! reindex_index(indexOid, !(flags & REINDEX_CHECK_CONSTRAINTS));
CommandCounterIncrement();
! if (flags & REINDEX_SUPPRESS_INDEX_USE)
RemoveReindexPending(indexOid);
if (is_pg_class)
***************
*** 2661,2670 **** reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt)
/*
* If the relation has a secondary toast rel, reindex that too while we
! * still hold the lock on the master table.
*/
if (toast_too && OidIsValid(toast_relid))
! result |= reindex_relation(toast_relid, false, false);
return result;
}
--- 2664,2675 ----
/*
* If the relation has a secondary toast rel, reindex that too while we
! * still hold the lock on the master table. There's never a reason to
! * reindex the toast table right after rebuilding the heap.
*/
+ Assert(!(toast_too && (flags & REINDEX_SUPPRESS_INDEX_USE)));
if (toast_too && OidIsValid(toast_relid))
! result |= reindex_relation(toast_relid, false, flags);
return result;
}
diff --git a/src/backend/commands/index 19c3cf9..59a4394 100644
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 565,571 **** rebuild_relation(Relation OldHeap, Oid indexOid,
* rebuild the target's indexes and throw away the transient table.
*/
finish_heap_swap(tableOid, OIDNewHeap, is_system_catalog,
! swap_toast_by_content, frozenXid);
}
--- 565,571 ----
* rebuild the target's indexes and throw away the transient table.
*/
finish_heap_swap(tableOid, OIDNewHeap, is_system_catalog,
! swap_toast_by_content, false, frozenXid);
}
***************
*** 1362,1371 **** void
--- 1362,1373 ----
finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
bool is_system_catalog,
bool swap_toast_by_content,
+ bool check_constraints,
TransactionId frozenXid)
{
ObjectAddress object;
Oid mapped_tables[4];
+ int reindex_flags;
int i;
/* Zero out possible results from swapped_relation_files */
***************
*** 1395,1401 **** finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
* so no chance to reclaim disk space before commit. We do not need a
* final CommandCounterIncrement() because reindex_relation does it.
*/
! reindex_relation(OIDOldHeap, false, true);
/* Destroy new heap with old filenode */
object.classId = RelationRelationId;
--- 1397,1406 ----
* so no chance to reclaim disk space before commit. We do not need a
* final CommandCounterIncrement() because reindex_relation does it.
*/
! reindex_flags = REINDEX_SUPPRESS_INDEX_USE;
! if (check_constraints)
! reindex_flags |= REINDEX_CHECK_CONSTRAINTS;
! reindex_relation(OIDOldHeap, false, reindex_flags);
/* Destroy new heap with old filenode */
object.classId = RelationRelationId;
diff --git a/src/backend/commands/indindex f809a26..7a6a4c3 100644
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
***************
*** 1629,1635 **** ReindexTable(RangeVar *relation)
ReleaseSysCache(tuple);
! if (!reindex_relation(heapOid, true, false))
ereport(NOTICE,
(errmsg("table \"%s\" has no indexes",
relation->relname)));
--- 1629,1635 ----
ReleaseSysCache(tuple);
! if (!reindex_relation(heapOid, true, 0))
ereport(NOTICE,
(errmsg("table \"%s\" has no indexes",
relation->relname)));
***************
*** 1742,1748 **** ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
StartTransactionCommand();
/* functions in indexes may want a snapshot set */
PushActiveSnapshot(GetTransactionSnapshot());
! if (reindex_relation(relid, true, false))
ereport(NOTICE,
(errmsg("table \"%s.%s\" was reindexed",
get_namespace_name(get_rel_namespace(relid)),
--- 1742,1748 ----
StartTransactionCommand();
/* functions in indexes may want a snapshot set */
PushActiveSnapshot(GetTransactionSnapshot());
! if (reindex_relation(relid, true, 0))
ereport(NOTICE,
(errmsg("table \"%s.%s\" was reindexed",
get_namespace_name(get_rel_namespace(relid)),
diff --git a/src/backend/commands/tableindex 487d0ac..77ea9df 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 1076,1082 **** ExecuteTruncate(TruncateStmt *stmt)
/*
* Reconstruct the indexes to match, and we're done.
*/
! reindex_relation(heap_relid, true, false);
}
}
--- 1076,1082 ----
/*
* Reconstruct the indexes to match, and we're done.
*/
! reindex_relation(heap_relid, true, 0);
}
}
***************
*** 3236,3248 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
/*
* Swap the physical files of the old and new heaps, then rebuild
! * indexes and discard the new heap. We can use RecentXmin for
* the table's new relfrozenxid because we rewrote all the tuples
* in ATRewriteTable, so no older Xid remains in the table. Also,
* we never try to swap toast tables by content, since we have no
* interest in letting this code work on system catalogs.
*/
! finish_heap_swap(tab->relid, OIDNewHeap, false, false, RecentXmin);
}
else
{
--- 3236,3249 ----
/*
* Swap the physical files of the old and new heaps, then rebuild
! * indexes and discard the old heap. We can use RecentXmin for
* the table's new relfrozenxid because we rewrote all the tuples
* in ATRewriteTable, so no older Xid remains in the table. Also,
* we never try to swap toast tables by content, since we have no
* interest in letting this code work on system catalogs.
*/
! finish_heap_swap(tab->relid, OIDNewHeap,
! false, false, true, RecentXmin);
}
else
{
diff --git a/src/include/catalog/index.index 2804c63..29c4717 100644
*** a/src/include/catalog/index.h
--- b/src/include/catalog/index.h
***************
*** 71,77 **** extern double IndexBuildHeapScan(Relation heapRelation,
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
extern void reindex_index(Oid indexId, bool skip_constraint_checks);
! extern bool reindex_relation(Oid relid, bool toast_too, bool heap_rebuilt);
extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
--- 71,80 ----
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
extern void reindex_index(Oid indexId, bool skip_constraint_checks);
!
! #define REINDEX_CHECK_CONSTRAINTS 0x1
! #define REINDEX_SUPPRESS_INDEX_USE 0x2
! extern bool reindex_relation(Oid relid, bool toast_too, int flags);
extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
diff --git a/src/include/commands/index 427d802..518e896 100644
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 29,34 **** extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
--- 29,35 ----
extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
bool is_system_catalog,
bool swap_toast_by_content,
+ bool check_constraints,
TransactionId frozenXid);
#endif /* CLUSTER_H */
diff --git a/src/test/regress/expecteindex 387aeaf..955b578 100644
*** a/src/test/regress/expected/alter_table.out
--- b/src/test/regress/expected/alter_table.out
***************
*** 1774,1779 **** DEBUG: Rebuilding index "t_strarr_idx"
--- 1774,1781 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_expr_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
+ ERROR: could not create unique index "t_touchy_f_idx"
+ DETAIL: Key (touchy_f(constraint1))=(100) is duplicated.
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
DEBUG: Rewriting table "t"
DEBUG: Rebuilding index "t_constraint4_key"
***************
*** 1792,1816 **** DEBUG: Rebuilding index "t_strarr_idx"
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! -- Temporary fixup until behavior of the previous two improves.
! ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
! DEBUG: Rewriting table "t"
! DEBUG: Rebuilding index "t_constraint4_key"
! DEBUG: Rebuilding index "t_integral_key"
! DEBUG: Rebuilding index "t_rational_key"
! DEBUG: Rebuilding index "t_daytimetz_key"
! DEBUG: Rebuilding index "t_daytime_key"
! DEBUG: Rebuilding index "t_stamptz_key"
! DEBUG: Rebuilding index "t_stamp_key"
! DEBUG: Rebuilding index "t_timegap_key"
! DEBUG: Rebuilding index "t_bits_key"
! DEBUG: Rebuilding index "t_network_key"
! DEBUG: Rebuilding index "t_string_idx"
! DEBUG: Rebuilding index "t_string_idx1"
! DEBUG: Rebuilding index "t_strarr_idx"
! DEBUG: Rebuilding index "t_square_idx"
! DEBUG: Rebuilding index "t_touchy_f_idx"
! DEBUG: Rebuilding index "t_expr_idx"
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
--- 1794,1801 ----
DEBUG: Rebuilding index "t_square_idx"
DEBUG: Rebuilding index "t_touchy_f_idx"
DEBUG: Rebuilding index "t_expr_idx"
! ERROR: could not create unique index "t_expr_idx"
! DETAIL: Key ((1))=(1) is duplicated.
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
DEBUG: Rewriting table "t"
diff --git a/src/test/regress/sql/alter_table.sqindex 18d60e1..fd12181 100644
*** a/src/test/regress/sql/alter_table.sql
--- b/src/test/regress/sql/alter_table.sql
***************
*** 1246,1253 **** FROM pg_trigger WHERE tgrelid = 't'::regclass ORDER BY tgname;
ALTER TABLE t ALTER constraint0 TYPE trickint; -- verify-e
ALTER TABLE t ALTER constraint1 TYPE trickint; -- noop-e
ALTER TABLE t ALTER constraint2 TYPE trickint; -- noop-e
- -- Temporary fixup until behavior of the previous two improves.
- ALTER TABLE t ALTER constraint1 TYPE int, ALTER constraint2 TYPE int;
-- Change a column with an outgoing foreign key constraint.
ALTER TABLE t ALTER constraint3 TYPE numeric(8,1); -- rewrite, FK error
--- 1246,1251 ----
On Thu, Jan 20, 2011 at 2:22 PM, Noah Misch <noah@leadboat.com> wrote:
Okay. I've attached a new patch version based on that strategy.
Thanks. Committed and back-patched to 9.0 (but I didn't use your
regression test).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company