propagating replica identity to partitions
Hello
If you do ALTER TABLE .. REPLICA IDENTITY to a partitioned table, the
command operates on the parent table itself and does not propagate to
partitions. Why is this? Maybe not recursing was the right call when
we only had regular inheritance (back in 9.4), but since partitioned
tables got introduced, I think it is completely the other way around:
not recursing is an usability fail.
At the same time, I think that psql failing to display the replica
identity for partitioned tables is just an oversight and not designed
in.
I propose to change the behavior to:
1. When replica identity is changed on a partitioned table, all partitions
are updated also. Do we need to care about regular inheritance?
My inclination is not to touch those; this'd become the first case
in ATPrepCmd that recurses on partitioning but not inheritance.
2. When a partition is created, the replica identity is set to the
same that the parent table has. If it's type index, figure out the
corresponding index in the partition, set that. If the index doesn't
exist, raise an error (i.e. replica identity cannot be set to an
index until it has propagated to all children).
3. psql should display replica identity for partitioned tables.
Thoughts?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I propose to change the behavior to:
... this patch.
I'll now look more carefully at the cases involving indexes; thus far I
was looking at the ones using FULL. Those seem to work as intended.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-index_get_partition.patchtext/x-diff; charset=us-asciiDownload
From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH 1/2] index_get_partition
---
src/backend/catalog/partition.c | 35 +++++++++++++++++++++++++++++++++++
src/backend/commands/tablecmds.c | 40 +++++++++++-----------------------------
src/include/catalog/partition.h | 1 +
3 files changed, 47 insertions(+), 29 deletions(-)
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
}
/*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+ List *idxlist = RelationGetIndexList(partition);
+ ListCell *l;
+
+ foreach(l, idxlist)
+ {
+ Oid partIdx = lfirst_oid(l);
+ HeapTuple tup;
+ Form_pg_class classForm;
+ bool ispartition;
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+ if (!tup)
+ elog(ERROR, "cache lookup failed for relation %u", partIdx);
+ classForm = (Form_pg_class) GETSTRUCT(tup);
+ ispartition = classForm->relispartition;
+ ReleaseSysCache(tup);
+ if (!ispartition)
+ continue;
+ if (get_partition_parent(lfirst_oid(l)) == indexId)
+ {
+ list_free(idxlist);
+ return partIdx;
+ }
+ }
+
+ return InvalidOid;
+}
+
+/*
* map_partition_varattnos - maps varattno of any Vars in expr from the
* attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
* may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
static void
refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
{
- Relation pg_inherits;
- ScanKeyData key;
- HeapTuple tuple;
- SysScanDesc scan;
+ Oid existingIdx;
- pg_inherits = table_open(InheritsRelationId, AccessShareLock);
- ScanKeyInit(&key, Anum_pg_inherits_inhparent,
- BTEqualStrategyNumber, F_OIDEQ,
- ObjectIdGetDatum(RelationGetRelid(parentIdx)));
- scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
- NULL, 1, &key);
- while (HeapTupleIsValid(tuple = systable_getnext(scan)))
- {
- Form_pg_inherits inhForm;
- Oid tab;
-
- inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
- tab = IndexGetRelation(inhForm->inhrelid, false);
- if (tab == RelationGetRelid(partitionTbl))
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
- RelationGetRelationName(partIdx),
- RelationGetRelationName(parentIdx)),
- errdetail("Another index is already attached for partition \"%s\".",
- RelationGetRelationName(partitionTbl))));
- }
-
- systable_endscan(scan);
- table_close(pg_inherits, AccessShareLock);
+ existingIdx = index_get_partition(partitionTbl,
+ RelationGetRelid(parentIdx));
+ if (OidIsValid(existingIdx))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+ RelationGetRelationName(partIdx),
+ RelationGetRelationName(parentIdx)),
+ errdetail("Another index is already attached for partition \"%s\".",
+ RelationGetRelationName(partitionTbl))));
}
/*
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 5685d2fd57..7f0b198bcb 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -35,6 +35,7 @@ typedef struct PartitionDescData
extern Oid get_partition_parent(Oid relid);
extern List *get_partition_ancestors(Oid relid);
+extern Oid index_get_partition(Relation partition, Oid indexId);
extern List *map_partition_varattnos(List *expr, int fromrel_varno,
Relation to_rel, Relation from_rel,
bool *found_whole_row);
--
2.11.0
0002-Propagate-replica-identity-to-partitions.patchtext/x-diff; charset=us-asciiDownload
From 2d3d7c49fa8fcd14a5dcaddfffb7b4239b9cab62 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 4 Feb 2019 13:43:58 -0300
Subject: [PATCH 2/2] Propagate replica identity to partitions
---
src/backend/commands/tablecmds.c | 59 +++++++++++++--
src/bin/psql/describe.c | 3 +-
src/test/regress/expected/replica_identity.out | 99 ++++++++++++++++++++++++++
src/test/regress/sql/replica_identity.sql | 33 +++++++++
4 files changed, 189 insertions(+), 5 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 877bac506f..4b2ae01f15 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,7 @@ static void truncate_check_activity(Relation rel);
static void RangeVarCallbackForTruncate(const RangeVar *relation,
Oid relId, Oid oldRelId, void *arg);
static List *MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr);
+ bool is_partition, List **supconstr, char *ri_type);
static bool MergeCheckConstraint(List *constraints, char *name, Node *expr);
static void MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel);
static void MergeConstraintsIntoExisting(Relation child_rel, Relation parent_rel);
@@ -485,6 +485,7 @@ static void AttachPartitionEnsureIndexes(Relation rel, Relation attachrel);
static void QueuePartitionConstraintValidation(List **wqueue, Relation scanrel,
List *partConstraint,
bool validate_default);
+static void MatchReplicaIdentity(Relation tgtrel, Relation srcrel);
static void CloneRowTriggersToPartition(Relation parent, Relation partition);
static ObjectAddress ATExecDetachPartition(Relation rel, RangeVar *name);
static ObjectAddress ATExecAttachPartitionIdx(List **wqueue, Relation rel,
@@ -527,6 +528,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
TupleDesc descriptor;
List *inheritOids;
List *old_constraints;
+ char ri_type;
List *rawDefaults;
List *cookedDefaults;
Datum reloptions;
@@ -708,7 +710,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
MergeAttributes(stmt->tableElts, inheritOids,
stmt->relation->relpersistence,
stmt->partbound != NULL,
- &old_constraints);
+ &old_constraints, &ri_type);
/*
* Create a tuple descriptor from the relation schema. Note that this
@@ -1014,6 +1016,10 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
*/
CloneForeignKeyConstraints(parentId, relationId, NULL);
+ /* Propagate REPLICA IDENTITY information too */
+ if (ri_type != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(rel, parent);
+
table_close(parent, NoLock);
}
@@ -1873,6 +1879,8 @@ storage_name(char c)
* Output arguments:
* 'supconstr' receives a list of constraints belonging to the parents,
* updated as necessary to be valid for the child.
+ * 'ri_type' receives the replica identity type of the last parent seen,
+ * or default if none.
*
* Return value:
* Completed schema list.
@@ -1914,11 +1922,15 @@ storage_name(char c)
* (4) Otherwise the inherited default is used.
* Rule (3) is new in Postgres 7.1; in earlier releases you got a
* rather arbitrary choice of which parent default to use.
+ *
+ * It only makes sense to use the returned 'ri_type' when there's a single
+ * parent, such as in declarative partitioning.
*----------
*/
static List *
MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr)
+ bool is_partition, List **supconstr,
+ char *ri_type)
{
ListCell *entry;
List *inhSchema = NIL;
@@ -2015,6 +2027,11 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
}
/*
+ * Initialize replica identity to default; parents may change it later
+ */
+ *ri_type = REPLICA_IDENTITY_DEFAULT;
+
+ /*
* Scan the parents left-to-right, and merge their attributes to form a
* list of inherited attributes (inhSchema). Also check to see if we need
* to inherit an OID column.
@@ -2095,6 +2112,9 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
? "cannot inherit from temporary relation of another session"
: "cannot create as partition of temporary relation of another session")));
+ /* Indicate replica identity back to caller */
+ *ri_type = relation->rd_rel->relreplident;
+
/*
* We should have an UNDER permission flag for this, but for now,
* demand that creator of a child table own the parent.
@@ -3935,7 +3955,9 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
case AT_ReplicaIdentity: /* REPLICA IDENTITY ... */
ATSimplePermissions(rel, ATT_TABLE | ATT_MATVIEW);
pass = AT_PASS_MISC;
- /* This command never recurses */
+ /* Recurse only on partitioned tables */
+ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
/* No command-specific prep needed */
break;
case AT_EnableTrig: /* ENABLE TRIGGER variants */
@@ -14664,6 +14686,10 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
/* and triggers */
CloneRowTriggersToPartition(rel, attachrel);
+ /* Propagate REPLICA IDENTITY information */
+ if (rel->rd_rel->relreplident != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(attachrel, rel);
+
/*
* Clone foreign key constraints, and setup for Phase 3 to verify them.
*/
@@ -14915,6 +14941,31 @@ AttachPartitionEnsureIndexes(Relation rel, Relation attachrel)
}
/*
+ * Set up partRel's (a partition) replica identity to match parentRel's (its
+ * parent).
+ */
+static void
+MatchReplicaIdentity(Relation partRel, Relation srcrel)
+{
+ Oid ri_index;
+
+ if (srcrel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ {
+ ri_index = index_get_partition(partRel,
+ RelationGetReplicaIndex(srcrel));
+ if (!OidIsValid(ri_index))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("replica index does not exist in partition")));
+ }
+ else
+ ri_index = InvalidOid;
+
+ relation_mark_replica_identity(partRel, srcrel->rd_rel->relreplident,
+ ri_index, true);
+}
+
+/*
* CloneRowTriggersToPartition
* subroutine for ATExecAttachPartition/DefineRelation to create row
* triggers on partitions
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4da6719ce7..6145a000cb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3113,7 +3113,8 @@ describeOneTableDetails(const char *schemaname,
if (verbose &&
(tableinfo.relkind == RELKIND_RELATION ||
- tableinfo.relkind == RELKIND_MATVIEW) &&
+ tableinfo.relkind == RELKIND_MATVIEW ||
+ tableinfo.relkind == RELKIND_PARTITIONED_TABLE) &&
/*
* No need to display default values; we already display a REPLICA
diff --git a/src/test/regress/expected/replica_identity.out b/src/test/regress/expected/replica_identity.out
index 175ecd2879..3051cf1551 100644
--- a/src/test/regress/expected/replica_identity.out
+++ b/src/test/regress/expected/replica_identity.out
@@ -181,3 +181,102 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+ Table "public.test_replica_identity_part2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (1000) TO (2000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 1000) AND (a < 2000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part11
+ Table "public.test_replica_identity_part11"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part1 FOR VALUES FROM (1000) TO (1500)
+Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 1000) AND (a IS NOT NULL) AND (a >= 1000) AND (a < 1500))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part
+ Partitioned table "public.test_replica_identity_part"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition key: RANGE (a)
+Partitions: test_replica_identity_part1 FOR VALUES FROM (0) TO (1000), PARTITIONED,
+ test_replica_identity_part2 FOR VALUES FROM (1000) TO (2000),
+ test_replica_identity_part3 FOR VALUES FROM (2000) TO (3000),
+ test_replica_identity_part4 FOR VALUES FROM (3000) TO (4000)
+Replica Identity: FULL
+
+\d+ test_replica_identity_part3
+ Table "public.test_replica_identity_part3"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (2000) TO (3000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 2000) AND (a < 3000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part4
+ Table "public.test_replica_identity_part4"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (3000) TO (4000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 3000) AND (a < 4000))
+Replica Identity: FULL
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+ Table "public.test_replica_identity_inh"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Child tables: test_replica_identity_cld,
+ test_replica_identity_cld2
+Replica Identity: FULL
+
+\d+ test_replica_identity_cld
+ Table "public.test_replica_identity_cld"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
+\d+ test_replica_identity_cld2
+ Table "public.test_replica_identity_cld2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
diff --git a/src/test/regress/sql/replica_identity.sql b/src/test/regress/sql/replica_identity.sql
index b08a3623b8..a567f2b52d 100644
--- a/src/test/regress/sql/replica_identity.sql
+++ b/src/test/regress/sql/replica_identity.sql
@@ -77,3 +77,36 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+\d+ test_replica_identity_part11
+\d+ test_replica_identity_part
+\d+ test_replica_identity_part3
+\d+ test_replica_identity_part4
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+\d+ test_replica_identity_cld
+\d+ test_replica_identity_cld2
--
2.11.0
On 2019-Feb-04, Alvaro Herrera wrote:
I'll now look more carefully at the cases involving indexes; thus far I
was looking at the ones using FULL. Those seem to work as intended.
Yeah, that didn't work so well -- it was trying to propagate the command
verbatim to each partition, and obviously the index names did not match.
So this subcommand has to reimplement the recursion internally, as in
the attached.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-index_get_partition.patchtext/x-diff; charset=us-asciiDownload
From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v2 1/2] index_get_partition
---
src/backend/catalog/partition.c | 35 +++++++++++++++++++++++++++++++++++
src/backend/commands/tablecmds.c | 40 +++++++++++-----------------------------
src/include/catalog/partition.h | 1 +
3 files changed, 47 insertions(+), 29 deletions(-)
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
}
/*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+ List *idxlist = RelationGetIndexList(partition);
+ ListCell *l;
+
+ foreach(l, idxlist)
+ {
+ Oid partIdx = lfirst_oid(l);
+ HeapTuple tup;
+ Form_pg_class classForm;
+ bool ispartition;
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+ if (!tup)
+ elog(ERROR, "cache lookup failed for relation %u", partIdx);
+ classForm = (Form_pg_class) GETSTRUCT(tup);
+ ispartition = classForm->relispartition;
+ ReleaseSysCache(tup);
+ if (!ispartition)
+ continue;
+ if (get_partition_parent(lfirst_oid(l)) == indexId)
+ {
+ list_free(idxlist);
+ return partIdx;
+ }
+ }
+
+ return InvalidOid;
+}
+
+/*
* map_partition_varattnos - maps varattno of any Vars in expr from the
* attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
* may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
static void
refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
{
- Relation pg_inherits;
- ScanKeyData key;
- HeapTuple tuple;
- SysScanDesc scan;
+ Oid existingIdx;
- pg_inherits = table_open(InheritsRelationId, AccessShareLock);
- ScanKeyInit(&key, Anum_pg_inherits_inhparent,
- BTEqualStrategyNumber, F_OIDEQ,
- ObjectIdGetDatum(RelationGetRelid(parentIdx)));
- scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
- NULL, 1, &key);
- while (HeapTupleIsValid(tuple = systable_getnext(scan)))
- {
- Form_pg_inherits inhForm;
- Oid tab;
-
- inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
- tab = IndexGetRelation(inhForm->inhrelid, false);
- if (tab == RelationGetRelid(partitionTbl))
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
- RelationGetRelationName(partIdx),
- RelationGetRelationName(parentIdx)),
- errdetail("Another index is already attached for partition \"%s\".",
- RelationGetRelationName(partitionTbl))));
- }
-
- systable_endscan(scan);
- table_close(pg_inherits, AccessShareLock);
+ existingIdx = index_get_partition(partitionTbl,
+ RelationGetRelid(parentIdx));
+ if (OidIsValid(existingIdx))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+ RelationGetRelationName(partIdx),
+ RelationGetRelationName(parentIdx)),
+ errdetail("Another index is already attached for partition \"%s\".",
+ RelationGetRelationName(partitionTbl))));
}
/*
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 5685d2fd57..7f0b198bcb 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -35,6 +35,7 @@ typedef struct PartitionDescData
extern Oid get_partition_parent(Oid relid);
extern List *get_partition_ancestors(Oid relid);
+extern Oid index_get_partition(Relation partition, Oid indexId);
extern List *map_partition_varattnos(List *expr, int fromrel_varno,
Relation to_rel, Relation from_rel,
bool *found_whole_row);
--
2.11.0
v2-0002-Propagate-replica-identity-to-partitions.patchtext/x-diff; charset=us-asciiDownload
From 688b801b8799ecd9827ee203bfb690536ac84ff4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 4 Feb 2019 13:43:58 -0300
Subject: [PATCH v2 2/2] Propagate replica identity to partitions
---
src/backend/commands/tablecmds.c | 108 +++++++++++++++--
src/bin/psql/describe.c | 3 +-
src/test/regress/expected/replica_identity.out | 160 +++++++++++++++++++++++++
src/test/regress/sql/replica_identity.sql | 43 +++++++
4 files changed, 304 insertions(+), 10 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 877bac506f..22cec85ab0 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,7 @@ static void truncate_check_activity(Relation rel);
static void RangeVarCallbackForTruncate(const RangeVar *relation,
Oid relId, Oid oldRelId, void *arg);
static List *MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr);
+ bool is_partition, List **supconstr, char *ri_type);
static bool MergeCheckConstraint(List *constraints, char *name, Node *expr);
static void MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel);
static void MergeConstraintsIntoExisting(Relation child_rel, Relation parent_rel);
@@ -485,6 +485,7 @@ static void AttachPartitionEnsureIndexes(Relation rel, Relation attachrel);
static void QueuePartitionConstraintValidation(List **wqueue, Relation scanrel,
List *partConstraint,
bool validate_default);
+static void MatchReplicaIdentity(Relation tgtrel, Relation srcrel);
static void CloneRowTriggersToPartition(Relation parent, Relation partition);
static ObjectAddress ATExecDetachPartition(Relation rel, RangeVar *name);
static ObjectAddress ATExecAttachPartitionIdx(List **wqueue, Relation rel,
@@ -527,6 +528,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
TupleDesc descriptor;
List *inheritOids;
List *old_constraints;
+ char ri_type;
List *rawDefaults;
List *cookedDefaults;
Datum reloptions;
@@ -708,7 +710,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
MergeAttributes(stmt->tableElts, inheritOids,
stmt->relation->relpersistence,
stmt->partbound != NULL,
- &old_constraints);
+ &old_constraints, &ri_type);
/*
* Create a tuple descriptor from the relation schema. Note that this
@@ -1014,6 +1016,10 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
*/
CloneForeignKeyConstraints(parentId, relationId, NULL);
+ /* Propagate REPLICA IDENTITY information too */
+ if (ri_type != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(rel, parent);
+
table_close(parent, NoLock);
}
@@ -1873,6 +1879,8 @@ storage_name(char c)
* Output arguments:
* 'supconstr' receives a list of constraints belonging to the parents,
* updated as necessary to be valid for the child.
+ * 'ri_type' receives the replica identity type of the last parent seen,
+ * or default if none.
*
* Return value:
* Completed schema list.
@@ -1914,11 +1922,15 @@ storage_name(char c)
* (4) Otherwise the inherited default is used.
* Rule (3) is new in Postgres 7.1; in earlier releases you got a
* rather arbitrary choice of which parent default to use.
+ *
+ * It only makes sense to use the returned 'ri_type' when there's a single
+ * parent, such as in declarative partitioning.
*----------
*/
static List *
MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr)
+ bool is_partition, List **supconstr,
+ char *ri_type)
{
ListCell *entry;
List *inhSchema = NIL;
@@ -2015,6 +2027,11 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
}
/*
+ * Initialize replica identity to default; parents may change it later
+ */
+ *ri_type = REPLICA_IDENTITY_DEFAULT;
+
+ /*
* Scan the parents left-to-right, and merge their attributes to form a
* list of inherited attributes (inhSchema). Also check to see if we need
* to inherit an OID column.
@@ -2095,6 +2112,9 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
? "cannot inherit from temporary relation of another session"
: "cannot create as partition of temporary relation of another session")));
+ /* Indicate replica identity back to caller */
+ *ri_type = relation->rd_rel->relreplident;
+
/*
* We should have an UNDER permission flag for this, but for now,
* demand that creator of a child table own the parent.
@@ -3935,7 +3955,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
case AT_ReplicaIdentity: /* REPLICA IDENTITY ... */
ATSimplePermissions(rel, ATT_TABLE | ATT_MATVIEW);
pass = AT_PASS_MISC;
- /* This command never recurses */
+ /* Recursion occurs during execution phase */
/* No command-specific prep needed */
break;
case AT_EnableTrig: /* ENABLE TRIGGER variants */
@@ -12756,7 +12776,7 @@ ATExecDropOf(Relation rel, LOCKMODE lockmode)
*/
static void
relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
- bool is_internal)
+ bool is_internal, LOCKMODE lockmode)
{
Relation pg_index;
Relation pg_class;
@@ -12847,6 +12867,42 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
}
table_close(pg_index, RowExclusiveLock);
+
+ /*
+ * If there are any partitions, handle them too.
+ */
+ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ {
+ PartitionDesc pd = RelationGetPartitionDesc(rel);
+
+ for (int i = 0; i < pd->nparts; i++)
+ {
+ Relation part = table_open(pd->oids[i], lockmode);
+ Oid idxOid;
+
+ if (ri_type == REPLICA_IDENTITY_INDEX)
+ {
+ idxOid = index_get_partition(part, indexOid);
+ if (!OidIsValid(idxOid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("replica index does not exist in partition \"%s\"",
+ RelationGetRelationName(part))));
+
+ LockRelationOid(idxOid, ShareLock);
+ }
+ else
+ idxOid = InvalidOid;
+
+ idxOid = ri_type == REPLICA_IDENTITY_INDEX ?
+ index_get_partition(part, indexOid) : InvalidOid;
+
+ relation_mark_replica_identity(part, ri_type, idxOid, true,
+ lockmode);
+
+ table_close(part, NoLock);
+ }
+ }
}
/*
@@ -12861,17 +12917,20 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
if (stmt->identity_type == REPLICA_IDENTITY_DEFAULT)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_FULL)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_NOTHING)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_INDEX)
@@ -12959,7 +13018,8 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
}
/* This index is suitable for use as a replica identity. Mark it. */
- relation_mark_replica_identity(rel, stmt->identity_type, indexOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, indexOid, true,
+ lockmode);
index_close(indexRel, NoLock);
}
@@ -14664,6 +14724,10 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
/* and triggers */
CloneRowTriggersToPartition(rel, attachrel);
+ /* Propagate REPLICA IDENTITY information */
+ if (rel->rd_rel->relreplident != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(attachrel, rel);
+
/*
* Clone foreign key constraints, and setup for Phase 3 to verify them.
*/
@@ -14915,6 +14979,32 @@ AttachPartitionEnsureIndexes(Relation rel, Relation attachrel)
}
/*
+ * Set up partRel's (a partition) replica identity to match parentRel's (its
+ * parent).
+ */
+static void
+MatchReplicaIdentity(Relation partRel, Relation srcrel)
+{
+ Oid ri_index;
+
+ if (srcrel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ {
+ ri_index = index_get_partition(partRel,
+ RelationGetReplicaIndex(srcrel));
+ if (!OidIsValid(ri_index))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("replica index does not exist in partition \"%s\"",
+ RelationGetRelationName(partRel))));
+ }
+ else
+ ri_index = InvalidOid;
+
+ relation_mark_replica_identity(partRel, srcrel->rd_rel->relreplident,
+ ri_index, true, AccessExclusiveLock);
+}
+
+/*
* CloneRowTriggersToPartition
* subroutine for ATExecAttachPartition/DefineRelation to create row
* triggers on partitions
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4da6719ce7..6145a000cb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3113,7 +3113,8 @@ describeOneTableDetails(const char *schemaname,
if (verbose &&
(tableinfo.relkind == RELKIND_RELATION ||
- tableinfo.relkind == RELKIND_MATVIEW) &&
+ tableinfo.relkind == RELKIND_MATVIEW ||
+ tableinfo.relkind == RELKIND_PARTITIONED_TABLE) &&
/*
* No need to display default values; we already display a REPLICA
diff --git a/src/test/regress/expected/replica_identity.out b/src/test/regress/expected/replica_identity.out
index 175ecd2879..d6014df840 100644
--- a/src/test/regress/expected/replica_identity.out
+++ b/src/test/regress/expected/replica_identity.out
@@ -181,3 +181,163 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+ Table "public.test_replica_identity_part2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (1000) TO (2000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 1000) AND (a < 2000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part11
+ Table "public.test_replica_identity_part11"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part1 FOR VALUES FROM (1000) TO (1500)
+Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 1000) AND (a IS NOT NULL) AND (a >= 1000) AND (a < 1500))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part
+ Partitioned table "public.test_replica_identity_part"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition key: RANGE (a)
+Partitions: test_replica_identity_part1 FOR VALUES FROM (0) TO (1000), PARTITIONED,
+ test_replica_identity_part2 FOR VALUES FROM (1000) TO (2000),
+ test_replica_identity_part3 FOR VALUES FROM (2000) TO (3000),
+ test_replica_identity_part4 FOR VALUES FROM (3000) TO (4000)
+Replica Identity: FULL
+
+\d+ test_replica_identity_part3
+ Table "public.test_replica_identity_part3"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (2000) TO (3000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 2000) AND (a < 3000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part4
+ Table "public.test_replica_identity_part4"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (3000) TO (4000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 3000) AND (a < 4000))
+Replica Identity: FULL
+
+ALTER TABLE test_replica_identity_part ALTER a SET NOT NULL;
+CREATE UNIQUE INDEX trip_b_idx ON test_replica_identity_part (a);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY USING INDEX trip_b_idx;
+\d+ test_replica_identity_part2
+ Table "public.test_replica_identity_part2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (1000) TO (2000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 1000) AND (a < 2000))
+Indexes:
+ "test_replica_identity_part2_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part11
+ Table "public.test_replica_identity_part11"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part1 FOR VALUES FROM (1000) TO (1500)
+Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 1000) AND (a IS NOT NULL) AND (a >= 1000) AND (a < 1500))
+Indexes:
+ "test_replica_identity_part11_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part
+ Partitioned table "public.test_replica_identity_part"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition key: RANGE (a)
+Indexes:
+ "trip_b_idx" UNIQUE, btree (a) REPLICA IDENTITY
+Partitions: test_replica_identity_part1 FOR VALUES FROM (0) TO (1000), PARTITIONED,
+ test_replica_identity_part2 FOR VALUES FROM (1000) TO (2000),
+ test_replica_identity_part3 FOR VALUES FROM (2000) TO (3000),
+ test_replica_identity_part4 FOR VALUES FROM (3000) TO (4000)
+
+\d+ test_replica_identity_part3
+ Table "public.test_replica_identity_part3"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (2000) TO (3000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 2000) AND (a < 3000))
+Indexes:
+ "test_replica_identity_part3_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part4
+ Table "public.test_replica_identity_part4"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (3000) TO (4000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 3000) AND (a < 4000))
+Indexes:
+ "test_replica_identity_part4_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+ Table "public.test_replica_identity_inh"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Child tables: test_replica_identity_cld,
+ test_replica_identity_cld2
+Replica Identity: FULL
+
+\d+ test_replica_identity_cld
+ Table "public.test_replica_identity_cld"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
+\d+ test_replica_identity_cld2
+ Table "public.test_replica_identity_cld2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
diff --git a/src/test/regress/sql/replica_identity.sql b/src/test/regress/sql/replica_identity.sql
index b08a3623b8..9e309796f2 100644
--- a/src/test/regress/sql/replica_identity.sql
+++ b/src/test/regress/sql/replica_identity.sql
@@ -77,3 +77,46 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+\d+ test_replica_identity_part11
+\d+ test_replica_identity_part
+\d+ test_replica_identity_part3
+\d+ test_replica_identity_part4
+
+ALTER TABLE test_replica_identity_part ALTER a SET NOT NULL;
+CREATE UNIQUE INDEX trip_b_idx ON test_replica_identity_part (a);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY USING INDEX trip_b_idx;
+\d+ test_replica_identity_part2
+\d+ test_replica_identity_part11
+\d+ test_replica_identity_part
+\d+ test_replica_identity_part3
+\d+ test_replica_identity_part4
+
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+\d+ test_replica_identity_cld
+\d+ test_replica_identity_cld2
--
2.11.0
Actually, index-based replica identities failed in pg_dump: we first
create the index ONLY on the partitioned table (marking it as invalid),
so when we immediately do the ALTER TABLE/REPLICA IDENTITY, it rejects
the invalid index. If we change the code to allow invalid indexes on
partitioned tables to become replica identities, we hit the problem that
the index didn't exist when processing the partition list. In order to
fix that, I added a flag so that partitions are allowed not to have the
index, in hopes that the missing index are created in subsequent
commands; those indexes should become valid & identity afterwards.
There's a small emerging problem, which is that if you have an invalid
index marked as replica identity, you cannot create any more partitions;
the reason is that we want to propagate the replica identity to the
partition, but the index is not there (since invalid indexes are not
propagated). I don't think this case is worth supporting; it can be
fixed but requires some work[1]In DefineRelation, we obtain the list of indexes to clone by RelationGetIndexList, which ignores invalid indexes. In order to implement this we'd need to include invalid indexes in that list (possibly by just not using RelationGetIndexList.).
New patch attached.
[1]: In DefineRelation, we obtain the list of indexes to clone by RelationGetIndexList, which ignores invalid indexes. In order to implement this we'd need to include invalid indexes in that list (possibly by just not using RelationGetIndexList.)
RelationGetIndexList, which ignores invalid indexes. In order to
implement this we'd need to include invalid indexes in that list
(possibly by just not using RelationGetIndexList.)
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v3-0001-index_get_partition.patchtext/x-diff; charset=us-asciiDownload
From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v3 1/2] index_get_partition
---
src/backend/catalog/partition.c | 35 +++++++++++++++++++++++++++++++++++
src/backend/commands/tablecmds.c | 40 +++++++++++-----------------------------
src/include/catalog/partition.h | 1 +
3 files changed, 47 insertions(+), 29 deletions(-)
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
}
/*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+ List *idxlist = RelationGetIndexList(partition);
+ ListCell *l;
+
+ foreach(l, idxlist)
+ {
+ Oid partIdx = lfirst_oid(l);
+ HeapTuple tup;
+ Form_pg_class classForm;
+ bool ispartition;
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+ if (!tup)
+ elog(ERROR, "cache lookup failed for relation %u", partIdx);
+ classForm = (Form_pg_class) GETSTRUCT(tup);
+ ispartition = classForm->relispartition;
+ ReleaseSysCache(tup);
+ if (!ispartition)
+ continue;
+ if (get_partition_parent(lfirst_oid(l)) == indexId)
+ {
+ list_free(idxlist);
+ return partIdx;
+ }
+ }
+
+ return InvalidOid;
+}
+
+/*
* map_partition_varattnos - maps varattno of any Vars in expr from the
* attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
* may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
static void
refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
{
- Relation pg_inherits;
- ScanKeyData key;
- HeapTuple tuple;
- SysScanDesc scan;
+ Oid existingIdx;
- pg_inherits = table_open(InheritsRelationId, AccessShareLock);
- ScanKeyInit(&key, Anum_pg_inherits_inhparent,
- BTEqualStrategyNumber, F_OIDEQ,
- ObjectIdGetDatum(RelationGetRelid(parentIdx)));
- scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
- NULL, 1, &key);
- while (HeapTupleIsValid(tuple = systable_getnext(scan)))
- {
- Form_pg_inherits inhForm;
- Oid tab;
-
- inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
- tab = IndexGetRelation(inhForm->inhrelid, false);
- if (tab == RelationGetRelid(partitionTbl))
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
- RelationGetRelationName(partIdx),
- RelationGetRelationName(parentIdx)),
- errdetail("Another index is already attached for partition \"%s\".",
- RelationGetRelationName(partitionTbl))));
- }
-
- systable_endscan(scan);
- table_close(pg_inherits, AccessShareLock);
+ existingIdx = index_get_partition(partitionTbl,
+ RelationGetRelid(parentIdx));
+ if (OidIsValid(existingIdx))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+ RelationGetRelationName(partIdx),
+ RelationGetRelationName(parentIdx)),
+ errdetail("Another index is already attached for partition \"%s\".",
+ RelationGetRelationName(partitionTbl))));
}
/*
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 5685d2fd57..7f0b198bcb 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -35,6 +35,7 @@ typedef struct PartitionDescData
extern Oid get_partition_parent(Oid relid);
extern List *get_partition_ancestors(Oid relid);
+extern Oid index_get_partition(Relation partition, Oid indexId);
extern List *map_partition_varattnos(List *expr, int fromrel_varno,
Relation to_rel, Relation from_rel,
bool *found_whole_row);
--
2.11.0
v3-0002-Propagate-replica-identity-to-partitions.patchtext/x-diff; charset=us-asciiDownload
From cb541cdd0aa4a045db72239e8a0a6795be5820e4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 4 Feb 2019 13:43:58 -0300
Subject: [PATCH v3 2/2] Propagate replica identity to partitions
---
src/backend/commands/tablecmds.c | 132 ++++++++++++++++++--
src/bin/psql/describe.c | 3 +-
src/test/regress/expected/replica_identity.out | 160 +++++++++++++++++++++++++
src/test/regress/sql/replica_identity.sql | 43 +++++++
4 files changed, 326 insertions(+), 12 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 877bac506f..b26872e78f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,7 @@ static void truncate_check_activity(Relation rel);
static void RangeVarCallbackForTruncate(const RangeVar *relation,
Oid relId, Oid oldRelId, void *arg);
static List *MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr);
+ bool is_partition, List **supconstr, char *ri_type);
static bool MergeCheckConstraint(List *constraints, char *name, Node *expr);
static void MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel);
static void MergeConstraintsIntoExisting(Relation child_rel, Relation parent_rel);
@@ -485,6 +485,7 @@ static void AttachPartitionEnsureIndexes(Relation rel, Relation attachrel);
static void QueuePartitionConstraintValidation(List **wqueue, Relation scanrel,
List *partConstraint,
bool validate_default);
+static void MatchReplicaIdentity(Relation tgtrel, Relation srcrel);
static void CloneRowTriggersToPartition(Relation parent, Relation partition);
static ObjectAddress ATExecDetachPartition(Relation rel, RangeVar *name);
static ObjectAddress ATExecAttachPartitionIdx(List **wqueue, Relation rel,
@@ -527,6 +528,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
TupleDesc descriptor;
List *inheritOids;
List *old_constraints;
+ char ri_type;
List *rawDefaults;
List *cookedDefaults;
Datum reloptions;
@@ -708,7 +710,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
MergeAttributes(stmt->tableElts, inheritOids,
stmt->relation->relpersistence,
stmt->partbound != NULL,
- &old_constraints);
+ &old_constraints, &ri_type);
/*
* Create a tuple descriptor from the relation schema. Note that this
@@ -1014,6 +1016,10 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
*/
CloneForeignKeyConstraints(parentId, relationId, NULL);
+ /* Propagate REPLICA IDENTITY information too */
+ if (ri_type != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(rel, parent);
+
table_close(parent, NoLock);
}
@@ -1873,6 +1879,8 @@ storage_name(char c)
* Output arguments:
* 'supconstr' receives a list of constraints belonging to the parents,
* updated as necessary to be valid for the child.
+ * 'ri_type' receives the replica identity type of the last parent seen,
+ * or default if none.
*
* Return value:
* Completed schema list.
@@ -1914,11 +1922,15 @@ storage_name(char c)
* (4) Otherwise the inherited default is used.
* Rule (3) is new in Postgres 7.1; in earlier releases you got a
* rather arbitrary choice of which parent default to use.
+ *
+ * It only makes sense to use the returned 'ri_type' when there's a single
+ * parent, such as in declarative partitioning.
*----------
*/
static List *
MergeAttributes(List *schema, List *supers, char relpersistence,
- bool is_partition, List **supconstr)
+ bool is_partition, List **supconstr,
+ char *ri_type)
{
ListCell *entry;
List *inhSchema = NIL;
@@ -2015,6 +2027,11 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
}
/*
+ * Initialize replica identity to default; parents may change it later
+ */
+ *ri_type = REPLICA_IDENTITY_DEFAULT;
+
+ /*
* Scan the parents left-to-right, and merge their attributes to form a
* list of inherited attributes (inhSchema). Also check to see if we need
* to inherit an OID column.
@@ -2095,6 +2112,9 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
? "cannot inherit from temporary relation of another session"
: "cannot create as partition of temporary relation of another session")));
+ /* Indicate replica identity back to caller */
+ *ri_type = relation->rd_rel->relreplident;
+
/*
* We should have an UNDER permission flag for this, but for now,
* demand that creator of a child table own the parent.
@@ -3935,7 +3955,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
case AT_ReplicaIdentity: /* REPLICA IDENTITY ... */
ATSimplePermissions(rel, ATT_TABLE | ATT_MATVIEW);
pass = AT_PASS_MISC;
- /* This command never recurses */
+ /* Recursion occurs during execution phase */
/* No command-specific prep needed */
break;
case AT_EnableTrig: /* ENABLE TRIGGER variants */
@@ -12753,10 +12773,13 @@ ATExecDropOf(Relation rel, LOCKMODE lockmode)
*
* Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a suitable
* index. Otherwise, it should be InvalidOid.
+ *
+ * 'permissive' means to ignore the case where a partitioned table contains
+ * a partition without the index. If false, raise an error.
*/
static void
relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
- bool is_internal)
+ bool is_internal, bool permissive, LOCKMODE lockmode)
{
Relation pg_index;
Relation pg_class;
@@ -12847,6 +12870,58 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
}
table_close(pg_index, RowExclusiveLock);
+
+ /*
+ * If there are any partitions, handle them too.
+ */
+ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ {
+ PartitionDesc pd = RelationGetPartitionDesc(rel);
+
+ for (int i = 0; i < pd->nparts; i++)
+ {
+ Relation part = table_open(pd->oids[i], lockmode);
+ Oid idxOid;
+
+ if (ri_type == REPLICA_IDENTITY_INDEX)
+ {
+ /*
+ * When using an index as identity, and some partition does not
+ * have the index, we either raise an error (if not in
+ * "permissive" mode), or ignore the partition; this happens
+ * when the index hierarchy is being restored by pg_dump. The
+ * only thing we can do at this point in that case is hope that
+ * there will be another command creating the right index soon.
+ */
+ idxOid = index_get_partition(part, indexOid);
+ if (!OidIsValid(idxOid))
+ {
+ if (permissive)
+ {
+ table_close(part, NoLock);
+ continue;
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("replica index does not exist in partition \"%s\"",
+ RelationGetRelationName(part))));
+ }
+
+ LockRelationOid(idxOid, ShareLock);
+ }
+ else
+ idxOid = InvalidOid;
+
+ idxOid = ri_type == REPLICA_IDENTITY_INDEX ?
+ index_get_partition(part, indexOid) : InvalidOid;
+
+ relation_mark_replica_identity(part, ri_type, idxOid, true,
+ permissive, lockmode);
+
+ table_close(part, NoLock);
+ }
+ }
}
/*
@@ -12861,17 +12936,20 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
if (stmt->identity_type == REPLICA_IDENTITY_DEFAULT)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ false, lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_FULL)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ false, lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_NOTHING)
{
- relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, InvalidOid, true,
+ false, lockmode);
return;
}
else if (stmt->identity_type == REPLICA_IDENTITY_INDEX)
@@ -12925,8 +13003,9 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use partial index \"%s\" as replica identity",
RelationGetRelationName(indexRel))));
- /* And neither are invalid indexes. */
- if (!indexRel->rd_index->indisvalid)
+ /* And neither are invalid indexes, except on partitioned tables. */
+ if (!indexRel->rd_index->indisvalid &&
+ rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use invalid index \"%s\" as replica identity",
@@ -12959,7 +13038,8 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
}
/* This index is suitable for use as a replica identity. Mark it. */
- relation_mark_replica_identity(rel, stmt->identity_type, indexOid, true);
+ relation_mark_replica_identity(rel, stmt->identity_type, indexOid, true,
+ true, lockmode);
index_close(indexRel, NoLock);
}
@@ -14664,6 +14744,10 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
/* and triggers */
CloneRowTriggersToPartition(rel, attachrel);
+ /* Propagate REPLICA IDENTITY information */
+ if (rel->rd_rel->relreplident != REPLICA_IDENTITY_DEFAULT)
+ MatchReplicaIdentity(attachrel, rel);
+
/*
* Clone foreign key constraints, and setup for Phase 3 to verify them.
*/
@@ -14915,6 +14999,32 @@ AttachPartitionEnsureIndexes(Relation rel, Relation attachrel)
}
/*
+ * Set up partRel's (a partition) replica identity to match parentRel's (its
+ * parent).
+ */
+static void
+MatchReplicaIdentity(Relation partRel, Relation srcrel)
+{
+ Oid ri_index;
+
+ if (srcrel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
+ {
+ ri_index = index_get_partition(partRel,
+ RelationGetReplicaIndex(srcrel));
+ if (!OidIsValid(ri_index))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("replica index does not exist in partition \"%s\"",
+ RelationGetRelationName(partRel))));
+ }
+ else
+ ri_index = InvalidOid;
+
+ relation_mark_replica_identity(partRel, srcrel->rd_rel->relreplident,
+ ri_index, true, false, AccessExclusiveLock);
+}
+
+/*
* CloneRowTriggersToPartition
* subroutine for ATExecAttachPartition/DefineRelation to create row
* triggers on partitions
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4da6719ce7..6145a000cb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3113,7 +3113,8 @@ describeOneTableDetails(const char *schemaname,
if (verbose &&
(tableinfo.relkind == RELKIND_RELATION ||
- tableinfo.relkind == RELKIND_MATVIEW) &&
+ tableinfo.relkind == RELKIND_MATVIEW ||
+ tableinfo.relkind == RELKIND_PARTITIONED_TABLE) &&
/*
* No need to display default values; we already display a REPLICA
diff --git a/src/test/regress/expected/replica_identity.out b/src/test/regress/expected/replica_identity.out
index 175ecd2879..d6014df840 100644
--- a/src/test/regress/expected/replica_identity.out
+++ b/src/test/regress/expected/replica_identity.out
@@ -181,3 +181,163 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+ Table "public.test_replica_identity_part2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (1000) TO (2000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 1000) AND (a < 2000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part11
+ Table "public.test_replica_identity_part11"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part1 FOR VALUES FROM (1000) TO (1500)
+Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 1000) AND (a IS NOT NULL) AND (a >= 1000) AND (a < 1500))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part
+ Partitioned table "public.test_replica_identity_part"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition key: RANGE (a)
+Partitions: test_replica_identity_part1 FOR VALUES FROM (0) TO (1000), PARTITIONED,
+ test_replica_identity_part2 FOR VALUES FROM (1000) TO (2000),
+ test_replica_identity_part3 FOR VALUES FROM (2000) TO (3000),
+ test_replica_identity_part4 FOR VALUES FROM (3000) TO (4000)
+Replica Identity: FULL
+
+\d+ test_replica_identity_part3
+ Table "public.test_replica_identity_part3"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (2000) TO (3000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 2000) AND (a < 3000))
+Replica Identity: FULL
+
+\d+ test_replica_identity_part4
+ Table "public.test_replica_identity_part4"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (3000) TO (4000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 3000) AND (a < 4000))
+Replica Identity: FULL
+
+ALTER TABLE test_replica_identity_part ALTER a SET NOT NULL;
+CREATE UNIQUE INDEX trip_b_idx ON test_replica_identity_part (a);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY USING INDEX trip_b_idx;
+\d+ test_replica_identity_part2
+ Table "public.test_replica_identity_part2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (1000) TO (2000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 1000) AND (a < 2000))
+Indexes:
+ "test_replica_identity_part2_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part11
+ Table "public.test_replica_identity_part11"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part1 FOR VALUES FROM (1000) TO (1500)
+Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 1000) AND (a IS NOT NULL) AND (a >= 1000) AND (a < 1500))
+Indexes:
+ "test_replica_identity_part11_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part
+ Partitioned table "public.test_replica_identity_part"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition key: RANGE (a)
+Indexes:
+ "trip_b_idx" UNIQUE, btree (a) REPLICA IDENTITY
+Partitions: test_replica_identity_part1 FOR VALUES FROM (0) TO (1000), PARTITIONED,
+ test_replica_identity_part2 FOR VALUES FROM (1000) TO (2000),
+ test_replica_identity_part3 FOR VALUES FROM (2000) TO (3000),
+ test_replica_identity_part4 FOR VALUES FROM (3000) TO (4000)
+
+\d+ test_replica_identity_part3
+ Table "public.test_replica_identity_part3"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (2000) TO (3000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 2000) AND (a < 3000))
+Indexes:
+ "test_replica_identity_part3_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+\d+ test_replica_identity_part4
+ Table "public.test_replica_identity_part4"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+ b | integer | | | | plain | |
+Partition of: test_replica_identity_part FOR VALUES FROM (3000) TO (4000)
+Partition constraint: ((a IS NOT NULL) AND (a >= 3000) AND (a < 4000))
+Indexes:
+ "test_replica_identity_part4_a_idx" UNIQUE, btree (a) REPLICA IDENTITY
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+ Table "public.test_replica_identity_inh"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Child tables: test_replica_identity_cld,
+ test_replica_identity_cld2
+Replica Identity: FULL
+
+\d+ test_replica_identity_cld
+ Table "public.test_replica_identity_cld"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
+\d+ test_replica_identity_cld2
+ Table "public.test_replica_identity_cld2"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | | | plain | |
+Inherits: test_replica_identity_inh
+
diff --git a/src/test/regress/sql/replica_identity.sql b/src/test/regress/sql/replica_identity.sql
index b08a3623b8..9e309796f2 100644
--- a/src/test/regress/sql/replica_identity.sql
+++ b/src/test/regress/sql/replica_identity.sql
@@ -77,3 +77,46 @@ SELECT relreplident FROM pg_class WHERE oid = 'test_replica_identity'::regclass;
DROP TABLE test_replica_identity;
DROP TABLE test_replica_identity_othertable;
+
+----
+-- Make sure it propagates to partitions
+----
+CREATE TABLE test_replica_identity_part (a int, b int) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part1 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
+CREATE TABLE test_replica_identity_part2 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (1000) TO (2000);
+CREATE TABLE test_replica_identity_part11 PARTITION OF test_replica_identity_part1
+ FOR VALUES FROM (1000) TO (1500);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_part3 PARTITION OF test_replica_identity_part
+ FOR VALUES FROM (2000) TO (3000);
+CREATE TABLE test_replica_identity_part4 (LIKE test_replica_identity_part);
+ALTER TABLE test_replica_identity_part ATTACH PARTITION test_replica_identity_part4
+ FOR VALUES FROM (3000) TO (4000);
+\d+ test_replica_identity_part2
+\d+ test_replica_identity_part11
+\d+ test_replica_identity_part
+\d+ test_replica_identity_part3
+\d+ test_replica_identity_part4
+
+ALTER TABLE test_replica_identity_part ALTER a SET NOT NULL;
+CREATE UNIQUE INDEX trip_b_idx ON test_replica_identity_part (a);
+ALTER TABLE test_replica_identity_part REPLICA IDENTITY USING INDEX trip_b_idx;
+\d+ test_replica_identity_part2
+\d+ test_replica_identity_part11
+\d+ test_replica_identity_part
+\d+ test_replica_identity_part3
+\d+ test_replica_identity_part4
+
+
+----
+-- Check behavior with inherited tables
+----
+CREATE TABLE test_replica_identity_inh (a int);
+CREATE TABLE test_replica_identity_cld () INHERITS (test_replica_identity_inh);
+ALTER TABLE test_replica_identity_inh REPLICA IDENTITY FULL;
+CREATE TABLE test_replica_identity_cld2 () INHERITS (test_replica_identity_inh);
+\d+ test_replica_identity_inh
+\d+ test_replica_identity_cld
+\d+ test_replica_identity_cld2
--
2.11.0
On Tue, Feb 5, 2019 at 12:54 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Actually, index-based replica identities failed in pg_dump: we first
create the index ONLY on the partitioned table (marking it as invalid),
so when we immediately do the ALTER TABLE/REPLICA IDENTITY, it rejects
the invalid index. If we change the code to allow invalid indexes on
partitioned tables to become replica identities, we hit the problem that
the index didn't exist when processing the partition list. In order to
fix that, I added a flag so that partitions are allowed not to have the
index, in hopes that the missing index are created in subsequent
commands; those indexes should become valid & identity afterwards.There's a small emerging problem, which is that if you have an invalid
index marked as replica identity, you cannot create any more partitions;
the reason is that we want to propagate the replica identity to the
partition, but the index is not there (since invalid indexes are not
propagated). I don't think this case is worth supporting; it can be
fixed but requires some work[1].New patch attached.
Could there be a race condition somewhere? The idea and the code looks
reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
patch and concurrent partition creation, I've got the following error on ATTACH
PARTITION:
ERROR: 42P17: replica index does not exist in partition "test373"
LOCATION: MatchReplicaIdentity, tablecmds.c:15018
This error seems unstable, other time I've got a deadlock. And I don't observe
this behaviour on the master.
On 2019-Feb-07, Dmitry Dolgov wrote:
Could there be a race condition somewhere? The idea and the code looks
reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
patch and concurrent partition creation, I've got the following error on ATTACH
PARTITION:ERROR: 42P17: replica index does not exist in partition "test373"
LOCATION: MatchReplicaIdentity, tablecmds.c:15018This error seems unstable, other time I've got a deadlock. And I don't observe
this behaviour on the master.
Clearly there is a problem somewhere. I'll investigate. Thanks for
testing.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Feb-07, Dmitry Dolgov wrote:
Could there be a race condition somewhere? The idea and the code looks
reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
patch and concurrent partition creation, I've got the following error on ATTACH
PARTITION:ERROR: 42P17: replica index does not exist in partition "test373"
LOCATION: MatchReplicaIdentity, tablecmds.c:15018This error seems unstable, other time I've got a deadlock. And I don't observe
this behaviour on the master.
Can you share your reproducer?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Feb 15, 2019 at 10:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Feb-07, Dmitry Dolgov wrote:
Could there be a race condition somewhere? The idea and the code looks
reasonable, but when I'm testing ALTER TABLE ... REPLICA IDENTITY with this
patch and concurrent partition creation, I've got the following error on ATTACH
PARTITION:ERROR: 42P17: replica index does not exist in partition "test373"
LOCATION: MatchReplicaIdentity, tablecmds.c:15018This error seems unstable, other time I've got a deadlock. And I don't observe
this behaviour on the master.Can you share your reproducer?
Sure, I was running concurrently the attached script, that creates a chain of
nested partitions test1,test2,..., and in a separate session:
alter table test1 replica identity full;
Checking this time, I've got from the script:
ERROR: 40P01: deadlock detected
DETAIL: Process 10547 waits for AccessShareLock on relation 30449
of database
29898; blocked by process 9714.
Process 9714 waits for AccessExclusiveLock on relation 30454 of
database 29898;
blocked by process 10547.
HINT: See server log for query details.
LOCATION: DeadLockReport, deadlock.c:1140
Time: 1001.917 ms (00:01.002)
Attachments:
On Mon, Feb 4, 2019 at 11:30 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
If you do ALTER TABLE .. REPLICA IDENTITY to a partitioned table, the
command operates on the parent table itself and does not propagate to
partitions. Why is this? Maybe not recursing was the right call when
we only had regular inheritance (back in 9.4), but since partitioned
tables got introduced, I think it is completely the other way around:
not recursing is an usability fail.
This sounds an awful like the TABLESPACE case. I wonder how many more
similar things there are.
It's not unreasonable to use the parent's REPLICA IDENTITY setting as
the default for new partitions, much as we now do for the TABLESPACE,
because the parent's replica identity is otherwise without meaning.
But I'm less convinced that it's reasonable to have changing the
parent's replica identity recurse to existing children, because:
1. That's different from the TABLESPACE case. We shouldn't just treat
each new instance of this problem as a green field where we can pick
any old behavior at random; it should be consistent as far as
reasonable, and
2. It confuses a change to the default for new partitions with a
change to the value for existing partitions. Say that you've got 5
existing partitions that use one setting, but now you want new
partitions to use a different setting. You can't get that with your
proposed semantics, because trying to change the default will change
all the existing partitions to the new value also, even though that's
not what you wanted.
We should really, really try to figure out all of the things that are
like this -- a property that is meaningless for the partitioned table
itself but may serve as a useful default for its children -- and try
to make them all work the same, ideally in one release, rather than
changing them at different times, back-patching sometimes, and having
no consistency in the details.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-Feb-19, Robert Haas wrote:
It's not unreasonable to use the parent's REPLICA IDENTITY setting as
the default for new partitions, much as we now do for the TABLESPACE,
because the parent's replica identity is otherwise without meaning.
But I'm less convinced that it's reasonable to have changing the
parent's replica identity recurse to existing children, because:1. That's different from the TABLESPACE case. We shouldn't just treat
each new instance of this problem as a green field where we can pick
any old behavior at random; it should be consistent as far as
reasonable,
Maybe we should be using the inheritance marker in the command to note
whether to recurse to partitions? That is, if you say
ALTER TABLE ONLY parent SET REPLICA IDENTITY
then we don't recurse and just change the parent table and future
partitions, whereas if you leave out the ONLY then it does affect
existing partitions.
However, I think TABLESPACE and any other property that involves
rewriting tables wholesale can reasonably be expected to behave
differently (w.r.t. recursing) from commands that just modify the
catalogs. I think we already have such a distinction somewhere.
EXPLAIN ALTER TABLE would sure be handy :-)
2. It confuses a change to the default for new partitions with a
change to the value for existing partitions. Say that you've got 5
existing partitions that use one setting, but now you want new
partitions to use a different setting. You can't get that with your
proposed semantics, because trying to change the default will change
all the existing partitions to the new value also, even though that's
not what you wanted.
If we make sure to heed ONLY (and I admit to not thinking about it in my
original patch) then I think this angle is covered.
We should really, really try to figure out all of the things that are
like this -- a property that is meaningless for the partitioned table
itself but may serve as a useful default for its children -- and try
to make them all work the same, ideally in one release, rather than
changing them at different times, back-patching sometimes, and having
no consistency in the details.
I have no argument against somebody doing that, but I don't think I have
the resources to do in in time for 12; on the other hand, leaving a
known misfeature unfixed because nobody has looked for hypothetical
similar bugs would be bad.
I'm not proposing to back-patch this, but I would love it if I can
borrow somebody's time machine to retroactively fix it for 11.0.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Feb 19, 2019 at 3:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Maybe we should be using the inheritance marker in the command to note
whether to recurse to partitions? That is, if you say
ALTER TABLE ONLY parent SET REPLICA IDENTITY
then we don't recurse and just change the parent table and future
partitions, whereas if you leave out the ONLY then it does affect
existing partitions.However, I think TABLESPACE and any other property that involves
rewriting tables wholesale can reasonably be expected to behave
differently (w.r.t. recursing) from commands that just modify the
catalogs. I think we already have such a distinction somewhere.
I don't really follow why that should be different, or why the user
should be expected to know or care whether a particular property
involves rewriting.
I have no argument against somebody doing that, but I don't think I have
the resources to do in in time for 12; on the other hand, leaving a
known misfeature unfixed because nobody has looked for hypothetical
similar bugs would be bad.
Oh, come on. If you don't have time to study the issue and come up
with a plan that can at least in theory be applied to all similar
cases in a principled way, whether or not you have time to apply it to
all of those cases, then you have no business committing anything at
all. You're basically saying that you don't have time to do this
right, so you're just going to do something at random and hope it
doesn't create too many problems later. That's terrible.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-Feb-19, Robert Haas wrote:
On Tue, Feb 19, 2019 at 3:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Maybe we should be using the inheritance marker in the command to note
whether to recurse to partitions? That is, if you say
ALTER TABLE ONLY parent SET REPLICA IDENTITY
then we don't recurse and just change the parent table and future
partitions, whereas if you leave out the ONLY then it does affect
existing partitions.However, I think TABLESPACE and any other property that involves
rewriting tables wholesale can reasonably be expected to behave
differently (w.r.t. recursing) from commands that just modify the
catalogs. I think we already have such a distinction somewhere.I don't really follow why that should be different, or why the user
should be expected to know or care whether a particular property
involves rewriting.
OK, let me concede that point -- it's not rewriting that makes things
act differently, but rather TABLESPACE (as well as some other things)
behave that way. ALTER TABLE ... SET DATA TYPE is the obvious
counterexample.
The Notes section of ALTER TABLE says:
: The actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY), as
: well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse to
: descendant tables; that is, they always act as though ONLY were specified.
: Adding a constraint recurses only for CHECK constraints that are not marked NO
: INHERIT.
Since REPLICA IDENTITY does not appear in this list, the documented
behavior is to recurse, per the description of the "name" parameter:
: The name (optionally schema-qualified) of an existing table to
: alter. If ONLY is specified before the table name, only that table
: is altered. If ONLY is not specified, the table and all its
: descendant tables (if any) are altered. Optionally, * can be
: specified after the table name to explicitly indicate that
: descendant tables are included.
I didn't come up with this on my own, as you imply.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Feb 19, 2019 at 5:41 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
OK, let me concede that point -- it's not rewriting that makes things
act differently, but rather TABLESPACE (as well as some other things)
behave that way. ALTER TABLE ... SET DATA TYPE is the obvious
counterexample.The Notes section of ALTER TABLE says:
: The actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY), as
: well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse to
: descendant tables; that is, they always act as though ONLY were specified.
: Adding a constraint recurses only for CHECK constraints that are not marked NO
: INHERIT.
I don't see that in the NOTES section for ALTER TABLE. Are you
looking at some patch, or at master?
More generally, our documentation in this area seems to be somewhat
lacking. For instance, the TABLESPACE section of the ALTER TABLE
documentation appears to make no mention of what exactly the behavior
is when applied to a partition table. It doesn't seem sufficient to
simply say "well, it doesn't recurse," because that would imply that
it doesn't do anything at all, which isn't the case.
Since REPLICA IDENTITY does not appear in this list, the documented
behavior is to recurse, per the description of the "name" parameter:: The name (optionally schema-qualified) of an existing table to
: alter. If ONLY is specified before the table name, only that table
: is altered. If ONLY is not specified, the table and all its
: descendant tables (if any) are altered. Optionally, * can be
: specified after the table name to explicitly indicate that
: descendant tables are included.I didn't come up with this on my own, as you imply.
Fair enough. I don't think it's entirely useful to think about this
in terms of which operations do and don't recurse; the question in my
mind is WHY some things recurse and other things don't, and what will
be easiest for users to understand. Obviously any property where the
parents and children have to match, like adding a column or changing a
data type, has to always recurse; nothing else is sensible. For
others, it seems we have a choice. It's unclear to me why we'd choose
to make REPLICA IDENTITY recurse by default and, say, OWNER not
recurse. In both cases, the default assumption is presumably that the
whole partitioning hierarchy should match, but in a particular case a
user could want to do something else. Consequently, I don't
understand why a user would want one of those operations to descend to
children by default and the other not. It feels like we're choosing
the behavior in individual cases, as Tom would say, with the aid of a
dartboard. Maybe there's some coherent principle here that I'm just
missing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-Feb-20, Robert Haas wrote:
On Tue, Feb 19, 2019 at 5:41 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
OK, let me concede that point -- it's not rewriting that makes things
act differently, but rather TABLESPACE (as well as some other things)
behave that way. ALTER TABLE ... SET DATA TYPE is the obvious
counterexample.The Notes section of ALTER TABLE says:
: The actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY), as
: well as the actions TRIGGER, CLUSTER, OWNER, and TABLESPACE never recurse to
: descendant tables; that is, they always act as though ONLY were specified.
: Adding a constraint recurses only for CHECK constraints that are not marked NO
: INHERIT.I don't see that in the NOTES section for ALTER TABLE. Are you
looking at some patch, or at master?
I was looking here:
https://www.postgresql.org/docs/11/sql-altertable.html
More generally, our documentation in this area seems to be somewhat
lacking. For instance, the TABLESPACE section of the ALTER TABLE
documentation appears to make no mention of what exactly the behavior
is when applied to a partition table. It doesn't seem sufficient to
simply say "well, it doesn't recurse," because that would imply that
it doesn't do anything at all, which isn't the case.
That's true. Maybe we can add a short blurb in the stanza for the SET
TABLESPACE form in the Description section. It currently says:
: This form changes the table's tablespace to the specified tablespace
: and moves the data file(s) associated with the table to the new
: tablespace. Indexes on the table, if any, are not moved; but they
: can be moved separately with additional SET TABLESPACE commands. All
: tables in the current database in a tablespace can be moved by using
: the ALL IN TABLESPACE form, which will lock all tables to be moved
: first and then move each one. This form also supports OWNED BY,
: which will only move tables owned by the roles specified. If the
: NOWAIT option is specified then the command will fail if it is
: unable to acquire all of the locks required immediately. Note that
: system catalogs are not moved by this command, use ALTER DATABASE or
: explicit ALTER TABLE invocations instead if desired. The
: information_schema relations are not considered part of the system
: catalogs and will be moved. See also CREATE TABLESPACE.
Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
separate para. I suggest:
: This form changes the table's tablespace to the specified tablespace
: and moves the data file(s) associated with the table to the new
: tablespace. Indexes on the table, if any, are not moved; but they
: can be moved separately with additional SET TABLESPACE commands.
: When applied to a partitioned table, nothing is moved, but any
: partitions created afterwards with CREATE TABLE PARTITION OF
: will use that tablespace.
:
: All
: tables in the current database in a tablespace can be moved by using
: the ALL IN TABLESPACE form, which will lock all tables to be moved
: first and then move each one. This form also supports OWNED BY,
: which will only move tables owned by the roles specified. If the
: NOWAIT option is specified then the command will fail if it is
: unable to acquire all of the locks required immediately. Note that
: system catalogs are not moved by this command, use ALTER DATABASE or
: explicit ALTER TABLE invocations instead if desired. The
: information_schema relations are not considered part of the system
: catalogs and will be moved. See also CREATE TABLESPACE.
Since REPLICA IDENTITY does not appear in this list, the documented
behavior is to recurse, per the description of the "name" parameter:: The name (optionally schema-qualified) of an existing table to
: alter. If ONLY is specified before the table name, only that table
: is altered. If ONLY is not specified, the table and all its
: descendant tables (if any) are altered. Optionally, * can be
: specified after the table name to explicitly indicate that
: descendant tables are included.I didn't come up with this on my own, as you imply.
Fair enough. I don't think it's entirely useful to think about this
in terms of which operations do and don't recurse; the question in my
mind is WHY some things recurse and other things don't, and what will
be easiest for users to understand.
I think the reason tablespace was made not to recurse was to avoid
acquiring access exclusive lock on too many tables at once, but I'm not
sure. This is very old behavior.
Obviously any property where the
parents and children have to match, like adding a column or changing a
data type, has to always recurse; nothing else is sensible. For
others, it seems we have a choice. It's unclear to me why we'd choose
to make REPLICA IDENTITY recurse by default and, say, OWNER not
recurse.
Ah. I did argue that OWNER should recurse:
/messages/by-id/20171017163203.uw7hmlqonidlfeqj@alvherre.pgsql
but it was too late then to change it for pg10. We can change it now,
surely.
In both cases, the default assumption is presumably that the
whole partitioning hierarchy should match, but in a particular case a
user could want to do something else. Consequently, I don't
understand why a user would want one of those operations to descend to
children by default and the other not.
I agree that OWNER TO should recurse for partitioned tables. I'm -0 on
changing it for legacy inheritance, but if the majority is to change it
for both, let's do that.
It feels like we're choosing the behavior in individual cases, as Tom
would say, with the aid of a dartboard. Maybe there's some coherent
principle here that I'm just missing.
I don't think that's a thing we're doing now. Rather we all did it as a
group years ago, and we're now finding that some of the darts landed in
the wrong spot of the board. I don't disagree with fixing all the ones
we know about (I volunteer to do that), but I don't want to conduct a
full audit of tablecmds.c.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, 20 Feb 2019 at 17:38, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Fair enough. I don't think it's entirely useful to think about this
in terms of which operations do and don't recurse; the question in my
mind is WHY some things recurse and other things don't, and what will
be easiest for users to understand.I think the reason tablespace was made not to recurse was to avoid
acquiring access exclusive lock on too many tables at once, but I'm not
sure. This is very old behavior.
-1 for changing that; here's why:
Partitioning is designed to support very large tables.
Since the table is very large there is a valid use case for moving older
partitions to other tablespaces, one at a time.
If we moved all partitions at once, while holding AccessExclusiveLock, it
would a) likely fail with out of space errors, b) if you are unlucky enough
for it to succeed it would lock the table for a ridiculously long time. The
main use case for changing the tablespace is to define where new partitions
should be created. So in this specific case only, recursion makes no sense.
Obviously any property where the
parents and children have to match, like adding a column or changing a
data type, has to always recurse; nothing else is sensible. For
others, it seems we have a choice. It's unclear to me why we'd choose
to make REPLICA IDENTITY recurse by default and, say, OWNER not
recurse.Ah. I did argue that OWNER should recurse:
/messages/by-id/20171017163203.uw7hmlqonidlfeqj@alvherre.pgsql
but it was too late then to change it for pg10. We can change it now,
surely.In both cases, the default assumption is presumably that the
whole partitioning hierarchy should match, but in a particular case a
user could want to do something else. Consequently, I don't
understand why a user would want one of those operations to descend to
children by default and the other not.I agree that OWNER TO should recurse for partitioned tables.
+1
That was clearly just missed. It's a strange thought to have a partitioned
table with different pieces owned by different users. So strange that the
lack of complaints about the recursion are surely because nobody ever tried
it in a real situation. I would prefer to disallow differences in ownership
across partitions, since that almost certainly prevents running sensible
DDL and its just a huge footgun.
Recursion should be the default for partitioned tables.
I'm -0 on
changing it for legacy inheritance, but if the majority is to change it
for both, let's do that.
-1 for changing legacy inheritance. Two separate features. Inheritance has
been around for many years and its feature set is stable. No need to touch
it.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 20, 2019 at 12:38 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
I don't see that in the NOTES section for ALTER TABLE. Are you
looking at some patch, or at master?I was looking here:
https://www.postgresql.org/docs/11/sql-altertable.html
OK, I was looking at the wrong thing. Not enough caffeine, apparently.
Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
separate para. I suggest:: This form changes the table's tablespace to the specified tablespace
: and moves the data file(s) associated with the table to the new
: tablespace. Indexes on the table, if any, are not moved; but they
: can be moved separately with additional SET TABLESPACE commands.
: When applied to a partitioned table, nothing is moved, but any
: partitions created afterwards with CREATE TABLE PARTITION OF
: will use that tablespace.
:
: All
: tables in the current database in a tablespace can be moved by using
: the ALL IN TABLESPACE form, which will lock all tables to be moved
: first and then move each one. This form also supports OWNED BY,
: which will only move tables owned by the roles specified. If the
: NOWAIT option is specified then the command will fail if it is
: unable to acquire all of the locks required immediately. Note that
: system catalogs are not moved by this command, use ALTER DATABASE or
: explicit ALTER TABLE invocations instead if desired. The
: information_schema relations are not considered part of the system
: catalogs and will be moved. See also CREATE TABLESPACE.
Seems reasonable.
I think the reason tablespace was made not to recurse was to avoid
acquiring access exclusive lock on too many tables at once, but I'm not
sure. This is very old behavior.Obviously any property where the
parents and children have to match, like adding a column or changing a
data type, has to always recurse; nothing else is sensible. For
others, it seems we have a choice. It's unclear to me why we'd choose
to make REPLICA IDENTITY recurse by default and, say, OWNER not
recurse.Ah. I did argue that OWNER should recurse:
/messages/by-id/20171017163203.uw7hmlqonidlfeqj@alvherre.pgsql
but it was too late then to change it for pg10. We can change it now,
surely.
Yeah, we could. I wonder, though, whether we should just make
everything recurse. I think that's what people are commonly going to
want, at least for partitioned tables, and it doesn't seem to me that
it would hurt anything to make the inheritance case work that way,
too. Right now it looks like we have this list of exceptions:
- actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY)
- TRIGGER
- CLUSTER
- OWNER
- TABLESPACE never recurse to descendant tables
- Adding a constraint recurses only for CHECK constraints that are not
marked NO INHERIT.
I have a feeling we eventually want this list to be empty, right? We
want a partitioned table to work as much like a non-partitioned table
as possible, unless the user asks for some other behavior. Going from
six exceptions to four and maybe having some of them depend on whether
it's partitioning or inheritance doesn't seem like it's as good and
clear as trying to adopt a really uniform policy.
I don't buy Simon's argument that we should treat TABLESPACE
differently because the tables might be really big and take a long
time to move. I agree that this could well be true, but nobody is
proposing to remove the ability to move tables individually or to use
ONLY here. Allowing TABLESPACE to recurse just gives people one
additional choice that they don't have today: to move everything at
once. We don't lose any functionality by enabling that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, 20 Feb 2019 at 18:51, Robert Haas <robertmhaas@gmail.com> wrote:
I don't buy Simon's argument that we should treat TABLESPACE
differently because the tables might be really big and take a long
time to move. I agree that this could well be true, but nobody is
proposing to remove the ability to move tables individually or to use
ONLY here. Allowing TABLESPACE to recurse just gives people one
additional choice that they don't have today: to move everything at
once. We don't lose any functionality by enabling that.
Doing that would add the single largest footgun in Postgres, given that
command's current behavior and the size of partitioned tables.
If it moved partitions concurrently I'd feel differently.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Feb-20, Robert Haas wrote:
On Wed, Feb 20, 2019 at 12:38 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Maybe the ALL IN TABLESPACE and OWNED BY sub-forms should be split to a
separate para. I suggest:: This form changes the table's tablespace to the specified tablespace
: and moves the data file(s) associated with the table to the new
: tablespace. Indexes on the table, if any, are not moved; but they
: can be moved separately with additional SET TABLESPACE commands.
: When applied to a partitioned table, nothing is moved, but any
: partitions created afterwards with CREATE TABLE PARTITION OF
: will use that tablespace.
:
: All
: tables in the current database in a tablespace can be moved by using
: the ALL IN TABLESPACE form, which will lock all tables to be moved
: first and then move each one. This form also supports OWNED BY,
: which will only move tables owned by the roles specified. If the
: NOWAIT option is specified then the command will fail if it is
: unable to acquire all of the locks required immediately. Note that
: system catalogs are not moved by this command, use ALTER DATABASE or
: explicit ALTER TABLE invocations instead if desired. The
: information_schema relations are not considered part of the system
: catalogs and will be moved. See also CREATE TABLESPACE.Seems reasonable.
Pushed this bit, thanks.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Added Peter E to CC; question at the very end.
On 2019-Feb-20, Robert Haas wrote:
Yeah, we could. I wonder, though, whether we should just make
everything recurse. I think that's what people are commonly going to
want, at least for partitioned tables, and it doesn't seem to me that
it would hurt anything to make the inheritance case work that way,
too. Right now it looks like we have this list of exceptions:- actions for identity columns (ADD GENERATED, SET etc., DROP IDENTITY)
- TRIGGER
- CLUSTER
- OWNER
- TABLESPACE never recurse to descendant tables
- Adding a constraint recurses only for CHECK constraints that are not
marked NO INHERIT.I have a feeling we eventually want this list to be empty, right? We
want a partitioned table to work as much like a non-partitioned table
as possible, unless the user asks for some other behavior. Going from
six exceptions to four and maybe having some of them depend on whether
it's partitioning or inheritance doesn't seem like it's as good and
clear as trying to adopt a really uniform policy.I don't buy Simon's argument that we should treat TABLESPACE
differently because the tables might be really big and take a long
time to move. I agree that this could well be true, but nobody is
proposing to remove the ability to move tables individually or to use
ONLY here. Allowing TABLESPACE to recurse just gives people one
additional choice that they don't have today: to move everything at
once. We don't lose any functionality by enabling that.
Tablespaces already behave a little bit especially in another sense:
it doesn't recurse to indexes. I think it's not possible to implement a
three-way flag, where you tell it to move only the table, or to recurse
to child tables but leave local indexes alone, or just to move
everything. If we don't move the child indexes, are we really recursing
in that sense?
I think changing the behavior of tablespaces is likely to cause pain for
people with existing scripts that expect the command not to recurse. We
would have to add a switch of some kind to provide the old behavior in
order not to break those, and that doesn't seem so good either.
I agree we definitely want to treat a partitioned table as similarly as
possible to a non-partitioned table, but in the particular case of
tablespace it seems to me better not to mess with the behavior.
CLUSTER: I'm not sure about this one. For legacy inheritance there's
no principled way to figure out the right index to recurse to. For
partitioning that seems a very simple problem, but I do wonder if
there's any use case at all for ALTER TABLE CLUSTER there. My
inclination would be to reject ALTER TABLE CLUSTER on a partitioned
table altogether, if it isn't already.
OWNER: let's just change this one to recurse always. I think we have a
consensus about this one.
TRIGGER: I think it would be good to recurse, based on the trigger name.
I'm not sure what to do about legacy inheritance; the trigger isn't
likely to be there. An option is to silently ignore each inheritance
child (not partition) if the trigger isn't present on it.
We all seem to agree that REPLICA IDENTITY should recurse.
Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
really sure about this one. Peter?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 28, 2019 at 07:41:11PM -0300, Alvaro Herrera wrote:
We all seem to agree that REPLICA IDENTITY should recurse.
(entering the ring)
FWIW, I agree that having REPLICA IDENTITY recurse on partitions feels
more natural, as much as being able to use ALTER TABLE ONLY to only
update the definition on a given partition member.
--
Michael
On Thu, Feb 28, 2019 at 6:13 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Tablespaces already behave a little bit especially in another sense:
it doesn't recurse to indexes. I think it's not possible to implement a
three-way flag, where you tell it to move only the table, or to recurse
to child tables but leave local indexes alone, or just to move
everything. If we don't move the child indexes, are we really recursing
in that sense?
I don't quite follow this. If you want to change the default for new
partitions, you would use ONLY, which updates the value for the parent
(which is cheap) but doesn't touch the children. If you want to move
it all, you would omit ONLY. I'm not sureI quite understand the third
case - what do you mean by moving the child tables but leaving the
local indexes alone?
I think changing the behavior of tablespaces is likely to cause pain for
people with existing scripts that expect the command not to recurse. We
would have to add a switch of some kind to provide the old behavior in
order not to break those, and that doesn't seem so good either.I agree we definitely want to treat a partitioned table as similarly as
possible to a non-partitioned table, but in the particular case of
tablespace it seems to me better not to mess with the behavior.
You may be right, but I'm not 100% convinced. I don't understand why
changing the behavior for tablespaces would be a grant shocker and
changing the behavior for OWNER TO would be a big nothing. If you
mess up the first one, you'll realize it's running for too long and go
"oh, oops" and fix it next time. If you mess up the second one,
you'll have a security hole, be exploited by hackers, suffer civil and
criminal investigations due to a massive data security breach, and end
your life in penury and in prison, friendless and alone. Or maybe
not, but the point is that BOTH of these things have an established
behavior such that changing it may surprise some people, and actually
I would argue that the surprise for the TABLESPACE will tend to be
more obvious and less likely to have subtle, unintended consequences.
I hate to open a can of worms here, but there's also the relationship
between OWNER TO and GRANT/REVOKE; that might also need a little
thought.
CLUSTER: I'm not sure about this one. For legacy inheritance there's
no principled way to figure out the right index to recurse to. For
partitioning that seems a very simple problem, but I do wonder if
there's any use case at all for ALTER TABLE CLUSTER there. My
inclination would be to reject ALTER TABLE CLUSTER on a partitioned
table altogether, if it isn't already.
Hmm, I disagree. I think this should work and set the value for the
whole hierarchy. That seems well-defined and useful -- why is it any
less useful than for a standalone table?
OWNER: let's just change this one to recurse always. I think we have a
consensus about this one.
We really don't have many votes either way, AFAICS. I'm not direly
opposed, but I'd like to see a more vigorous attempt to make them ALL
recurse rather than changing one per release for the next 8 years.
TRIGGER: I think it would be good to recurse, based on the trigger name.
I'm not sure what to do about legacy inheritance; the trigger isn't
likely to be there. An option is to silently ignore each inheritance
child (not partition) if the trigger isn't present on it.
Hmm, I think this one should maybe recurse to triggers that cascaded
downwards, which would of its nature apply to partitioning but not to
inheritance.
We all seem to agree that REPLICA IDENTITY should recurse.
My feeling on this is the same as for OWNER.
Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
really sure about this one. Peter?
I don't know enough about this to have an opinion.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-02-28 23:41, Alvaro Herrera wrote:
Maybe ADD GENERATED AS IDENTITY / DROP IDENTITY should recurse; not
really sure about this one. Peter?
No, the intention is that only the partition root has the identity
property. If you wanted to make it recurse, then you'd need to arrange
it so that all partitions refer to the same sequence, but that's
currently not possible.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Unless there are any objections to fixing the REPLICA IDENTITY bug, I
intend to push that tomorrow.
People can continue to discuss changing the behavior of other
subcommands where reasonable (OWNER TO) or even for the cases others
consider not reasonable (TABLESPACE), but there is no consensus of doing
that, and no patches either. I suppose those changes (in the name of
uncompromising consistency) will have to wait till pg13.
Thanks
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 20, 2019 at 4:42 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Unless there are any objections to fixing the REPLICA IDENTITY bug, I
intend to push that tomorrow.
I still think that this is an ill-considered, piecemeal approach to a
problem that deserves a better solution.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-Mar-21, Robert Haas wrote:
On Wed, Mar 20, 2019 at 4:42 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Unless there are any objections to fixing the REPLICA IDENTITY bug, I
intend to push that tomorrow.I still think that this is an ill-considered, piecemeal approach to a
problem that deserves a better solution.
I already argued that TABLESPACE and OWNER TO are documented to work
that way, and have been for a long time, whereas REPLICA IDENTITY has
never been. If you want to change long-standing behavior, be my guest,
but that's not my patch. On the other hand, there's no consensus that
those should be changed, whereas there no opposition specifically
against changing this one, and in fact it was reported as a bug to me by
actual users.
FWIW I refrained from pushing today because after posting I realized
that I've failed to investigate Dolgov's reported problem.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Mar 21, 2019 at 5:01 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I already argued that TABLESPACE and OWNER TO are documented to work
that way, and have been for a long time, whereas REPLICA IDENTITY has
never been. If you want to change long-standing behavior, be my guest,
but that's not my patch. On the other hand, there's no consensus that
those should be changed, whereas there no opposition specifically
against changing this one, and in fact it was reported as a bug to me by
actual users.
Well, you have a commit bit, and I cannot prevent you from using it,
and nobody else is backing me up here, but it doesn't change my
opinion.
I think it is HIGHLY irresponsible for you to try to characterize
clear behavior changes in this area as "bug fixes." The fact that a
user say that something is a bug does not make it a bug -- and you
have been a committer long enough to know the difference between
repairing a defect and inventing entirely new behavior. Yet you keep
doing the latter and calling the former.
Commit 33e6c34c32677a168bee4bc6c335aa8d73211a56 is a clear behavior
change for partitioned indexes and yet has the innocuous subject line
"Fix tablespace handling for partitioned indexes." Unbelievably, it
was back-patched into 11.1. Everyone except you agreed that it
created an inconsistency between tables and indexes, so commit
ca4103025dfe26eaaf6a500dec9170fbb176eebc repaired that by doing the
same thing for tables. That one wasn't back-patched, but it was still
described as "Fix tablespace handling for partitioned tables", even
though I said repeatedly that it wasn't a fix, because the actual
behavior was the design behavior, and as the major reviewer and
committer of those patches I ought to know. That latter patch also
broke stuff which it looks like you haven't fixed yet:
/messages/by-id/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com
That email thread even includes clear definitional concerns about
whether this behavior is even properly designed:
/messages/by-id/20190306161744.22jdkg37fyi2zyke@alap3.anarazel.de
But I assume that's not going to stop you from propagating the same
kinds of problems into more places.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-Mar-22, Robert Haas wrote:
On Thu, Mar 21, 2019 at 5:01 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I already argued that TABLESPACE and OWNER TO are documented to work
that way, and have been for a long time, whereas REPLICA IDENTITY has
never been. If you want to change long-standing behavior, be my guest,
but that's not my patch. On the other hand, there's no consensus that
those should be changed, whereas there no opposition specifically
against changing this one, and in fact it was reported as a bug to me by
actual users.Well, you have a commit bit, and I cannot prevent you from using it,
and nobody else is backing me up here, but it doesn't change my
opinion.
Can I ask for more opinions here? Should I apply this behavior change
to pg12 or not? If there's no consensus that it should be changed, I
wouldn't change it.
To recap: my proposed change is to make
ALTER TABLE ... REPLICA IDENTITY
when applied on a partitioned table affect all of its partitions instead
of expecting the user to invoke the command for each partition. At the
same time, I am proposing not to change to have recursive behavior other
forms of ALTER TABLE in one commit, such as TABLESPACE and OWNER TO,
which currently do not have recursive behavior.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-03-22 17:52, Alvaro Herrera wrote:
To recap: my proposed change is to make
ALTER TABLE ... REPLICA IDENTITY
when applied on a partitioned table affect all of its partitions instead
of expecting the user to invoke the command for each partition.
If you are operating on a partitioned table and set the replica identity
to the primary key or a partitioned index of that partitioned table,
then I think, by definition of what it means to be a partitioned index,
that applies to the whole partition hierarchy.
Aside from that theoretical consideration, what would be the practical
use of not doing that?
At the
same time, I am proposing not to change to have recursive behavior other
forms of ALTER TABLE in one commit, such as TABLESPACE and OWNER TO,
which currently do not have recursive behavior.
I'm slightly baffled that we would even allow having different owners on
different partitions, but that seems to be a separate discussion.
In general, it seems sensible that if you operate on a partitioned
table, the whole partition hierarchy is affected unless told otherwise.
There may be sensible exceptions, but it seems useful as the default.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 22, 2019 at 07:55:11PM +0100, Peter Eisentraut wrote:
If you are operating on a partitioned table and set the replica identity
to the primary key or a partitioned index of that partitioned table,
then I think, by definition of what it means to be a partitioned index,
that applies to the whole partition hierarchy.Aside from that theoretical consideration, what would be the practical
use of not doing that?
FWIW I agree about the part of inheriting the replica identity of the
parent when defining a partition on it. That's also.. Instinctive.
I'm slightly baffled that we would even allow having different owners on
different partitions, but that seems to be a separate discussion.
Different owners can make sense for multiple layers of partitions
where the children have less restrictions than the children. Imagine
for example a table listing the population of a country, with children
partitioned by regions, and grand-children partitioned by cities. The
top-most parent could be owned by a minister, and lower levels apply
to the region administrator, down to the city administrators.
--
Michael
On Sat, 23 Mar 2019 at 01:34, Michael Paquier <michael@paquier.xyz> wrote:
I'm slightly baffled that we would even allow having different owners on
different partitions, but that seems to be a separate discussion.
Different owners can make sense for multiple layers of partitions
where the children have less restrictions than the children. Imagine
for example a table listing the population of a country, with children
partitioned by regions, and grand-children partitioned by cities. The
top-most parent could be owned by a minister, and lower levels apply
to the region administrator, down to the city administrators.
That use case is possible using different privileges.
Having different owners makes it *very* difficult to administer.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I've not had time to work on this issue, and there are a few remaining
problems in the patch; mostly that I will need to change the way pg_dump
represents the replica identity so that it is only defined when the
index is dumped in all partitions rather than immediately after creating
the index. Also, the bug reported by Dmitry Dolgov.
So I'm marking this as Returned with Feedback, with the intention to
resubmit for November.
Thanks,
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services